Skip to content

Window Functions

Compute rankings, running totals, and row-by-row comparisons without collapsing rows.

GROUP BY collapses rows into summaries. But what if you want to rank each student within their course, or compare each row to the previous one, or calculate a running total, all while keeping every row visible? Window functions do exactly that. They compute across a set of rows related to the current row, without grouping them away.

Definition

A window function computes over a set of rows related to the current row without collapsing them. Unlike GROUP BY, window functions return one row per input row. Common window functions include ROW_NUMBER(), RANK(), and LAG().

Required step: If you do not have these tables yet, run this setup in the playground before continuing. It creates the tables used by every example in this stage.

CREATE DATABASE IF NOT EXISTS school;
USE school;

CREATE TABLE IF NOT EXISTS students (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
);

CREATE TABLE IF NOT EXISTS enrollments (
  id INT AUTO_INCREMENT PRIMARY KEY,
  student_id INT NOT NULL,
  course_name VARCHAR(100) NOT NULL,
  score DECIMAL(5,2),
  FOREIGN KEY (student_id) REFERENCES students(id)
);

INSERT INTO students (name) VALUES ('Alice'), ('Bob'), ('Charlie');

INSERT INTO enrollments (student_id, course_name, score) VALUES
  (1, 'Math', 90), (1, 'Science', 85), (2, 'Math', 78);

ROW_NUMBER, RANK, DENSE_RANK

These functions assign a number to each row based on its position or ordering. They differ in how they handle ties (rows with the same value).

ROW_NUMBER: unique sequential number

SELECT name, course_name, score,
  ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num
FROM students
JOIN enrollments ON students.id = enrollments.student_id;

Assigns a unique number to each row, ordered by score descending. If two students have the same score, they still get different numbers (1, 2, 3, ...).

RANK: same rank for ties, gaps after

SELECT name, course_name, score,
  RANK() OVER (ORDER BY score DESC) AS rank
FROM students
JOIN enrollments ON students.id = enrollments.student_id;

If two students tie for score, they both get rank 1. The next student gets rank 3 (not 2). Gaps appear after ties.

DENSE_RANK: same rank for ties, no gaps

SELECT name, course_name, score,
  DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rank
FROM students
JOIN enrollments ON students.id = enrollments.student_id;

Like RANK, but no gaps. Two students tie at rank 1, next student gets rank 2. Better when you want consecutive ranks.

When to use each:ROW_NUMBER vs RANK vs DENSE_RANK

ROW_NUMBER: When you need a unique identifier for each row, even if values tie. Good for pagination or de-duplication.

RANK: When ties should share the same rank and you want gaps (1, 1, 3).

DENSE_RANK: When ties should share the same rank with no gaps (1, 1, 2). Good for leaderboards.

PARTITION BY

PARTITION BY divides the rows into groups (partitions) so the window function restarts at each group. This is like GROUP BY but keeps all rows visible.

Rank within each course

SELECT name, course_name, score,
  RANK() OVER (
    PARTITION BY course_name
    ORDER BY score DESC
  ) AS course_rank
FROM students
JOIN enrollments ON students.id = enrollments.student_id;

Ranks students within each course separately. Math rankings restart from 1 for Science rankings. PARTITION BY defines the groups.

LAG and LEAD

LAG accesses a value from a previous row. LEAD accesses a value from a next row. They are essential for comparing each row to its neighbors.

Compare to previous row

SELECT name, course_name, score,
  LAG(score, 1) OVER (ORDER BY score DESC) AS prev_score,
  score - LAG(score, 1) OVER (ORDER BY score DESC) AS diff
FROM students
JOIN enrollments ON students.id = enrollments.student_id;

LAG(score, 1) gets the score from the previous row (1 row back). The diff column shows the gap between consecutive scores.

Compare to next row

SELECT name, course_name, score,
  LEAD(score, 1) OVER (ORDER BY score DESC) AS next_score
FROM students
JOIN enrollments ON students.id = enrollments.student_id;

LEAD(score, 1) gets the score from the next row (1 row ahead). Useful for finding gaps or comparing to the next value.

Aggregate Window Functions

You can use aggregate functions (SUM, AVG, COUNT) as window functions with OVER(). They compute the aggregate across the window without collapsing rows.

Running total

SELECT name, course_name, score,
  SUM(score) OVER (ORDER BY score DESC
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS running_total
FROM students
JOIN enrollments ON students.id = enrollments.student_id;

SUM with OVER computes a running total. ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW means 'from the first row up to this row'.

Average per course alongside each row

SELECT name, course_name, score,
  AVG(score) OVER (PARTITION BY course_name) AS course_avg
FROM students
JOIN enrollments ON students.id = enrollments.student_id;

Shows each student's score alongside their course average. The average repeats for every row in the course. No GROUP BY needed.

What does ROW_NUMBER() assign to tied rows?

What is the difference between RANK and DENSE_RANK?

What does PARTITION BY do in a window function?

Key Takeaways

  • Window functions compute across related rows without collapsing them.
  • ROW_NUMBER gives unique numbers. RANK has gaps after ties. DENSE_RANK has no gaps.
  • PARTITION BY restarts the calculation for each group (like GROUP BY but keeps all rows).
  • LAG/LEAD access previous/next row values for row-by-row comparisons.
  • Aggregate window functions (SUM OVER, AVG OVER) compute summaries alongside each row.

Ready to test your knowledge?

Take a Quiz