🌐 What is a Window Function?
A Window Function in SQL performs a calculation across a set of rows that are somehow related to the current row.
- Unlike aggregate functions (SUM, AVG, COUNT, etc.), window functions do not collapse rows into one row.
- Instead, they return a value for each row, while still being able to look at other rows in the dataset.
📌 Why Use Window Functions?
- To perform calculations without losing row-level details.
- To calculate running totals, rankings, moving averages, percentiles etc.
- To compare a row against a group of rows (partition).
- More powerful than GROUP BY because you can keep all rows.
📌 Where to Use?
- Business analytics reports.
- Leaderboards (rankings).
- Financial data analysis (running totals, moving averages).
- Finding the first/last occurrence within groups.
- Percentile calculations (e.g., top 10%).
📌 Syntax of a Window Function
function_name(expression) OVER ( PARTITION BY column_name -- optional (divide rows into groups) ORDER BY column_name -- optional (define order inside group) ROWS or RANGE clause -- optional (define frame for calculation) )
📌 Types of Window Functions
1. Aggregate Window Functions
They extend normal aggregate functions: SUM()
, AVG()
, MIN()
, MAX()
, COUNT()
👉 Example: Running Total of Salary
SELECT employee_id, department, salary, SUM(salary) OVER (PARTITION BY department ORDER BY employee_id) AS running_total FROM employees;
💡 Here, salaries are summed department-wise, but each row is preserved.
2. Ranking Functions
ROW_NUMBER()
→ gives unique sequence number.RANK()
→ gives rank with gaps if duplicates.DENSE_RANK()
→ gives rank without gaps.NTILE(n)
→ divides rows inton
buckets.
👉 Example: Top 3 salaries in each department
SELECT employee_id, department, salary, RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_in_dept FROM employees WHERE RANK() OVER (PARTITION BY department ORDER BY salary DESC) <= 3;
3. Value Functions
FIRST_VALUE(expr)
→ first row in window.LAST_VALUE(expr)
→ last row in window.LAG(expr, offset)
→ previous row value.LEAD(expr, offset)
→ next row value.
👉 Example: Compare employee salary with previous employee
SELECT employee_id, salary, LAG(salary, 1) OVER (ORDER BY employee_id) AS prev_salary, LEAD(salary, 1) OVER (ORDER BY employee_id) AS next_salary FROM employees;
4. Analytic Functions
PERCENT_RANK()
→ relative rank between 0 and 1.CUME_DIST()
→ cumulative distribution.NTILE(n)
→ divide rows into n parts.
👉 Example: Cumulative distribution of salaries
SELECT employee_id, salary, CUME_DIST() OVER (ORDER BY salary) AS salary_distribution FROM employees;
📌 Difference Between GROUP BY and Window Functions
Feature | GROUP BY | Window Function |
---|---|---|
Rows returned | Collapses rows | Keeps all rows |
Usage | Aggregation only | Aggregation + Ranking + Comparisons |
Example | Total salary per department | Running total, rank per department |
📌 Real-Life Example
Suppose we have a table sales
:
id | emp_name | region | sales_amount |
---|---|---|---|
1 | A | East | 5000 |
2 | B | East | 3000 |
3 | C | West | 7000 |
4 | D | West | 4000 |
5 | E | East | 6000 |
Query: Show each employee's sales, department total, and running total
SELECT emp_name, region, sales_amount, SUM(sales_amount) OVER (PARTITION BY region) AS total_sales_region, SUM(sales_amount) OVER (PARTITION BY region ORDER BY sales_amount) AS running_sales FROM sales;
👉 Output:
emp_name | region | sales_amount | total_sales_region | running_sales |
---|---|---|---|---|
B | East | 3000 | 14000 | 3000 |
A | East | 5000 | 14000 | 8000 |
E | East | 6000 | 14000 | 14000 |
D | West | 4000 | 11000 | 4000 |
C | West | 7000 | 11000 | 11000 |
✅ Summary
- Window functions = aggregate + row-level detail.
- Use for running totals, ranks, comparisons, distributions.
- Syntax:
function() OVER (PARTITION BY ... ORDER BY ... ROWS ...)
.