🌐

SQL Window Functions

Complete Study Guide & Reference

🌐 What is a Window Function?

A Window Function in SQL performs a calculation across a set of rows that are somehow related to the current row.

  • Unlike aggregate functions (SUM, AVG, COUNT, etc.), window functions do not collapse rows into one row.
  • Instead, they return a value for each row, while still being able to look at other rows in the dataset.

📌 Why Use Window Functions?

  • To perform calculations without losing row-level details.
  • To calculate running totals, rankings, moving averages, percentiles etc.
  • To compare a row against a group of rows (partition).
  • More powerful than GROUP BY because you can keep all rows.

📌 Where to Use?

  • Business analytics reports.
  • Leaderboards (rankings).
  • Financial data analysis (running totals, moving averages).
  • Finding the first/last occurrence within groups.
  • Percentile calculations (e.g., top 10%).

📌 Syntax of a Window Function

function_name(expression) 
OVER (
    PARTITION BY column_name   -- optional (divide rows into groups)
    ORDER BY column_name       -- optional (define order inside group)
    ROWS or RANGE clause       -- optional (define frame for calculation)
)
                    

📌 Types of Window Functions

1. Aggregate Window Functions

They extend normal aggregate functions: SUM(), AVG(), MIN(), MAX(), COUNT()

👉 Example: Running Total of Salary

SELECT 
    employee_id,
    department,
    salary,
    SUM(salary) OVER (PARTITION BY department ORDER BY employee_id) AS running_total
FROM employees;
                        

💡 Here, salaries are summed department-wise, but each row is preserved.

2. Ranking Functions

  • ROW_NUMBER() → gives unique sequence number.
  • RANK() → gives rank with gaps if duplicates.
  • DENSE_RANK() → gives rank without gaps.
  • NTILE(n) → divides rows into n buckets.

👉 Example: Top 3 salaries in each department

SELECT 
    employee_id,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_in_dept
FROM employees
WHERE RANK() OVER (PARTITION BY department ORDER BY salary DESC) <= 3;
                        

3. Value Functions

  • FIRST_VALUE(expr) → first row in window.
  • LAST_VALUE(expr) → last row in window.
  • LAG(expr, offset) → previous row value.
  • LEAD(expr, offset) → next row value.

👉 Example: Compare employee salary with previous employee

SELECT 
    employee_id,
    salary,
    LAG(salary, 1) OVER (ORDER BY employee_id) AS prev_salary,
    LEAD(salary, 1) OVER (ORDER BY employee_id) AS next_salary
FROM employees;
                        

4. Analytic Functions

  • PERCENT_RANK() → relative rank between 0 and 1.
  • CUME_DIST() → cumulative distribution.
  • NTILE(n) → divide rows into n parts.

👉 Example: Cumulative distribution of salaries

SELECT 
    employee_id,
    salary,
    CUME_DIST() OVER (ORDER BY salary) AS salary_distribution
FROM employees;
                        

📌 Difference Between GROUP BY and Window Functions

Feature GROUP BY Window Function
Rows returned Collapses rows Keeps all rows
Usage Aggregation only Aggregation + Ranking + Comparisons
Example Total salary per department Running total, rank per department

📌 Real-Life Example

Suppose we have a table sales:

id emp_name region sales_amount
1AEast5000
2BEast3000
3CWest7000
4DWest4000
5EEast6000

Query: Show each employee's sales, department total, and running total

SELECT
    emp_name,
    region,
    sales_amount,
    SUM(sales_amount) OVER (PARTITION BY region) AS total_sales_region,
    SUM(sales_amount) OVER (PARTITION BY region ORDER BY sales_amount) AS running_sales
FROM sales;
                    

👉 Output:

emp_name region sales_amount total_sales_region running_sales
BEast3000140003000
AEast5000140008000
EEast60001400014000
DWest4000110004000
CWest70001100011000

✅ Summary

  • Window functions = aggregate + row-level detail.
  • Use for running totals, ranks, comparisons, distributions.
  • Syntax: function() OVER (PARTITION BY ... ORDER BY ... ROWS ...).