SQL Window Functions

🌐 What is a Window Function?

A Window Function in SQL performs a calculation across a set of rows that are somehow related to the current row.

Unlike aggregate functions (SUM, AVG, COUNT, etc.), window functions do not collapse rows into one row.
Instead, they return a value for each row, while still being able to look at other rows in the dataset.

📌 Why Use Window Functions?

To perform calculations without losing row-level details.
To calculate running totals, rankings, moving averages, percentiles etc.
To compare a row against a group of rows (partition).
More powerful than GROUP BY because you can keep all rows.

📌 Where to Use?

Business analytics reports.
Leaderboards (rankings).
Financial data analysis (running totals, moving averages).
Finding the first/last occurrence within groups.
Percentile calculations (e.g., top 10%).

📌 Syntax of a Window Function

function_name(expression) 
OVER (
    PARTITION BY column_name   -- optional (divide rows into groups)
    ORDER BY column_name       -- optional (define order inside group)
    ROWS or RANGE clause       -- optional (define frame for calculation)
)

📌 Types of Window Functions

1. Aggregate Window Functions

They extend normal aggregate functions: SUM(), AVG(), MIN(), MAX(), COUNT()

👉 Example: Running Total of Salary

SELECT 
    employee_id,
    department,
    salary,
    SUM(salary) OVER (PARTITION BY department ORDER BY employee_id) AS running_total
FROM employees;

💡 Here, salaries are summed department-wise, but each row is preserved.

2. Ranking Functions

ROW_NUMBER() → gives unique sequence number.
RANK() → gives rank with gaps if duplicates.
DENSE_RANK() → gives rank without gaps.
NTILE(n) → divides rows into n buckets.

👉 Example: Top 3 salaries in each department

SELECT 
    employee_id,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_in_dept
FROM employees
WHERE RANK() OVER (PARTITION BY department ORDER BY salary DESC) <= 3;

3. Value Functions

FIRST_VALUE(expr) → first row in window.
LAST_VALUE(expr) → last row in window.
LAG(expr, offset) → previous row value.
LEAD(expr, offset) → next row value.

👉 Example: Compare employee salary with previous employee

SELECT 
    employee_id,
    salary,
    LAG(salary, 1) OVER (ORDER BY employee_id) AS prev_salary,
    LEAD(salary, 1) OVER (ORDER BY employee_id) AS next_salary
FROM employees;

4. Analytic Functions

PERCENT_RANK() → relative rank between 0 and 1.
CUME_DIST() → cumulative distribution.
NTILE(n) → divide rows into n parts.

👉 Example: Cumulative distribution of salaries

SELECT 
    employee_id,
    salary,
    CUME_DIST() OVER (ORDER BY salary) AS salary_distribution
FROM employees;

📌 Difference Between GROUP BY and Window Functions

Feature	GROUP BY	Window Function
Rows returned	Collapses rows	Keeps all rows
Usage	Aggregation only	Aggregation + Ranking + Comparisons
Example	Total salary per department	Running total, rank per department

📌 Real-Life Example

Suppose we have a table sales:

id	emp_name	region	sales_amount
1	A	East	5000
2	B	East	3000
3	C	West	7000
4	D	West	4000
5	E	East	6000

Query: Show each employee's sales, department total, and running total

SELECT
    emp_name,
    region,
    sales_amount,
    SUM(sales_amount) OVER (PARTITION BY region) AS total_sales_region,
    SUM(sales_amount) OVER (PARTITION BY region ORDER BY sales_amount) AS running_sales
FROM sales;

👉 Output:

emp_name	region	sales_amount	total_sales_region	running_sales
B	East	3000	14000	3000
A	East	5000	14000	8000
E	East	6000	14000	14000
D	West	4000	11000	4000
C	West	7000	11000	11000