📚 Database Indexing - Complete Interview Notes

1. What is Database Indexing?

Definition: Indexing is used to quickly retrieve particular data from the database. It is a technique that uses data structures to optimize the searching time of a database query in DBMS. Indexing reduces the number of disks required to access particular data by internally creating an index table.

Index Structure

Index Table Structure

Index Structure

Search Key: Contains copy of Primary Key or Candidate Key (stored in sorted manner)

Data Reference: Set of pointers holding address of disk block containing actual data

3. Types of Indexes

Types of Indexes

Single Level Indexing

Single Level Indexing is like index found in a book. Contains topic names along with page numbers. Similarly, database index contains keys and their corresponding block addresses.

A) Primary Indexing (primary key and Ordered data)

Definition: Index table created using Primary keys. Defined on ordered data.

Example: If student table is sorted by student_id (primary key), then primary index will contain student_id and corresponding block addresses.
Primary Index Structure
Characteristics of Primary Indexing:

B) Secondary Indexing

Definition: Two-level indexing technique used to reduce mapping size of primary index. Points to location where data is found but actual data is not sorted. Also known as non-clustered indexing.

Secondary Index Structure

Secondary Index Structure
Characteristics of Secondary Indexing:

C) Cluster Indexing (ordered but non key)

Definition: Used when multiple related records are found at one place. Defined on ordered data. Index created using non-key values which may or may not be unique. Groups columns having similar characteristics.

Cluster Index Structure

Cluster Index Structure
Characteristics of Clustered Indexing:

Ordered Indexing

Ordered indexing is traditional way of storing that gives fast retrieval. Indices are stored in sorted manner, hence also known as ordered indices.

A) Dense Indexing

Definition: Index table contains records for every search key value of the database. Makes searching faster but requires more space.

Dense Index Example

Dense Index Structure

Every search key has corresponding index entry

B) Sparse Indexing

Definition: Consumes lesser space than dense indexing but is slower. Does not include search key for every record. Store search key that points to a block containing group of data.

Sparse Index Example

Sparse Index Structure

Selected search keys point to blocks containing multiple records

Key Difference: Dense indexing has entry for every record, Sparse indexing has entry for selected records only.

4. Multi-Level Indexing

When a database is very large, the index table itself becomes large. Storing such a big index in memory takes a lot of space.
Multi-Level Indexing solves this problem by creating a hierarchy of indexes:

Multi-Level Index Structure

Multi-Level Index Structure
B+ Tree for Multi-Level Indexing:

5. B-Tree vs B+ Tree (Simple and Clear Explanation)

b and b+ tree

What is a B-Tree?

B-Tree is a tree-like structure used to store and search data efficiently.

How it works: Limitation: Range queries and sequential access (finding next item) is slow because leaf nodes are not connected.

What is a B+ Tree?

B+ Tree is an improved version of B-Tree, used in most databases.

How it works: Advantages:

B-Tree vs B+ Tree Comparison

Feature B-Tree B+ Tree
Data Storage All nodes (internal + leaf) store data Only leaf nodes store data
Leaf Node Connection No connection Connected by linked list
Range Queries Slower and complex Faster and simple
Sequential Access Difficult Easy (just follow the linked list)
Memory Usage Less More (some keys are repeated)
Used In File systems Databases
Why B+ Tree is preferred in Databases:

6. Advantages of Indexing

7. Summary Points for Interview

Key Takeaways:
Remember: Most commonly used indexing attributes are B-trees, Bitmap, Ascending/Descending, Column/Functional, Single/Concatenated, and Partitioned/Non-partitioned.