๐Ÿ—„๏ธ DBMS Concurrency Control & Serialization

๐Ÿ”„1. Serialization in DBMS

Serialization ensures that the outcome of executing concurrent transactions is the same as if the transactions were executed one after another (serially).

๐ŸŽฏ Goal: Avoid problems like data inconsistency, dirty reads, lost updates, phantom reads, and unrepeatable reads.

๐Ÿง  Example: Banking Transaction

Consider two transactions on Account A (initially โ‚น1000):

  • T1: Deposit โ‚น500 into Account A
  • T2: Multiply Account A by 2

Serial Execution (Correct):

Serial Execution Example
T1 โ†’ T2 execution:
Step 1: T1 reads A = โ‚น1000
Step 2: T1 writes A = โ‚น1000 + โ‚น500 = โ‚น1500
Step 3: T2 reads A = โ‚น1500
Step 4: T2 writes A = โ‚น1500 ร— 2 = โ‚น3000
Final Result: โ‚น3000 โœ…

Non-Serial Execution (Incorrect):

Concurrent Execution Example
Concurrent execution problem:
Step 1: T1 reads A = โ‚น1000
Step 2: T2 reads A = โ‚น1000 (before T1 commits)
Step 3: T1 writes A = โ‚น1500
Step 4: T2 writes A = โ‚น2000 (overwrites T1's work)
Final Result: โ‚น2000 โŒ (Lost T1's deposit!)
๐Ÿ” Key Insight: We want concurrent schedules to behave like serial schedules - this is Serialization.

โš”๏ธ 2. Conflict Equivalence & Conflict Serializability (Hinglish Explanation)

โœ… Conflict Equivalent Schedules

Do schedules tab conflict equivalent hote hain jab:

  • Unme same set of transactions hote hain
  • Jo operations conflict karte hain unka order dono mein same hota hai

๐Ÿงฎ Conflicting Operations kya hoti hain?

Do operations tab conflict karti hain jab:

  • Alag-alag transactions se ho
  • Same data item ko access kar rahi ho
  • Aur kam se kam ek write operation ho
Operation 1 Operation 2 Conflict? Reason
Read(A) Read(A) No Sirf reading, koi conflict nahi
Read(A) Write(A) Yes Read-Write conflict
Write(A) Read(A) Yes Write-Read conflict
Write(A) Write(A) Yes Write-Write conflict

๐Ÿ”„ Conflict Serializability

Agar ek schedule ko sirf non-conflicting operations ko swap karke ek serial schedule mein badla ja sake, toh woh conflict serializable kehlata hai.

๐Ÿงช Example: Schedule S

Time:      1     2     3     4     5     6     7     8
T1:       R(A)  W(A)         R(B)  W(B)
T2:             R(A)  W(A)         R(B)  W(B)
        

Conflicting operations:

  • T1.W(A) โ†’ T2.R(A): Write-Read conflict
  • T1.W(A) โ†’ T2.W(A): Write-Write conflict
  • T1.R(B) โ†’ T2.W(B): Read-Write conflict
  • T1.W(B) โ†’ T2.R(B): Write-Read conflict
  • T1.W(B) โ†’ T2.W(B): Write-Write conflict

๐Ÿ“Š Precedence Graph (T1 โ†’ T2)

Nodes: T1, T2
Edges: T1 โ†’ T2 (multiple conflicts)

โœ… No cycle โ†’ Conflict Serializable
โœ… Equivalent to serial order: T1 โ†’ T2
        

๐Ÿง  Yaad Rakho (Tips)

  • Conflict = alag transaction + same data + ek write
  • Precedence graph banao: nodes = transactions
  • Cycle nahi = Serializable โœ…
  • Cycle hai = Not Serializable โŒ

๐Ÿ‘๏ธ View Serializability (Hinglish)

๐Ÿ“Œ Definition:

Agar ek schedule ka result same ho jaise kisi serial schedule ka hota โ€” including what values transactions read/write โ€” toh woh view serializable hota hai.

๐Ÿ” 3 Main Conditions Check Karte Hain:

  • Initial Read Same: Start mein sab same data padhein
  • Read From Same Transaction: Jo write kisi transaction ne kiya, wohi read ho
  • Final Write Same: End mein same transaction ne data write kiya ho

๐Ÿ“š Difference between Conflict & View Serializability:

Feature Conflict Serializability View Serializability
Check karna Easy (graph se) Difficult (manually)
Strictness More strict Less strict
Allowable Schedules Kam Zyada
Example usage Mostly practical systems More theoretical

๐Ÿ‘“ Simple Tip:

  • Conflict Serializable: Graph banao, cycle nahi to โœ…
  • View Serializable: Agar reads & writes ka result same ho
  • Har Conflict Serializable bhi View Serializable hota hai โ€” but reverse zaroori nahi

๐Ÿ”3. Locking Protocols

Locking Protocols concurrency control ke liye use kiye jaate hain, jisse transactions safe aur isolated rehte hain.

๐Ÿ”„ Shared (S) and Exclusive (X) Locks

Database me data access karne ke liye do tarah ke locks hote hain:

  • Shared Lock (S): Sirf read ke liye. Multiple transactions ek saath Shared Lock le sakte hain.
  • Exclusive Lock (X): Read + Write dono ke liye. Jab tak Exclusive Lock hai, koi aur transaction access nahi kar sakta.
Lock Type Purpose Can Read? Can Write? Others Can Read? Others Can Write?
Shared (S) Reading โœ… โŒ โœ… โŒ
Exclusive (X) Reading + Writing โœ… โœ… โŒ โŒ

๐Ÿ“Œ Lock Compatibility Matrix

Yeh matrix batata hai ki agar ek lock laga hua hai, to doosra lock lag sakta hai ya nahi:

Lock Compatibility Matrix
  • Agar Shared lock laga ho, toh ek aur Shared lock lag sakta hai (โœ…).
  • Agar Exclusive lock laga ho, toh koi bhi naya lock nahi lag sakta (โŒ).
  • Exclusive access tabhi milega jab koi aur transaction Q par access na kar raha ho.

๐Ÿ”„ Locking Protocol Example

Neeche diye gaye diagram me do transactions hain:

Transaction Lock Example
  • T32 do baar Shared Lock (S) le raha hai aur sirf read kar raha hai.
  • T33 Exclusive Lock (X) le raha hai, jisse read + write dono allowed hai.
  • Agar T32 ne pehle Shared Lock le liya ho, toh T33 ko Exclusive Lock tab tak nahi milega jab tak T32 unlock nahi karta.

โš ๏ธ Problems with Basic Locking

  • Deadlock: Dono transactions ek dusre ka wait karte rahte hain.
  • Starvation: Ek transaction ko baar-baar ignore kiya jaata hai.
  • Cascading Rollback: Ek transaction fail ho gaya, jiska data kisi ne read kiya ho, toh sab rollback karne padte hain.
  • Inconsistent Analysis: Jab read ho raha ho aur data simultaneously update bhi ho raha ho.

๐Ÿ”’4. Two-Phase Locking (2PL)

Two-Phase Locking ek aisa protocol hai jo conflict serializability ensure karta hai by splitting transaction locking activity into two phases:

๐Ÿ”‘ 2PL Rule: Ek baar agar transaction ne koi lock release kar diya, toh uske baad woh naya lock acquire nahi kar sakta.

๐Ÿ“ˆ Growing Phase

Transaction naya lock le sakta hai (Shared ya Exclusive), lekin release nahi kar sakta.

๐Ÿ“‰ Shrinking Phase

Transaction lock release kar sakta hai, lekin naya lock acquire nahi kar sakta.

๐Ÿง  Example: 2PL in Action

Transaction T1 following 2PL:
๐Ÿ“ˆ Growing Phase:
  - Lock-S(A)
  - Read(A)
  - Lock-X(B)
  - Write(B)

๐Ÿ“‰ Shrinking Phase:
  - Unlock(A)
  - Unlock(B)
  - โŒ Cannot acquire any new lock now!
    

โš ๏ธ Real Problems in Locking

๐Ÿ” Deadlock Example:

T1: Lock-X(A)
T2: Lock-X(B)
T1: Wait for B  ๐Ÿ”’
T2: Wait for A  ๐Ÿ”’
// Deadlock! T1 and T2 are waiting for each other forever
  

Explanation: T1 ne A lock kiya, T2 ne B lock kiya. Ab dono ek dusre ka wait kar rahe hain, jo kabhi release nahi hoga.

๐Ÿšซ Starvation Example:

T1: Lock-X(Q)
T1: Long transaction... (hold X lock for long time)
Meanwhile:
T2: Wants Lock-S(Q)
T3: Wants Lock-S(Q)
T4: Wants Lock-S(Q)
// All waiting again and again due to T1
  

Explanation: T1 bahut time tak Exclusive lock rakhta hai. Is wajah se baaki sab (T2, T3, T4) wait karte rehte hain โ€” yeh hai starvation.

๐Ÿ’ฅ Irrecoverability Example:

T1: Write(A)
T2: Read(A)      // โš  Uncommitted data from T1
T1: Crash ๐Ÿ’€
// Now T2 has used invalid data
  

Explanation: T2 ne T1 ke write kiya hua data use kiya, lekin T1 crash ho gaya. Ab T2 ka data bhi galat ho gaya. Yeh irrecoverable situation hai.

๐ŸŒŠ Cascading Rollback Example:

T1: Write(A)
T2: Read(A)
T3: Read(A)

T1: โŒ Abort (Crash or Rollback)
// T2 and T3 must also Rollback!
  

Explanation: T2 aur T3 ne T1 ke data ko read kiya, jo ab rollback ho gaya. Ab T2 aur T3 bhi rollback karne padenge โ€” isey bolte hain cascading rollback.

โœ… Advantages of 2PL:

  • Guarantees conflict serializability
  • Prevents lost updates and dirty reads
  • Widely used in real-world DBMS like MySQL, Oracle

โŒ Limitations of 2PL:

  • Deadlocks: Still possible due to lock waits
  • Cascading Rollbacks: Happen if one transaction reads uncommitted data
  • Starvation: Long-running transactions can block others
  • Reduced Concurrency: Overall system slow ho sakta hai

๐Ÿ›ก๏ธ5. Strict Two-Phase Locking (Strict 2PL)

Strict 2PL is an enhanced version of 2PL that solves the cascading rollback problem.

๐Ÿ”’ Strict 2PL Rule: A transaction must not release any exclusive (write) locks until it commits or aborts.

๐Ÿง  Example: Strict 2PL vs Regular 2PL

Regular 2PL:

T1: Lock-X(A), Write(A), Unlock(A), ... , Commit
T2:                      Lock-S(A), Read(A), ...
    // T2 might read uncommitted data if T1 aborts later

Strict 2PL:

T1: Lock-X(A), Write(A), ..., Commit, Unlock(A)
T2:                                   Lock-S(A), Read(A)
    // T2 can only read after T1 commits

โœ… Advantages of Strict 2PL:

  • Prevents cascading rollbacks
  • Ensures recoverability
  • Simplifies recovery procedures
  • Guarantees both conflict serializability and recoverability

โš ๏ธ Trade-offs:

  • Increased blocking time (locks held longer)
  • Potential for more deadlocks
  • Reduced concurrency compared to regular 2PL

โฐ6. Timestamp Ordering Protocol

Timestamp Ordering ek aisa protocol hai jisme locking ka use nahi hota. Har transaction ko ek unique timestamp milta hai, aur execution isi order me hona chahiye.

๐Ÿงญ Basic Idea: Har transaction ko start hone par ek unique timestamp milta hai. Fir har operation (read/write) ke decision isi timestamp ke base par liye jaate hain.

๐Ÿ“‹ How It Works:

Har data item X ke saath 2 timestamps maintain hote hain:

  • R-timestamp(X): Sabse latest transaction jiske ne X ko read kiya.
  • W-timestamp(X): Sabse latest transaction jiske ne X ko write kiya.

๐Ÿ”„ Protocol Rules:

๐Ÿ“– For Read Operation by Ti:

if (TS(Ti) < W-timestamp(X)) {
    // Ti purana hai aur wo kisi naye transaction ke write ko read kar raha hai
    ABORT Ti;
} else {
    // Read allow karo
    R-timestamp(X) = max(R-timestamp(X), TS(Ti));
}
    

โœ๏ธ For Write Operation by Ti:

if (TS(Ti) < R-timestamp(X) || TS(Ti) < W-timestamp(X)) {
    // Ti purana hai aur wo kisi naye transaction ke read/write data ko overwrite kar raha hai
    ABORT Ti;
} else {
    // Write allow karo
    W-timestamp(X) = TS(Ti);
}
    

๐Ÿง  Example: Timestamp Ordering

Assume karte hain 3 transactions hain with timestamps:

T1 (TS = 5)
T2 (TS = 10)
T3 (TS = 15)

Initially:
R-timestamp(A) = 0
W-timestamp(A) = 0
    
๐Ÿ”น Step 1: T2 wants to Read(A)
    TS(T2) = 10 > W-timestamp(A) = 0 โœ…
    โžค R-timestamp(A) = 10

๐Ÿ”น Step 2: T1 wants to Write(A)
    TS(T1) = 5 < R-timestamp(A) = 10 โŒ
    โžค ABORT T1 (older T1 trying to overwrite data read by newer T2)

๐Ÿ”น Step 3: T3 wants to Write(A)
    TS(T3) = 15 > R-timestamp(A) = 10 โœ…
    โžค W-timestamp(A) = 15
    

Conclusion: Timestamp protocol ensure karta hai ki execution order conflict-serializable ho, lekin purane transactions ko abort kar diya jaata hai agar woh naya data access karna chahein.

โœ… Advantages:

  • ๐Ÿ’€ No deadlocks (koi wait nahi karta โ€” directly abort hota hai)
  • ๐Ÿ”“ No lock management needed
  • โœ… Conflict serializability ensure hoti hai

โŒ Disadvantages:

  • High Abort Rate: Purane transactions baar-baar abort ho sakte hain
  • Starvation: Older transactions may never finish
  • Overhead: Har data item ke liye timestamps maintain karna padta hai
  • Cascading Rollbacks: Agar abort hua, toh doosre dependent transactions bhi rollback ho sakte hain

๐Ÿ“‹7. Summary & Comparison

๐Ÿ› ๏ธ Protocol ๐ŸŽฏ Purpose โœ… Advantages โš ๏ธ Disadvantages ๐Ÿ“Œ Use Case
Basic Locking Control access to data Simple and flexible Prone to deadlocks and cascading rollbacks Small/simple apps with limited concurrency
Two-Phase Locking (2PL) Ensure serializability Guarantees conflict serializability Can cause deadlocks and cascading rollbacks Used in most commercial DBMS (MySQL, Oracle)
Strict 2PL Ensure recoverability Prevents cascading rollbacks Reduces concurrency Systems needing safe recovery (e.g., banking)
Timestamp Ordering Order transactions without locks No deadlocks, no locking required High abort rate, starvation possible Read-heavy or low-write environments
๐ŸŽฏ Key Takeaways:
  • โœ”๏ธ Serialization ensures correctness during concurrent transactions.
  • โœ”๏ธ Conflict Serializability is practically used to check schedule safety.
  • ๐Ÿ” Locking Protocols reduce concurrency to maintain consistency.
  • ๐Ÿ’ก Strict 2PL is the most common strategy in real-world systems.
  • โฐ Timestamp Ordering offers a lock-free solution but at a cost.