Deadlocks
A deadlock is a situation where two or more threads are waiting indefinitely for resources held by each other. Understanding deadlocks is essential for senior engineers — they can bring entire systems down.Interview Frequency: High
Key Topics: Four conditions, Banker’s algorithm, prevention strategies
Time to Master: 6-8 hours
Key Topics: Four conditions, Banker’s algorithm, prevention strategies
Time to Master: 6-8 hours
Deadlock Fundamentals
Classic Deadlock Scenario
The Four Necessary Conditions
1. Mutual Exclusion
Resources can only be held by one thread at a time.Example: A mutex can only be held by one thread.
2. Hold and Wait
Thread holds resources while waiting for others.Example: Thread holds mutex1 while waiting for mutex2.
3. No Preemption
Resources can only be released voluntarily.Example: Can’t force a thread to release its mutex.
4. Circular Wait
Circular chain of threads waiting for each other.Example: A waits for B, B waits for A.
Resource Allocation Graph
Visualize resource allocation to detect deadlocks:- Single instance per resource: Cycle = Deadlock
- Multiple instances: Cycle is necessary but not sufficient
Deadlock Prevention
Break one of the four conditions:1. Break Mutual Exclusion
2. Break Hold and Wait
3. Break No Preemption
4. Break Circular Wait (Lock Ordering)
Deadlock Avoidance
Allow the system to deny requests that COULD lead to deadlock.Safe State
A state is safe if there exists a sequence where all processes can complete:Banker’s Algorithm
Deadlock Detection
Instead of preventing/avoiding, detect and recover:Detection Algorithm
When to Run Detection
| Strategy | Frequency | Trade-off |
|---|---|---|
| Every request | Immediate detection | High overhead |
| Periodic (every N seconds) | Balance | May delay detection |
| When CPU utilization drops | Triggered | Resource dependent |
Deadlock Recovery
Once detected, how to break the deadlock:1. Process Termination
2. Resource Preemption
- Rolling back may lose work
- Same process may be selected repeatedly (starvation)
- Need checkpointing for rollback
Locking Design Playbook
This section is a practical checklist for designing locking in a new subsystem so you avoid deadlocks up front.Step 1: Identify Resources and Contention Points
- List all shared data structures and external resources (sockets, files, devices) that multiple threads touch.
- For each, decide:
- Is it mostly read or write heavy?
- How large is the critical section?
- Is ordering important (e.g., parent → child, global → local)?
Step 2: Define a Global Lock Ordering
- Assign each lock a total order (e.g.,
GLOBAL<DB<CACHE<LOG). - Enforce the rule: always acquire locks in increasing order.
- Document this in comments and design docs so future contributors follow it.
Step 3: Choose the Right Primitive
- Coarse mutex for simple systems with low contention.
- Fine-grained locks where contention hotspots are identified.
- Reader-writer locks for read-mostly data.
- Lock-free / RCU only when the performance benefit justifies the complexity.
Step 4: Add Timeouts and Diagnostics
- Prefer
trylock+ backoff in non-critical paths where waiting forever is unacceptable. - Add lock acquisition logging at debug level for tricky subsystems.
- In the kernel, enable tools like lockdep; in user space, integrate watch-dog threads.
Step 5: Simulate and Test
- Write stress tests that:
- Start many threads acquiring locks in randomized orders (still following your documented rules).
- Intentionally break the ordering in a test-only build to ensure your detection mechanisms fire.
- Use tools like
helgrind,TSan, or kernel lock validators where applicable.
Related Problems
Livelock
Threads keep changing state but make no progress:Starvation
A thread never gets the resources it needs:Priority Inversion
Low-priority thread blocks high-priority thread:- Priority inheritance: L temporarily gets H’s priority
- Priority ceiling: Lock has priority, holder gets it
Interview Deep Dive Questions
Q1: Design a system that's deadlock-free
Q1: Design a system that's deadlock-free
Answer:For database transactions:For application locks:
Q2: Walk through Banker's algorithm with an example
Q2: Walk through Banker's algorithm with an example
Answer:Request evaluation:
Q3: How does a database handle deadlocks?
Q3: How does a database handle deadlocks?
Answer:PostgreSQL approach:Prevention strategies:
- Wait-for graph: Track which transaction waits for which
-
Deadlock detection: Periodically (every
deadlock_timeout, default 1s) check for cycles -
Victim selection: Abort transaction that:
- Has done least work (youngest)
- Holds fewest locks
- Is easiest to rollback
Q4: Explain priority inversion with a real-world example
Q4: Explain priority inversion with a real-world example
Answer:Mars Pathfinder (1997):Solutions implemented:Mars Pathfinder fix: Enabled priority inheritance on VxWorks mutexes via remote patch upload!
- Priority Inheritance:
- Priority Ceiling Protocol:
Q5: Implement a deadlock detection system
Q5: Implement a deadlock detection system
Answer:
Practice Exercises
Key Takeaways
Four Conditions
Mutual exclusion, hold-and-wait, no preemption, circular wait. Break any one.
Lock Ordering Works
Most practical prevention. Define consistent order, always follow it.
Detection + Recovery
For complex systems. Run detection periodically, abort victim.
Databases Handle It
Built-in deadlock detection. Abort youngest transaction on cycle.
Next: Inter-Process Communication →