Skip to main content

Deadlocks

A deadlock is a situation where two or more threads are waiting indefinitely for resources held by each other. Understanding deadlocks is essential for senior engineers — they can bring entire systems down.
Interview Frequency: High
Key Topics: Four conditions, Banker’s algorithm, prevention strategies
Time to Master: 6-8 hours

Deadlock Fundamentals

Classic Deadlock Scenario

Deadlock Example

The Four Necessary Conditions

Coffman Conditions: ALL four must hold for deadlock to occur. Breaking ANY one prevents deadlock.

1. Mutual Exclusion

Resources can only be held by one thread at a time.Example: A mutex can only be held by one thread.

2. Hold and Wait

Thread holds resources while waiting for others.Example: Thread holds mutex1 while waiting for mutex2.

3. No Preemption

Resources can only be released voluntarily.Example: Can’t force a thread to release its mutex.

4. Circular Wait

Circular chain of threads waiting for each other.Example: A waits for B, B waits for A.

Resource Allocation Graph

Visualize resource allocation to detect deadlocks: Resource Allocation Graph Detection rule:
  • Single instance per resource: Cycle = Deadlock
  • Multiple instances: Cycle is necessary but not sufficient

Deadlock Prevention

Break one of the four conditions:

1. Break Mutual Exclusion

// Use lock-free data structures
// Example: Atomic counter instead of mutex-protected counter

#include <stdatomic.h>
atomic_int counter = 0;

void increment() {
    atomic_fetch_add(&counter, 1);  // No mutex needed
}
Limitation: Not all resources can be shared (e.g., printer, write access)

2. Break Hold and Wait

// Acquire ALL locks at once, or none

typedef struct {
    pthread_mutex_t mutex;
    int id;
} resource_t;

bool acquire_all(resource_t *resources[], int count) {
    // Try to lock all
    for (int i = 0; i < count; i++) {
        if (pthread_mutex_trylock(&resources[i]->mutex) != 0) {
            // Failed to acquire one, release all held
            for (int j = 0; j < i; j++) {
                pthread_mutex_unlock(&resources[j]->mutex);
            }
            return false;  // Retry later
        }
    }
    return true;  // Got all resources
}
Limitation: Low resource utilization, may cause starvation

3. Break No Preemption

// If can't get resource, release what you have

bool acquire_with_preemption(pthread_mutex_t *m1, pthread_mutex_t *m2) {
    while (1) {
        pthread_mutex_lock(m1);
        
        if (pthread_mutex_trylock(m2) == 0) {
            return true;  // Got both
        }
        
        // Can't get m2, release m1 (preempt self)
        pthread_mutex_unlock(m1);
        
        // Back off to avoid livelock
        usleep(random() % 1000);
    }
}
Limitation: Only works for restartable resources (not printers, etc.)

4. Break Circular Wait (Lock Ordering)

// Always acquire locks in consistent order

// Define global ordering by resource address
void safe_lock_two(pthread_mutex_t *a, pthread_mutex_t *b) {
    if (a < b) {
        pthread_mutex_lock(a);
        pthread_mutex_lock(b);
    } else {
        pthread_mutex_lock(b);
        pthread_mutex_lock(a);
    }
}

// Or use explicit priority
#define LOCK_PRIORITY_DB 1
#define LOCK_PRIORITY_CACHE 2
#define LOCK_PRIORITY_LOG 3

// Always lock in priority order: DB → Cache → Log
This is the most practical prevention method!

Deadlock Avoidance

Allow the system to deny requests that COULD lead to deadlock.

Safe State

A state is safe if there exists a sequence where all processes can complete: Safe vs Unsafe State

Banker’s Algorithm

// System state
int n_processes, n_resources;
int available[n_resources];          // Available resources
int max[n_processes][n_resources];   // Maximum demand
int allocation[n_processes][n_resources];  // Currently allocated
int need[n_processes][n_resources];  // max - allocation

bool is_safe_state() {
    bool finish[n_processes] = {false};
    int work[n_resources];
    memcpy(work, available, sizeof(available));
    
    int completed = 0;
    while (completed < n_processes) {
        bool found = false;
        
        for (int i = 0; i < n_processes; i++) {
            if (finish[i]) continue;
            
            // Check if process i can finish with available resources
            bool can_finish = true;
            for (int j = 0; j < n_resources; j++) {
                if (need[i][j] > work[j]) {
                    can_finish = false;
                    break;
                }
            }
            
            if (can_finish) {
                // Simulate process completing
                for (int j = 0; j < n_resources; j++) {
                    work[j] += allocation[i][j];
                }
                finish[i] = true;
                completed++;
                found = true;
            }
        }
        
        if (!found) return false;  // No process can finish
    }
    
    return true;  // Found safe sequence
}

bool request_resources(int process_id, int request[]) {
    // Check if request exceeds need
    for (int j = 0; j < n_resources; j++) {
        if (request[j] > need[process_id][j]) {
            return false;  // Error: exceeds maximum claim
        }
        if (request[j] > available[j]) {
            return false;  // Must wait
        }
    }
    
    // Pretend to allocate
    for (int j = 0; j < n_resources; j++) {
        available[j] -= request[j];
        allocation[process_id][j] += request[j];
        need[process_id][j] -= request[j];
    }
    
    // Check if resulting state is safe
    if (is_safe_state()) {
        return true;  // Grant request
    }
    
    // Not safe, rollback
    for (int j = 0; j < n_resources; j++) {
        available[j] += request[j];
        allocation[process_id][j] -= request[j];
        need[process_id][j] += request[j];
    }
    
    return false;  // Deny request
}
Complexity: O(n² × m) where n = processes, m = resource types

Deadlock Detection

Instead of preventing/avoiding, detect and recover:

Detection Algorithm

// Similar to Banker's safety check
bool detect_deadlock() {
    bool finish[n_processes];
    int work[n_resources];
    
    memcpy(work, available, sizeof(available));
    
    // Initially, only processes with no allocation are "finished"
    for (int i = 0; i < n_processes; i++) {
        bool has_allocation = false;
        for (int j = 0; j < n_resources; j++) {
            if (allocation[i][j] > 0) {
                has_allocation = true;
                break;
            }
        }
        finish[i] = !has_allocation;
    }
    
    // Find processes that can complete
    bool changed;
    do {
        changed = false;
        for (int i = 0; i < n_processes; i++) {
            if (finish[i]) continue;
            
            bool can_finish = true;
            for (int j = 0; j < n_resources; j++) {
                if (request[i][j] > work[j]) {
                    can_finish = false;
                    break;
                }
            }
            
            if (can_finish) {
                finish[i] = true;
                for (int j = 0; j < n_resources; j++) {
                    work[j] += allocation[i][j];
                }
                changed = true;
            }
        }
    } while (changed);
    
    // Any unfinished process is deadlocked
    for (int i = 0; i < n_processes; i++) {
        if (!finish[i]) return true;  // Deadlock detected
    }
    return false;
}

When to Run Detection

StrategyFrequencyTrade-off
Every requestImmediate detectionHigh overhead
Periodic (every N seconds)BalanceMay delay detection
When CPU utilization dropsTriggeredResource dependent

Deadlock Recovery

Once detected, how to break the deadlock:

1. Process Termination

void recover_by_termination() {
    // Option A: Kill all deadlocked processes (drastic)
    for (int i = 0; i < n_processes; i++) {
        if (is_deadlocked(i)) {
            kill_process(i);
        }
    }
    
    // Option B: Kill one at a time until deadlock resolved
    while (detect_deadlock()) {
        int victim = select_victim();  // Based on priority, resources, etc.
        kill_process(victim);
    }
}

int select_victim() {
    // Criteria for victim selection:
    // 1. Priority (kill lowest priority)
    // 2. Resources held (kill one with most)
    // 3. Progress (kill one with least work done)
    // 4. Resources needed (kill one needing most)
    // 5. Interactive vs batch (prefer killing batch)
}

2. Resource Preemption

void recover_by_preemption() {
    // Take resources from some processes, give to others
    
    while (detect_deadlock()) {
        int victim = select_victim();
        
        // Roll back victim to safe state
        rollback_process(victim);
        
        // Reallocate its resources
        for (int j = 0; j < n_resources; j++) {
            available[j] += allocation[victim][j];
            allocation[victim][j] = 0;
        }
    }
}
Challenges:
  • Rolling back may lose work
  • Same process may be selected repeatedly (starvation)
  • Need checkpointing for rollback

Livelock

Threads keep changing state but make no progress:
// LIVELOCK EXAMPLE
void thread_a() {
    while (1) {
        lock(mutex_a);
        if (trylock(mutex_b) == 0) {
            // Got both
            do_work();
            unlock(mutex_b);
            unlock(mutex_a);
            return;
        }
        unlock(mutex_a);  // Release and retry
        // Both threads keep doing this forever!
    }
}

// SOLUTION: Add randomized backoff
void thread_a_fixed() {
    while (1) {
        lock(mutex_a);
        if (trylock(mutex_b) == 0) {
            do_work();
            unlock(mutex_b);
            unlock(mutex_a);
            return;
        }
        unlock(mutex_a);
        usleep(random() % 1000);  // Random backoff
    }
}

Starvation

A thread never gets the resources it needs:
// STARVATION EXAMPLE
// Reader-writer lock with reader preference
// Writers may starve if readers keep coming

// SOLUTION: Fair scheduling
// - Limit how long readers can hold lock
// - Give waiting writers priority after threshold
// - Use fair lock (FIFO ordering)

Priority Inversion

Low-priority thread blocks high-priority thread:
High priority (H):   Waiting for lock held by L
Medium priority (M): Running (doesn't need lock)
Low priority (L):    Holds lock, preempted by M

H is blocked by M (indirectly)!
Solutions:
  • Priority inheritance: L temporarily gets H’s priority
  • Priority ceiling: Lock has priority, holder gets it

Interview Deep Dive Questions

Answer:For database transactions:
-- 1. Lock ordering by table name/ID
-- Always lock accounts before orders
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
SELECT * FROM orders WHERE account_id = 1 FOR UPDATE;
-- Work...
COMMIT;

-- 2. Lock timeout
SET lock_timeout = '5s';
-- Transaction aborts if can't get lock in 5s

-- 3. Deadlock detection (PostgreSQL does this)
-- Detect cycles, abort youngest transaction
For application locks:
// 1. Lock ordering
// Define consistent order based on resource ID
void transfer(Account *from, Account *to, int amount) {
    Account *first = from->id < to->id ? from : to;
    Account *second = from->id < to->id ? to : from;
    
    lock(&first->mutex);
    lock(&second->mutex);
    // Transfer...
    unlock(&second->mutex);
    unlock(&first->mutex);
}

// 2. Try-lock with timeout
bool acquire_or_abort(mutex_t *locks[], int count) {
    for (int i = 0; i < count; i++) {
        if (!try_lock_timeout(&locks[i], 100)) {
            // Timeout, release all and abort
            for (int j = 0; j < i; j++) {
                unlock(locks[j]);
            }
            return false;
        }
    }
    return true;
}
Answer:
Resources: A=10, B=5, C=7

Process  Allocation   Max      Need     
         A  B  C    A  B  C   A  B  C   
P0       0  1  0    7  5  3   7  4  3   
P1       2  0  0    3  2  2   1  2  2   
P2       3  0  2    9  0  2   6  0  0   
P3       2  1  1    2  2  2   0  1  1   
P4       0  0  2    4  3  3   4  3  1   

Available: A=3, B=3, C=2

Finding safe sequence:

1. Check P0: Need (7,4,3) > Available (3,3,2)? YES → Skip
2. Check P1: Need (1,2,2) ≤ Available (3,3,2)? YES
   → P1 can complete
   → Work = (3,3,2) + (2,0,0) = (5,3,2)

3. Check P0: (7,4,3) > (5,3,2)? YES → Skip
4. Check P2: (6,0,0) > (5,3,2)? YES → Skip  
5. Check P3: (0,1,1) ≤ (5,3,2)? YES
   → P3 can complete
   → Work = (5,3,2) + (2,1,1) = (7,4,3)

6. Check P0: (7,4,3) ≤ (7,4,3)? YES
   → P0 can complete
   → Work = (7,4,3) + (0,1,0) = (7,5,3)

7. Check P2: (6,0,0) ≤ (7,5,3)? YES
   → Work = (7,5,3) + (3,0,2) = (10,5,5)

8. Check P4: (4,3,1) ≤ (10,5,5)? YES
   → P4 can complete

Safe sequence: <P1, P3, P0, P2, P4>
System is in safe state!
Request evaluation:
P1 requests (1, 0, 2):

1. Check: Request ≤ Need? (1,0,2) ≤ (1,2,2)? YES
2. Check: Request ≤ Available? (1,0,2) ≤ (3,3,2)? YES
3. Pretend to allocate:
   Allocation[P1] = (2,0,0) + (1,0,2) = (3,0,2)
   Available = (3,3,2) - (1,0,2) = (2,3,0)
   Need[P1] = (1,2,2) - (1,0,2) = (0,2,0)

4. Check if safe: Run safety algorithm...
   → If safe, grant request
   → If not safe, deny and rollback
Answer:PostgreSQL approach:
  1. Wait-for graph: Track which transaction waits for which
  2. Deadlock detection: Periodically (every deadlock_timeout, default 1s) check for cycles
  3. Victim selection: Abort transaction that:
    • Has done least work (youngest)
    • Holds fewest locks
    • Is easiest to rollback
-- Example deadlock
-- Session 1                    -- Session 2
BEGIN;                          BEGIN;
UPDATE accounts SET             UPDATE accounts SET
  balance = 100                   balance = 200
  WHERE id = 1;                   WHERE id = 2;

-- Session 1 waits              -- Session 2 waits
UPDATE accounts SET             UPDATE accounts SET
  balance = 150                   balance = 250
  WHERE id = 2;                   WHERE id = 1;

-- DEADLOCK! PostgreSQL detects and aborts one:
-- ERROR: deadlock detected
-- DETAIL: Process 1234 waits for ShareLock on transaction 5678;
--         Process 5678 waits for ShareLock on transaction 1234.
-- HINT: See server log for query details.
Prevention strategies:
-- 1. Lock ordering
-- Always update rows in consistent order (by ID)
SELECT * FROM accounts WHERE id IN (1, 2) 
ORDER BY id FOR UPDATE;

-- 2. Lock timeout
SET lock_timeout = '5s';

-- 3. Use NOWAIT
SELECT * FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;
-- Fails immediately if lock not available

-- 4. Advisory locks for complex operations
SELECT pg_advisory_lock(12345);
-- ... operations ...
SELECT pg_advisory_unlock(12345);
Answer:Mars Pathfinder (1997):
System had three threads:
- High priority: Communication task
- Medium priority: Data gathering
- Low priority: Meteorological task (held shared bus lock)

What happened:
1. Low-priority task acquires bus lock
2. High-priority task preempts, needs bus lock, blocks
3. Medium-priority task preempts low-priority
4. Medium runs indefinitely
5. Low can't release lock (preempted)
6. High starves, watchdog timer triggers reset

Result: Spacecraft kept resetting!
Solutions implemented:
  1. Priority Inheritance:
void lock_with_inheritance(mutex_t *m) {
    disable_preemption();
    
    if (m->holder != NULL) {
        // Boost holder's priority to ours
        if (current->priority > m->holder->priority) {
            m->holder->effective_priority = current->priority;
        }
    }
    
    // Now wait for lock
    while (!try_lock(m)) {
        sleep_on_mutex(m);
    }
    
    enable_preemption();
}

void unlock_with_inheritance(mutex_t *m) {
    // Restore original priority
    current->effective_priority = current->base_priority;
    m->holder = NULL;
    wake_waiters(m);
}
  1. Priority Ceiling Protocol:
// Each mutex has ceiling = highest priority of any thread that uses it

void lock_with_ceiling(mutex_t *m) {
    // Immediately boost to ceiling priority
    saved_priority = current->priority;
    current->priority = m->ceiling;
    
    acquire_lock(m);
}

void unlock_with_ceiling(mutex_t *m) {
    release_lock(m);
    current->priority = saved_priority;
}
Mars Pathfinder fix: Enabled priority inheritance on VxWorks mutexes via remote patch upload!
Answer:
// Lock wrapper with deadlock detection

#include <pthread.h>
#include <stdbool.h>

#define MAX_THREADS 100
#define MAX_LOCKS 100

typedef struct {
    pthread_mutex_t internal_mutex;
    int id;
    int holder_tid;  // Thread holding this lock
} tracked_mutex_t;

// Adjacency matrix for wait-for graph
// waits_for[i][j] = true if thread i waits for thread j
bool waits_for[MAX_THREADS][MAX_THREADS];
pthread_mutex_t graph_lock = PTHREAD_MUTEX_INITIALIZER;

bool has_cycle_dfs(int node, bool visited[], bool in_stack[]) {
    visited[node] = true;
    in_stack[node] = true;
    
    for (int i = 0; i < MAX_THREADS; i++) {
        if (waits_for[node][i]) {
            if (!visited[i]) {
                if (has_cycle_dfs(i, visited, in_stack)) {
                    return true;
                }
            } else if (in_stack[i]) {
                return true;  // Back edge = cycle!
            }
        }
    }
    
    in_stack[node] = false;
    return false;
}

bool detect_deadlock() {
    bool visited[MAX_THREADS] = {false};
    bool in_stack[MAX_THREADS] = {false};
    
    for (int i = 0; i < MAX_THREADS; i++) {
        if (!visited[i]) {
            if (has_cycle_dfs(i, visited, in_stack)) {
                return true;
            }
        }
    }
    return false;
}

int tracked_lock(tracked_mutex_t *m) {
    int my_tid = get_thread_id();
    
    // Try to acquire
    if (pthread_mutex_trylock(&m->internal_mutex) == 0) {
        m->holder_tid = my_tid;
        return 0;
    }
    
    // Update wait-for graph
    pthread_mutex_lock(&graph_lock);
    waits_for[my_tid][m->holder_tid] = true;
    
    if (detect_deadlock()) {
        waits_for[my_tid][m->holder_tid] = false;
        pthread_mutex_unlock(&graph_lock);
        return EDEADLK;  // Would deadlock!
    }
    pthread_mutex_unlock(&graph_lock);
    
    // Actually wait
    pthread_mutex_lock(&m->internal_mutex);
    
    // Got lock, update graph
    pthread_mutex_lock(&graph_lock);
    waits_for[my_tid][m->holder_tid] = false;
    m->holder_tid = my_tid;
    pthread_mutex_unlock(&graph_lock);
    
    return 0;
}

Practice Exercises

1

Banker's Simulator

Implement Banker’s algorithm. Test with various request sequences.
2

Deadlock Detector

Build a system that wraps pthread_mutex and detects potential deadlocks.
3

Dining Philosophers

Solve with different strategies: lock ordering, waiter, chandy-misra.
4

Priority Inheritance

Implement mutex with priority inheritance protocol.

Key Takeaways

Four Conditions

Mutual exclusion, hold-and-wait, no preemption, circular wait. Break any one.

Lock Ordering Works

Most practical prevention. Define consistent order, always follow it.

Detection + Recovery

For complex systems. Run detection periodically, abort victim.

Databases Handle It

Built-in deadlock detection. Abort youngest transaction on cycle.

Next: Inter-Process Communication