The Linux kernel is massively concurrent - multiple CPUs executing kernel code simultaneously, interrupts preempting at any moment, and softirqs running in parallel. Understanding synchronization primitives is essential for reading kernel code and debugging race conditions.
Interview Frequency: Very High (critical for systems roles)
Key Topics: Spinlocks, mutexes, RCU, memory barriers, deadlock debugging
Time to Master: 14-16 hours
```c
// Basic (use in process context when not sharing with interrupts)
spin_lock(&lock);
spin_unlock(&lock);

// Disable softirqs (use when sharing with softirq context)
spin_lock_bh(&lock);      // Disable bottom halves
spin_unlock_bh(&lock);    // Re-enable bottom halves

// Disable all interrupts (use when sharing with hardirq context)
unsigned long flags;
spin_lock_irqsave(&lock, flags);       // Save IRQ state, disable IRQs
// ... critical section ...
spin_unlock_irqrestore(&lock, flags);  // Restore IRQ state

// Disable interrupts (unconditionally)
spin_lock_irq(&lock);     // Assumes IRQs were enabled
spin_unlock_irq(&lock);   // Re-enables IRQs

// Warning: don't nest the irq variants wrong!
spin_lock_irq(&lock1);
spin_lock_irq(&lock2);    // WRONG! First unlock will re-enable IRQs
spin_unlock_irq(&lock2);  // IRQs now enabled, but lock1 still held!
spin_unlock_irq(&lock1);

// Correct: use irqsave (or plain spin_lock) for inner locks
spin_lock_irq(&lock1);
spin_lock(&lock2);        // IRQs already disabled
spin_unlock(&lock2);
spin_unlock_irq(&lock1);
```
```
WHICH SPINLOCK VARIANT TO USE?

Q: Is the lock ever acquired in hardirq context?
│
├─ YES → Always use spin_lock_irqsave() / spin_unlock_irqrestore()
│        (Even from process context - must disable IRQs to prevent
│        deadlock if an interrupt tries to acquire the same lock)
│
└─ NO → Is the lock ever acquired in softirq/tasklet context?
   │
   ├─ YES → Use spin_lock_bh() / spin_unlock_bh() from process ctx
   │        Use spin_lock() / spin_unlock() from softirq ctx
   │
   └─ NO → Use spin_lock() / spin_unlock()
           (But consider whether a mutex would be better)

DANGER: Never sleep while holding a spinlock!
        The CPU spins, so other CPUs waste time waiting.
```
For data with very frequent reads and rare writes:
```c
#include <linux/seqlock.h>

static DEFINE_SEQLOCK(my_seqlock);

// Writer (takes exclusive lock)
write_seqlock(&my_seqlock);
// ... modify data ...
write_sequnlock(&my_seqlock);

// Reader (lockless, may need to retry)
unsigned int seq;
do {
    seq = read_seqbegin(&my_seqlock);
    // ... read data into local variables ...
} while (read_seqretry(&my_seqlock, seq));
// If read_seqretry() returns true, a write occurred during the read
// and we must retry
```
RCU is Linux’s secret weapon for read-mostly data structures. It allows readers to access data without locks while writers make copies and update pointers atomically.
```c
// Without barriers, this might be reordered!
int ready = 0;
int data = 0;

// CPU 1 (writer)
data = 42;       // Might become visible after ready = 1!
ready = 1;

// CPU 2 (reader)
while (!ready);  // Might read ready == 1 but...
print(data);     // ...data still shows 0!

// With barriers
// CPU 1 (writer)
data = 42;
smp_wmb();       // Write barrier: data visible before ready
ready = 1;

// CPU 2 (reader)
while (!ready);
smp_rmb();       // Read barrier: read data after ready
print(data);     // Now guaranteed to see 42
```
```c
#include <asm/barrier.h>

// Full memory barrier (orders both reads and writes)
mb();        // Barrier for all memory accesses
smp_mb();    // Only emitted on SMP systems

// Write barrier (writes before it cannot pass writes after it)
wmb();
smp_wmb();

// Read barrier (reads before it cannot pass reads after it)
rmb();
smp_rmb();

// Acquire/release semantics (lighter than full barriers)
smp_load_acquire(&variable);          // Reads after this see writes before the paired release
smp_store_release(&variable, value);  // Writes before this are visible after the paired acquire

// Compiler-only barrier (doesn't affect the CPU)
barrier();           // Prevents compiler reordering
READ_ONCE(x);        // Prevents the compiler from caching/reordering the read
WRITE_ONCE(x, val);  // Prevents the compiler from tearing/reordering the write
```
```
=============================================
WARNING: possible recursive locking detected
---------------------------------------------
kworker/0:1/1234 is trying to acquire lock:
 (&my_lock){+.+.}, at: my_function+0x20/0x100

but task is already holding lock:
 (&my_lock){+.+.}, at: my_function+0x10/0x100

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&my_lock);
  lock(&my_lock);   <-- Recursive!

 *** DEADLOCK ***
```
Better for longer waits (sleeping is more efficient than spinning)
Key difference: Spinlocks busy-wait (waste CPU), mutexes sleep (yield CPU). Spinlocks are faster for very short holds; mutexes are better for anything longer.
Q: Explain RCU and when you'd use it
Answer: RCU (Read-Copy-Update) provides lockless reads for read-mostly data.

How it works:
Readers use rcu_read_lock() - just disables preemption, no actual lock
Writers wait for grace period (all pre-existing readers finish)
Writers free old data
Use when:
Reads vastly outnumber writes
Can afford to copy data on write
Updates are pointer-based (not field-by-field)
Examples: Routing tables, configuration data, module lists

Advantage: Near-zero overhead for readers on the fast path
Q: What causes a deadlock and how do you prevent it?
Answer: Four conditions for deadlock (all must be true):
Mutual exclusion: Resource can only be held by one thread
Hold and wait: Thread holds one lock while waiting for another
No preemption: Locks cannot be forcibly taken
Circular wait: A→B→C→A dependency chain
Prevention strategies:
Lock ordering: Always acquire locks in the same global order
Lock hierarchy: Document and enforce lock levels
Try-lock with backoff: Use trylock, release and retry if fails
Avoid holding locks while calling unknown code
Detection: Use lockdep (CONFIG_PROVE_LOCKING=y)
Q: Why do you need memory barriers?
Answer: CPUs and compilers reorder memory operations for performance. On multi-core systems, one CPU can observe operations in a different order than the order in which another CPU performed them.

Example without barriers:
```c
// CPU 1                 // CPU 2
data = 42;               while (!flag);
flag = 1;                print(data);   // Might print 0!
```
Why it fails: CPU 1 might reorder the stores (flag = 1 becomes visible before data = 42), or CPU 2 might reorder its loads and read data before flag.

Solution:
```c
// CPU 1                          // CPU 2
data = 42;                        while (!READ_ONCE(flag));
smp_wmb();                        smp_rmb();
WRITE_ONCE(flag, 1);              print(data);   // Now guaranteed 42
```