
Operating Systems Interview Preparation

This guide covers the most frequently asked OS interview questions at top tech companies, organized by topic with detailed answers and common follow-ups.
Target: Senior/Staff Engineering Interviews
Companies: FAANG, startups, systems companies
Preparation Time: 20-30 hours across all topics

Interview Question Patterns

┌─────────────────────────────────────────────────────────────────┐
│                    OS INTERVIEW CATEGORIES                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  1. CONCEPTUAL (40%)                                            │
│     "Explain how virtual memory works"                          │
│     "What happens when you call malloc?"                        │
│                                                                  │
│  2. DESIGN (30%)                                                │
│     "Design a thread pool"                                      │
│     "Implement a simple scheduler"                              │
│                                                                  │
│  3. DEBUGGING (20%)                                             │
│     "This program deadlocks. Why?"                              │
│     "Why is this server slow?"                                  │
│                                                                  │
│  4. LINUX-SPECIFIC (10%)                                        │
│     "Explain cgroups"                                           │
│     "What is /proc?"                                            │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Top 20 OS Interview Questions

Process & Threading

Q: What is the difference between a process and a thread?

Answer:

Aspect         | Process                 | Thread
---------------|-------------------------|----------------------
Memory         | Separate address space  | Shared address space
Creation       | Expensive (fork)        | Cheap
Context switch | Expensive (TLB flush)   | Cheap
Communication  | IPC needed              | Shared memory
Crash impact   | Isolated                | Affects all threads
When to use each:
  • Processes: Isolation needed (security), different languages, crash isolation
  • Threads: Shared state, low latency communication, same codebase
Follow-up: “How does fork() work?” (see the sketch below)
  • Creates a copy of the parent’s address space (copy-on-write)
  • In the child, fork() returns 0
  • In the parent, fork() returns the child’s PID
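A minimal sketch of this return-value convention:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        printf("child: fork() returned 0, my PID is %d\n", (int)getpid());
        return 0;
    } else if (pid > 0) {
        printf("parent: fork() returned child PID %d\n", (int)pid);
        wait(NULL);          /* reap the child */
    } else {
        perror("fork");      /* fork failed */
    }
    return 0;
}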
Q: What happens when you type a command in the shell and hit Enter?

Answer (step-by-step):
  1. Shell parses command, finds executable in PATH
  2. fork(): Create child process
    • Copy page tables (COW)
    • Copy file descriptors
    • New PID, same code
  3. execve(): Replace child with new program
    • Load ELF headers
    • Set up new address space
    • Map code, data sections
    • Set up stack with args/env
    • Jump to _start (entry point)
  4. Dynamic linking: ld.so loads shared libraries
  5. main(): C runtime calls your main()
  6. exit(): Cleanup, return status to parent
  7. wait(): Parent reaps child, gets exit status
Key syscalls: fork, execve, wait4, exit_group
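A condensed sketch of steps 2-7, with /bin/ls standing in for whatever executable the shell found:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                  /* step 2: COW copy of parent */
    if (pid == 0) {
        char *argv[] = { "ls", "-l", NULL };
        char *envp[] = { NULL };
        execve("/bin/ls", argv, envp);   /* step 3: replace child image */
        _exit(127);                      /* only reached if execve fails */
    }
    int status;
    waitpid(pid, &status, 0);            /* step 7: parent reaps child */
    if (WIFEXITED(status))
        printf("child exited with %d\n", WEXITSTATUS(status));
    return 0;
}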
Q: What happens during a context switch?

Answer:

What is saved:
  • CPU registers (general purpose, PC, SP)
  • Floating point/SIMD registers
  • Kernel stack pointer
  • Page table pointer (CR3 on x86)
Steps:
1. Save current task's registers to its task_struct
2. Save stack pointer
3. Select next task (scheduler)
4. Restore next task's stack pointer
5. Restore next task's registers
6. If different process: switch page tables (TLB flush)
7. Return to user mode
Cost:
  • Thread switch: ~1-2 μs
  • Process switch: ~5-10 μs (TLB flush)
Why it matters: Too many context switches = poor performance
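You can watch a process’s own switch counts from user space via getrusage(2); a minimal sketch:

#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    /* voluntary: blocked (I/O, locks); involuntary: preempted by the scheduler */
    printf("voluntary: %ld  involuntary: %ld\n", ru.ru_nvcsw, ru.ru_nivcsw);
    return 0;
}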

Memory Management

Q: How does virtual memory work?

Answer:

Concept: Each process sees a private address space (the illusion of having all of memory to itself).

How it works:
Virtual Address → MMU → Page Tables → Physical Address

┌────────────────┐     ┌─────────────────┐     ┌────────────────┐
│ Process A      │     │   Page Table A  │     │ Physical RAM   │
│ addr: 0x1000   │────►│ 0x1→phy 0x8000 │────►│ Frame 0x8000   │
└────────────────┘     └─────────────────┘     └────────────────┘

┌────────────────┐     ┌─────────────────┐     
│ Process B      │     │   Page Table B  │     Same physical
│ addr: 0x1000   │────►│ 0x1→phy 0x9000 │────►frame or different
└────────────────┘     └─────────────────┘     
Benefits:
  • Process isolation
  • Memory overcommit
  • Demand paging (don’t load until needed)
  • Shared libraries (one physical copy, many mappings)
  • Easy relocation
Components:
  • Page tables (software)
  • MMU (hardware)
  • TLB (cache for page table lookups)
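The “one physical copy, many mappings” point can be made concrete with mmap(2): two mappings of the same file share one physical copy via the page cache. A minimal sketch (/etc/passwd is just a convenient non-empty file):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("/etc/passwd", O_RDONLY);   /* any non-empty file */
    if (fd < 0) { perror("open"); return 1; }

    /* Two distinct virtual addresses, one physical copy in the page cache */
    char *a = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) { perror("mmap"); return 1; }

    printf("a=%p b=%p  a[0]=%c b[0]=%c\n", (void*)a, (void*)b, a[0], b[0]);

    munmap(a, 4096);
    munmap(b, 4096);
    close(fd);
    return 0;
}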
Q: What happens on a page fault?

Answer:

Types:
  1. Minor fault: Page in memory, just needs mapping
  2. Major fault: Page on disk, I/O required
  3. Invalid: Segmentation fault
Flow:
1. CPU accesses address
2. MMU checks page table → entry invalid/not present
3. CPU raises page fault exception
4. Kernel handler runs:
   - Find VMA for address
   - Check permissions
   - If valid:
     - Minor: Update page table
     - Major: Read from disk, then update
   - If invalid: SIGSEGV
5. Return to faulting instruction (retry)
Common causes:
  • Demand paging (first access)
  • Copy-on-write
  • Stack growth
  • Swapped out page
  • Null pointer dereference (→ SIGSEGV)
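Minor faults are easy to observe: touch freshly allocated pages and read the counters from getrusage(2). A minimal sketch (counts are approximate, since the allocator may reuse already-mapped pages):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

static long minflt(void) {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}

int main(void) {
    size_t len = 16 * 1024 * 1024;
    long before = minflt();
    char *p = malloc(len);     /* large: fresh, not-yet-mapped pages */
    if (!p) return 1;
    memset(p, 1, len);         /* first touch demand-pages each one in */
    printf("minor faults from touching 16 MB: %ld\n", minflt() - before);
    free(p);
    return 0;
}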
Q: What happens when you call malloc()?

Answer:

Layers:

malloc(100)           ← User request
     ▼
User allocator        ← glibc ptmalloc (or jemalloc/tcmalloc)
(maintains free lists, caches)
     ▼ brk()/mmap() if needed
Kernel
(VMA management, demand paging)
Small allocations (typically < 128 KB):
  • Come from pre-allocated heap
  • Free list management
  • May use sbrk() to extend heap
Large allocations (> 128 KB):
  • Use mmap() directly
  • Return to OS on free
Allocator strategies:
  • ptmalloc: Per-thread arenas to reduce contention; the glibc default
  • jemalloc: Strong multi-threaded performance; used by FreeBSD and Firefox
  • tcmalloc: Thread-caching allocator from Google; Go’s runtime allocator was originally derived from it
Follow-up: “What happens when you free()?”
  • Small: Returns to free list (not to OS)
  • Large (mmap’d): munmap() returns to OS
  • May trigger coalescing of free blocks
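These behaviors are observable with glibc-specific knobs; a hedged sketch using mallopt(3) and malloc_stats(3), both glibc extensions:

#include <malloc.h>    /* glibc-specific interfaces */
#include <stdlib.h>

int main(void) {
    /* Lower the mmap threshold so 128 KB allocations go straight to mmap() */
    mallopt(M_MMAP_THRESHOLD, 64 * 1024);

    void *small = malloc(1024);          /* heap; free() keeps it on a free list */
    void *large = malloc(128 * 1024);    /* mmap'd; free() munmap()s it */

    free(small);
    free(large);
    malloc_stats();                      /* prints arena/mmap usage to stderr */
    return 0;
}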

Synchronization

Q: What is deadlock, and how do you prevent it?

Answer:

Four Coffman conditions (all must hold):
  1. Mutual exclusion: Resource can’t be shared
  2. Hold and wait: Holding one, waiting for another
  3. No preemption: Can’t forcibly take resource
  4. Circular wait: A→B→C→A waiting cycle
Prevention strategies:
Strategy      | Method                       | Tradeoff
--------------|------------------------------|----------------------
Lock ordering | Always acquire in same order | Requires discipline
Lock timeout  | Give up after timeout        | May fail needlessly
Try-lock      | Non-blocking acquire         | Retry logic needed
Single lock   | One lock for all             | Poor concurrency
Lock-free     | Atomic operations only       | Complex to implement
Example fix:
// BAD: Thread 1 locks A then B, Thread 2 locks B then A

// GOOD: Always lock in order (A before B)
void transfer(Account *from, Account *to) {
    Account *first = (from < to) ? from : to;
    Account *second = (from < to) ? to : from;
    lock(first);
    lock(second);
    // transfer...
    unlock(second);
    unlock(first);
}
Q: Compare mutexes, semaphores, and condition variables.

Answer:

Primitive | Purpose            | Count | Use Case
----------|--------------------|-------|--------------------------
Mutex     | Mutual exclusion   | 0/1   | Protect critical section
Semaphore | Resource counting  | 0-N   | Limit concurrent access
Cond Var  | Wait for condition | N/A   | Producer-consumer
Mutex:
pthread_mutex_lock(&mutex);
// Critical section - only one thread
pthread_mutex_unlock(&mutex);
Semaphore:
sem_wait(&sem);  // Decrement, block if 0
// Access resource (up to N concurrent)
sem_post(&sem);  // Increment
Condition Variable:
pthread_mutex_lock(&mutex);
while (!condition) {
    pthread_cond_wait(&cond, &mutex);  // Releases mutex while waiting
}
// Condition is now true
pthread_mutex_unlock(&mutex);

// In another thread:
pthread_mutex_lock(&mutex);
condition = true;
pthread_cond_signal(&cond);  // Wake one waiter
pthread_mutex_unlock(&mutex);
Key insight: Condition variables are always used with a mutex and a predicate (while loop).
Q: What is a spinlock, and when would you use one?

Answer:

Spinlock: Busy-wait for a lock instead of sleeping.
atomic_flag lock = ATOMIC_FLAG_INIT;   // from <stdatomic.h> (C11)

while (atomic_flag_test_and_set(&lock)) {
    // Spin! CPU is busy-waiting
}
// Have lock
// ... critical section ...
atomic_flag_clear(&lock);
When to use:
  • Very short critical sections (less than 1μs)
  • Interrupt handlers (can’t sleep)
  • Lock held time shorter than context switch time
  • Known low contention
When NOT to use:
  • Long critical sections
  • High contention
  • User space (usually)
  • Single CPU system
Comparison:
Aspect         | Spinlock           | Mutex
---------------|--------------------|------------------
Waiting        | Busy-wait          | Sleep
CPU use        | 100% while waiting | 0% while waiting
Context switch | None               | Yes
Best for       | Very short holds   | Longer holds
Kernel use     | Very common        | Less common
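POSIX exposes spinlocks directly as pthread_spinlock_t; a minimal sketch (compile with -pthread):

#include <pthread.h>
#include <stdio.h>

static pthread_spinlock_t spin;
static long counter;

static void *bump(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        pthread_spin_lock(&spin);     /* busy-waits; never sleeps */
        counter++;                    /* keep the hold time tiny */
        pthread_spin_unlock(&spin);
    }
    return NULL;
}

int main(void) {
    pthread_spin_init(&spin, PTHREAD_PROCESS_PRIVATE);
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   /* 2000000 if exclusion held */
    pthread_spin_destroy(&spin);
    return 0;
}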

Scheduling

Q: Compare common CPU scheduling algorithms.

Answer:

Algorithm   | Description              | Pros                  | Cons
------------|--------------------------|-----------------------|-------------------------
FCFS        | First come, first served | Simple                | Convoy effect
SJF         | Shortest job first       | Optimal avg wait      | Must know run times
Round Robin | Time slices              | Fair                  | Context switch overhead
Priority    | Based on priority        | Important tasks first | Starvation
Multilevel  | Multiple queues          | Flexible              | Complex
CFS         | Fair share of CPU        | Fair, no starvation   | Overhead
Linux CFS (Completely Fair Scheduler):
  • Each task tracks “virtual runtime” (vruntime)
  • Lower vruntime = hasn’t had fair share = run it next
  • Red-black tree for O(log n) task selection
  • Nice values map to weights that scale how fast vruntime accrues, rather than acting as strict priorities
Key metrics:
  • Throughput: Jobs completed per time
  • Turnaround: Submit to completion
  • Wait time: Time in ready queue
  • Response time: Submit to first run
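A few Linux commands for observing and adjusting scheduling (the program names here are illustrative):

$ nice -n 10 ./batch_job         # start with lower priority (higher nice)
$ renice -n 5 -p <pid>           # adjust a running process
$ chrt -f 50 ./rt_task           # SCHED_FIFO, priority 50 (needs privilege)
$ taskset -c 0,1 ./pinned_task   # restrict to CPUs 0 and 1
$ cat /proc/<pid>/sched          # per-task scheduler stats, including vruntime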
Q: What is priority inversion, and how do you solve it?

Answer:

Problem:
Low priority task (L) holds lock
High priority task (H) needs lock → blocks
Medium priority task (M) preempts L

Result: H waits for M, even though H > M priority!
Solutions:
  1. Priority Inheritance:
    • L temporarily gets H’s priority while holding lock
    • L runs, releases lock, H continues
    • Used in Linux (rt_mutex); POSIX exposes it too (see the sketch below)
  2. Priority Ceiling:
    • Lock has ceiling = max priority of any user
    • Acquiring task gets ceiling priority
    • Prevents other tasks from preempting
Real example: Mars Pathfinder (1997)
  • Low-priority task held bus mutex
  • High-priority task blocked
  • Watchdog timer triggered reset
  • Fixed by enabling priority inheritance
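On Linux, POSIX mutexes can opt into priority inheritance via the mutex protocol attribute; a minimal sketch:

#include <pthread.h>

/* Initialize a mutex with priority inheritance enabled */
int init_pi_mutex(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* Holder is temporarily boosted to the highest waiter's priority */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    int rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;   /* 0 on success */
}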

File Systems & I/O

Q: What happens when you call read() on a file?

Answer:

System call path:

read(fd, buffer, size)
        ▼
VFS (Virtual File System)
- Translates to inode operations
        ▼
Page Cache (check if cached)
─────► HIT: Copy to user buffer, return
─────► MISS: Continue...
        ▼
Filesystem (ext4, xfs)
- Map file offset → block numbers
        ▼
Block Layer
- I/O scheduling, merging
        ▼
Device Driver
- Issue commands to disk
        ▼
Hardware (SSD/HDD)
- DMA data to memory
        ▼
Complete I/O
- Wake up process
- Copy to user buffer
Optimizations:
  • Page cache (avoid disk for repeated reads)
  • Read-ahead (prefetch next blocks)
  • I/O merging (combine adjacent requests)
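Applications can hint these optimizations explicitly; a sketch using posix_fadvise(2) (the filename is hypothetical):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("big.log", O_RDONLY);   /* hypothetical input file */
    if (fd < 0) { perror("open"); return 1; }

    /* Hint sequential access: the kernel increases read-ahead for this fd */
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    char buf[64 * 1024];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        ;   /* process the data here */
    close(fd);
    return 0;
}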
Q: What is the difference between synchronous and asynchronous I/O?

Answer:

Synchronous I/O:
// Process blocks until I/O completes
ssize_t n = read(fd, buffer, size);
// Control returns here after disk I/O
Asynchronous I/O:
// Submit request, continue immediately
struct aiocb cb = {...};
aio_read(&cb);
// Do other work...
// Later: check completion or get signal
Modern: io_uring (Linux 5.1+):
// Submit multiple I/O requests
io_uring_prep_read(sqe, fd, buffer, size, offset);
io_uring_submit(&ring);

// Check completions (batched)
io_uring_wait_cqe(&ring, &cqe);
Comparison:
Aspect       | Sync      | Async (AIO) | io_uring
-------------|-----------|-------------|---------
Blocking     | Yes       | No          | No
System calls | 1 per I/O | 2 per I/O   | Batched
Complexity   | Simple    | Complex     | Moderate
Performance  | OK        | Better      | Best
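For completeness, a more self-contained io_uring read using the liburing helpers (assumes liburing is installed; link with -luring; the filename is hypothetical):

#include <fcntl.h>
#include <liburing.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    struct io_uring ring;
    io_uring_queue_init(8, &ring, 0);           /* 8-entry submission queue */

    int fd = open("data.bin", O_RDONLY);        /* hypothetical input file */
    if (fd < 0) { perror("open"); return 1; }

    char buf[4096];
    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    io_uring_prep_read(sqe, fd, buf, sizeof buf, 0);
    io_uring_submit(&ring);                     /* one syscall; many SQEs OK */

    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);             /* block for a completion */
    printf("read returned %d bytes\n", cqe->res);
    io_uring_cqe_seen(&ring, cqe);              /* mark completion consumed */

    io_uring_queue_exit(&ring);
    close(fd);
    return 0;
}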

Linux Specific

Q: Compare select, poll, and epoll.

Answer:

Aspect       | select            | poll           | epoll
-------------|-------------------|----------------|-----------------
Max FDs      | 1024 (FD_SETSIZE) | Unlimited      | Unlimited
Passing FDs  | Copy each call    | Copy each call | Register once
Checking     | O(n) scan         | O(n) scan      | O(1) ready list
Edge trigger | No                | No             | Yes
select:
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(fd, &readfds);
select(fd+1, &readfds, NULL, NULL, NULL);
if (FD_ISSET(fd, &readfds)) { ... }
poll:
struct pollfd fds[1];
fds[0].fd = fd;
fds[0].events = POLLIN;
poll(fds, 1, -1);
if (fds[0].revents & POLLIN) { ... }
epoll:
int epfd = epoll_create1(0);
struct epoll_event ev = {.events = EPOLLIN, .data.fd = fd};
epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);

struct epoll_event events[10];
int n = epoll_wait(epfd, events, 10, -1);
// Only returns ready FDs!
Use epoll for:
  • High-performance servers
  • Many connections
  • Production systems
Q: How do containers work? Explain namespaces and cgroups.

Answer:

Namespaces (isolation):

Namespace | Isolates
----------|--------------------------------------
PID       | Process IDs (PID 1 inside container)
NET       | Network stack, interfaces
MNT       | Filesystem mounts
UTS       | Hostname
IPC       | Shared memory, semaphores
USER      | UID/GID mapping
CGROUP    | Cgroup root
// Create new namespace
clone(..., CLONE_NEWPID | CLONE_NEWNET, ...);

// Or join existing
setns(fd, CLONE_NEWPID);
Cgroups (limits):
Controller       | Limits
-----------------|---------------------
memory           | RAM usage
cpu              | CPU time
io (blkio in v1) | Disk I/O
pids             | Number of processes
# Create a cgroup (v2 unified hierarchy)
mkdir /sys/fs/cgroup/mycontainer
echo 100M > /sys/fs/cgroup/mycontainer/memory.max
echo $$ > /sys/fs/cgroup/mycontainer/cgroup.procs
Docker = namespaces + cgroups + union filesystem + OCI runtime
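The same mechanisms are visible from a shell with unshare(1) from util-linux (needs root):

# New PID + mount namespaces with a fresh /proc: the shell believes it is PID 1
$ sudo unshare --pid --fork --mount-proc bash
# ps aux    # inside: only this bash and ps are visible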

Common Design Questions

Q: Design a thread pool.

Requirements:
  • Fixed number of worker threads
  • Task queue
  • Graceful shutdown
Implementation:
typedef struct {
    pthread_t *threads;
    int num_threads;
    
    // Task queue
    task_t *queue;
    int head, tail;
    int queue_size;
    
    // Synchronization
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
    
    bool shutdown;
} thread_pool_t;

void *worker(void *arg) {
    thread_pool_t *pool = arg;
    
    while (1) {
        pthread_mutex_lock(&pool->lock);
        
        while (pool->head == pool->tail && !pool->shutdown) {
            pthread_cond_wait(&pool->not_empty, &pool->lock);
        }
        
        if (pool->shutdown) {
            pthread_mutex_unlock(&pool->lock);
            break;
        }
        
        task_t task = pool->queue[pool->head];
        pool->head = (pool->head + 1) % pool->queue_size;
        
        pthread_cond_signal(&pool->not_full);
        pthread_mutex_unlock(&pool->lock);
        
        // Execute task
        task.func(task.arg);
    }
    return NULL;
}

void submit(thread_pool_t *pool, void (*func)(void*), void *arg) {
    pthread_mutex_lock(&pool->lock);
    
    while ((pool->tail + 1) % pool->queue_size == pool->head) {
        pthread_cond_wait(&pool->not_full, &pool->lock);
    }
    
    pool->queue[pool->tail] = (task_t){func, arg};
    pool->tail = (pool->tail + 1) % pool->queue_size;
    
    pthread_cond_signal(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
}
Follow-ups:
  • How to handle task priorities? Use priority queue
  • How to handle task cancellation? Cancellation tokens
  • How to tune thread count? CPU cores, task type (I/O vs CPU)
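A hedged usage sketch; pool_create and pool_destroy are hypothetical helpers (allocate the struct, spawn the workers, and on destroy set shutdown, broadcast, and join) not shown above:

#include <pthread.h>
#include <stdio.h>

void print_task(void *arg) {
    printf("task %ld on thread %lu\n",
           (long)arg, (unsigned long)pthread_self());
}

int main(void) {
    thread_pool_t *pool = pool_create(4, 64);   /* hypothetical: 4 workers, 64-slot queue */
    for (long i = 0; i < 10; i++)
        submit(pool, print_task, (void*)i);
    pool_destroy(pool);                         /* hypothetical: shutdown + join */
    return 0;
}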
Q: Design a simple memory allocator.

Simple bump allocator:
typedef struct {
    void *base;
    size_t size;
    size_t offset;
} arena_t;

void *arena_alloc(arena_t *arena, size_t size) {
    // Align to 8 bytes
    size = (size + 7) & ~7;
    
    if (arena->offset + size > arena->size) {
        return NULL;
    }
    
    void *ptr = (char*)arena->base + arena->offset;
    arena->offset += size;
    return ptr;
}

void arena_reset(arena_t *arena) {
    arena->offset = 0;
}
Free list allocator:
typedef struct block {
    size_t size;
    struct block *next;
} block_t;

block_t *free_list = NULL;

void *my_malloc(size_t size) {
    block_t **prev = &free_list;
    block_t *curr = free_list;
    
    // First fit
    while (curr) {
        if (curr->size >= size) {
            *prev = curr->next;
            return (char*)curr + sizeof(block_t);
        }
        prev = &curr->next;
        curr = curr->next;
    }
    
    // No fit, extend the heap via sbrk() (declared in <unistd.h>)
    block_t *new = sbrk(size + sizeof(block_t));
    if (new == (void*)-1) return NULL;  // out of memory
    new->size = size;
    return (char*)new + sizeof(block_t);
}

void my_free(void *ptr) {
    block_t *block = (block_t*)((char*)ptr - sizeof(block_t));
    block->next = free_list;
    free_list = block;
}
Real allocators add:
  • Coalescing free blocks
  • Size classes (slab-like)
  • Per-thread caches
  • Best fit or other strategies
Q: Design a rate limiter.

Token bucket algorithm:
typedef struct {
    double tokens;
    double max_tokens;
    double refill_rate;  // tokens per second
    struct timespec last_refill;
    pthread_mutex_t lock;
} rate_limiter_t;

bool acquire(rate_limiter_t *rl, double tokens) {
    pthread_mutex_lock(&rl->lock);
    
    // Refill tokens based on elapsed time
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    
    double elapsed = (now.tv_sec - rl->last_refill.tv_sec) +
                    (now.tv_nsec - rl->last_refill.tv_nsec) / 1e9;
    
    rl->tokens += elapsed * rl->refill_rate;
    if (rl->tokens > rl->max_tokens) {
        rl->tokens = rl->max_tokens;
    }
    rl->last_refill = now;
    
    // Try to acquire
    if (rl->tokens >= tokens) {
        rl->tokens -= tokens;
        pthread_mutex_unlock(&rl->lock);
        return true;
    }
    
    pthread_mutex_unlock(&rl->lock);
    return false;
}
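A hedged usage sketch of the limiter above (initializer values are arbitrary; a zeroed last_refill simply saturates the bucket on first use):

#include <pthread.h>
#include <stdbool.h>

/* Bursts up to 20 requests, refilled at ~100 tokens/second */
static rate_limiter_t limiter = {
    .tokens = 20, .max_tokens = 20, .refill_rate = 100.0,
    .lock = PTHREAD_MUTEX_INITIALIZER,
};

bool try_handle_request(void) {
    return acquire(&limiter, 1.0);   /* one token per request; false => reject (429) */
}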
Sliding window:
// Track requests in time window
// More accurate for bursts but more memory
typedef struct {
    int *requests;      // Circular buffer
    int window_size;    // Seconds
    int max_requests;
    struct timespec window_start;
} sliding_window_t;
Distributed rate limiting:
  • Use Redis for shared state
  • Approximate algorithms (e.g., sliding window log)
  • Accept some inconsistency for performance

Debugging Scenarios

Scenario: “This program hangs. Why?”

Approach:
  1. Attach debugger / get stack traces:
    $ gdb -p <pid>
    (gdb) thread apply all bt
    
    # Or without GDB:
    $ pstack <pid>
    
  2. Check for deadlock:
    # Look for mutex wait in stack traces
    # Multiple threads waiting on locks = deadlock candidate
    
    # Check lock order: Does thread 1 hold A, wait for B?
    #                   Does thread 2 hold B, wait for A?
    
  3. Check for infinite loop:
    $ top -H -p <pid>
    # Look for thread at 100% CPU
    
    $ perf top -p <pid>
    # What function is hot?
    
  4. Check for I/O block:
    $ cat /proc/<pid>/stack
    # Look for I/O wait syscalls
    
    $ strace -p <pid>
    # What syscall is it stuck on?
    
  5. Check for network:
    $ ss -tunapee | grep <pid>
    # Stuck connections?
    
    $ netstat -an | grep CLOSE_WAIT
    # Leaked sockets?
    
Common causes:
  • Deadlock (lock order violation)
  • Network timeout too high
  • Database connection exhausted
  • Full disk / slow I/O
  • Signal handling issue
Scenario: “Why is this server slow?”

Systematic approach:
  1. CPU check:
    $ top
    $ mpstat -P ALL 1
    
    # High CPU? Which process?
    # Low CPU but slow? Waiting on I/O?
    
  2. Memory check:
    $ free -h
    $ cat /proc/meminfo
    $ vmstat 1
    
    # Swapping? OOM pressure?
    
  3. Disk I/O check:
    $ iostat -x 1
    $ iotop
    
    # Disk at 100% utilization?
    # High await times?
    
  4. Network check:
    $ ss -s
    $ netstat -i
    $ sar -n DEV 1
    
    # High retransmits? Connection errors?
    
  5. Application profiling:
    $ perf record -g -p <pid> sleep 30
    $ perf report
    
    # Where is time spent?
    
  6. System calls:
    $ strace -c -p <pid>
    
    # Which syscalls are slow?
    
Common bottlenecks:
  • Lock contention
  • Database queries
  • Network latency
  • GC pauses (Java, Go)
  • Disk I/O (logging, temp files)
  • Connection exhaustion

Quick Reference Cheat Sheet

Key Numbers to Know

Metric             | Approximate Value
-------------------|------------------
L1 cache access    | 1 ns
L2 cache access    | 4 ns
L3 cache access    | 12 ns
RAM access         | 100 ns
SSD read           | 100 μs
HDD seek           | 10 ms
Context switch     | 1-10 μs
Page fault (minor) | 5-10 μs
Page fault (major) | 1-10 ms
System call        | 100-200 ns

System Call Quick Reference

// Process
fork()      // Create child process
execve()    // Replace with new program
wait()      // Wait for child
exit()      // Terminate process

// Files
open()      // Open file
read()      // Read from fd
write()     // Write to fd
close()     // Close fd
stat()      // Get file metadata

// Memory
mmap()      // Map memory/files
munmap()    // Unmap
brk()       // Extend heap

// Network
socket()    // Create socket
bind()      // Bind to address
listen()    // Listen for connections
accept()    // Accept connection
connect()   // Connect to remote

Common File Paths

/proc/meminfo      # Memory statistics
/proc/cpuinfo      # CPU information
/proc/<pid>/status # Process status
/proc/<pid>/maps   # Memory mappings
/proc/<pid>/fd/    # Open file descriptors
/sys/class/net/    # Network interfaces
/sys/block/        # Block devices
/etc/passwd        # User accounts
/etc/fstab         # Mount configuration

Study Plan

Week 1: Fundamentals
Process/threads, memory management, virtual memory

Week 2: Synchronization
Mutexes, deadlocks, condition variables, lock-free

Week 3: Scheduling & I/O
CPU scheduling, file systems, disk I/O

Week 4: Linux Internals
System calls, kernel modules, containers

Week 5: Practice
Implement thread pool, allocator, rate limiter

Key Takeaways

Know the Fundamentals

Process vs thread, virtual memory, page faults - these come up constantly.

Practice Debugging

“Why is this slow/hanging?” questions are common. Know your tools.

Design Questions

Be ready to implement thread pool, allocator, scheduler from scratch.

Know Linux Specifics

epoll vs select, namespaces, cgroups - expected for systems roles.

Next: Case Studies