Process Management
A process is a program in execution, the fundamental unit of work in an operating system. Understanding process management is essential for senior engineering interviews, as it underlies everything from application behavior to container orchestration.
Interview Frequency: Very High (asked in 80%+ of OS interviews)
Key Topics: Process states, fork/exec, context switching, PCB
Time to Master: 8-10 hours
Process vs Program
Program
- Static entity stored on disk
- Contains code and static data
- Passive — does nothing by itself
- Example: /usr/bin/python3
Process
- Dynamic instance in execution
- Has runtime state (registers, heap, stack)
- Active — consumes CPU, memory, I/O
- Example: Running Python interpreter with PID 1234
Interview Insight: “A program becomes a process when loaded into memory and given system resources. Multiple processes can run the same program simultaneously.”
Process Memory Layout
Every process has a well-defined memory layout, typically divided into segments. On a 32-bit architecture, the virtual address space totals 4 GB (2^32 bytes), usually split into User Space (low memory) and Kernel Space (high memory).
Memory Segment Details
| Segment | Direction | Contents | Characteristics |
|---|---|---|---|
| Kernel Space | Top | Kernel code/data | Inaccessible to user mode. Contains PCB, page tables, kernel stack. |
| Stack | Grows Down ↓ | Function calls | Stores local variables, return addresses, stack frames. Auto-managed. |
| Mapping Segment | N/A | Shared libs | Memory mapped files, shared libraries (e.g., libc.so). |
| Heap | Grows Up ↑ | Dynamic allocation | malloc()/new. Manually managed. Fragmentation risk. |
| BSS | Fixed | Uninitialized globals | “Block Started by Symbol”. Initialized to zero by OS loader. |
| Data | Fixed | Initialized globals | int x = 10;. Read-write static data. |
| Text (Code) | Fixed | Machine code | Read-only to prevent accidental modification. Sharable. |
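On Linux, the layout of a live process can be inspected through /proc/<pid>/maps. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names `parse_maps_line` and `named_regions` are made up for illustration.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, name)."""
    # Format: "start-end perms offset dev inode [pathname]"
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    name = fields[5].strip() if len(fields) > 5 else ""
    return start, end, fields[1], name


def named_regions(path="/proc/self/maps"):
    """Return {name: (start, end)} for bracketed regions such as [heap] and [stack]."""
    regions = {}
    with open(path) as f:
        for line in f:
            start, end, _, name = parse_maps_line(line)
            if name.startswith("["):
                regions[name] = (start, end)
    return regions
```

Calling `named_regions()` inside a running interpreter lists whichever special regions the kernel reports for that process; comparing the `[heap]` and `[stack]` addresses shows the heap sitting below the stack in the address space.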
Process Control Block (PCB)
The PCB (or task_struct in Linux) is the kernel’s data structure representing a process.
PCB Information Categories
- Identification
- CPU State
- Memory
- I/O & Files
Identification fields:
- PID: Unique process identifier
- PPID: Parent process ID
- UID/GID: User and group ownership
- Session ID: For terminal sessions
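The identification fields are visible from user space through ordinary system calls. A small sketch, assuming a Unix-like system; the function name `identification` is only illustrative.

```python
import os


def identification():
    """Collect the identification fields the kernel keeps for this process."""
    return {
        "pid": os.getpid(),     # unique process identifier
        "ppid": os.getppid(),   # parent process ID
        "uid": os.getuid(),     # real user ID
        "gid": os.getgid(),     # real group ID
        "sid": os.getsid(0),    # session ID; 0 means "the calling process"
    }
```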
Process States
A process transitions through various states during its lifetime:
State Definitions
| State | Description | Linux Representation |
|---|---|---|
| New | Process being created | N/A (transient) |
| Ready | Waiting for CPU | TASK_RUNNING (in run queue) |
| Running | Executing on CPU | TASK_RUNNING (current) |
| Blocked/Waiting | Waiting for I/O or event | TASK_INTERRUPTIBLE / TASK_UNINTERRUPTIBLE |
| Zombie | Terminated, waiting for parent to reap | EXIT_ZOMBIE (TASK_ZOMBIE in older kernels) |
| Terminated | Fully cleaned up | N/A (removed) |
TASK_INTERRUPTIBLE vs TASK_UNINTERRUPTIBLE:
- Interruptible: Process can be woken by signals (common case)
- Uninterruptible: Must complete I/O first (shows as ‘D’ in ps, often disk I/O)
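The single-letter state that ps prints comes from /proc/<pid>/stat on Linux. The sketch below assumes a Linux /proc filesystem; `state_from_stat` and `process_state` are hypothetical helper names.

```python
# Common state letters reported by Linux.
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "Z": "zombie",
    "T": "stopped",
}


def state_from_stat(stat_text):
    """Extract the state letter from the contents of /proc/<pid>/stat."""
    # The second field (command name) is parenthesized and may itself contain
    # spaces or ')', so split on the LAST ')' before reading the state field.
    return stat_text.rsplit(")", 1)[1].split()[0]


def process_state(pid="self"):
    """Read the current state letter of a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return state_from_stat(f.read())
```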
Process Creation: fork() and exec()
The Unix process model is elegant: fork() creates a copy, exec() transforms it.
fork(): Creating a Child Process
What fork() Actually Does
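A minimal sketch of the behavior, using Python's `os.fork()` wrapper around the system call. fork() returns twice: 0 in the child and the child's PID in the parent. The function name `fork_demo` and the values used are illustrative only.

```python
import os


def fork_demo():
    """Fork once; show that the child's write does not affect the parent."""
    counter = 100
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. This change lands in the child's copy only.
        counter += 1
        os._exit(counter - 100)          # exit status 1, no cleanup handlers
    # Parent: fork() returned the child's PID.
    _, status = os.waitpid(pid, 0)       # collect the child's exit status
    return counter, os.WEXITSTATUS(status)
```

Calling `fork_demo()` returns `(100, 1)`: the parent's `counter` is untouched, while the exit status proves the child saw its own incremented copy.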
Copy-on-Write (COW)
Modern systems don’t actually copy all memory immediately:
1. Initial State: After fork(), parent and child share the same physical pages, marked read-only.
2. Write Attempt: When either process tries to write, a page fault occurs.
3. Copy Made: The kernel copies only that specific page for the writer.
4. Continue: The process continues with its own private copy of that page.
exec() Family: Replacing Process Image
The exec family of functions replaces the current process image with a new program. The PID remains the same, but the machine code, data, heap, and stack are replaced.
Understanding the Variants
The exec function name tells you exactly what arguments it expects:
- l (list): Arguments are passed as a list of strings (arg0, arg1, ..., NULL).
- v (vector): Arguments are passed as an array of strings (argv[]).
- p (path): Searches the $PATH environment variable for the executable.
- e (environment): Accepts a custom environment variable array.
1. execl() & execv(): Full Path, Default Environment
Use when you have the full path to the binary.
2. execlp() & execvp(): Path Search
Use when you want the OS to find the binary (like running a command in a shell).
3. execle() & execve(): Custom Environment
Use when you need to run a process with specific environment variables (security, isolation).
execve is the underlying system call on Linux; all others are library wrappers around it.
| Function | Path Lookups | Args Format | Environment | Usage Scenario |
|---|---|---|---|---|
| execl | No | List | Inherited | Hardcoded args |
| execlp | Yes | List | Inherited | Shell-like commands |
| execle | No | List | Explicit | Security/Custom Env |
| execv | No | Array | Inherited | Dynamic args |
| execvp | Yes | Array | Inherited | Shell implementation |
| execve | No | Array | Explicit | Low-level Syscall |
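Two of the variants can be sketched with Python's `os.execvp` and `os.execve`, which mirror the C functions of the same names. This assumes a Unix system with /bin/sh present and `sh` on the search path; the helper `run_child` and the variable name `CODE` are made up for the example.

```python
import os


def run_child(do_exec):
    """Fork, call do_exec() in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            do_exec()            # on success this never returns
        finally:
            os._exit(127)        # reached only if the exec call failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


def exit_code_via_execvp():
    # 'p' variant: "sh" is located through the PATH search.
    return run_child(lambda: os.execvp("sh", ["sh", "-c", "exit 3"]))


def exit_code_via_execve():
    # 'e' variant: full path plus an explicit environment (only CODE is set).
    return run_child(
        lambda: os.execve("/bin/sh", ["sh", "-c", "exit $CODE"], {"CODE": "7"})
    )
```

The exec call always runs in a forked child here, because a successful exec replaces the calling process and would otherwise end the Python program itself.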
Context Switching
A context switch is the process of saving one process’s state and restoring another’s.
What Gets Saved/Restored
Context Switch Overhead
Register Save/Restore (0.1-0.5 μs)
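When the kernel switches from Process A to Process B, it must save Process A’s CPU state and load Process B’s CPU state.
What Gets Saved: the program counter, stack pointer, general-purpose registers, and status flags, plus floating-point/SIMD state when the process uses it.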
TLB Flush (0.5-2 μs) - The Expensive One
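The Translation Lookaside Buffer (TLB) is a cache of virtual→physical address mappings. Each process has its own address space, so when you switch processes, the cached mappings become invalid.
The Problem: without intervention, Process B’s virtual addresses could hit Process A’s stale translations.
Traditional Solution: Full Flush
Discard every TLB entry on each switch. The new process starts with an empty TLB and pays a page-table walk the first time it touches each page.
Modern Solution: ASID (Address Space Identifiers)
Instead of flushing, tag each TLB entry with the address space it belongs to (for example, ASID=57). A lookup only matches entries carrying the current ASID, so no flush is needed and entries survive across switches, which is a large speedup.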
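The difference between a full flush and ASID tagging can be illustrated with a toy model. This is a sketch only, not a model of real hardware: the class `ToyTLB`, the page counts, and the switch pattern (A, then B, then A again) are all assumptions chosen for the example.

```python
class ToyTLB:
    """Toy TLB keyed by (asid, virtual_page); counts misses."""

    def __init__(self, use_asid):
        self.use_asid = use_asid
        self.entries = set()
        self.current = None
        self.misses = 0

    def switch_to(self, asid):
        if not self.use_asid:
            self.entries.clear()          # traditional approach: full flush
        self.current = asid

    def access(self, vpage):
        key = (self.current, vpage)
        if key not in self.entries:
            self.misses += 1              # stands in for a page-table walk
            self.entries.add(key)


def count_misses(use_asid, pages=4):
    """Run A, then B, then A again, each touching the same page numbers."""
    tlb = ToyTLB(use_asid)
    for asid in ("A", "B", "A"):
        tlb.switch_to(asid)
        for vpage in range(pages):
            tlb.access(vpage)
    return tlb.misses
```

With flushing, every switch starts cold, so the model records 12 misses. With tagging, Process A's entries are still present when it runs the second time, and the count drops to 8.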
Cache Effects (10-100+ μs) - The Silent Killer
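This is about your L1/L2/L3 CPU caches going cold.
Before Context Switch (Process A running): the caches hold Process A’s working set, so most accesses hit.
After Context Switch (Process B starts): the caches still hold Process A’s data, so Process B misses until its own working set has been loaded.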
Real Numbers
- Cache hit: 3-4 cycles (~1 ns)
- Cache miss to RAM: 200-300 cycles (~100 ns)
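These figures give a rough estimate of the cost of refilling a cold cache. The sketch below is back-of-the-envelope arithmetic only; the working-set size of 1000 cache lines is an assumed example value, and the function name is hypothetical.

```python
def cold_cache_penalty_us(cache_lines, miss_ns=100, hit_ns=1):
    """Extra time (in microseconds) spent if every line misses once instead of hitting."""
    return cache_lines * (miss_ns - hit_ns) / 1000


# A working set of 1000 cache lines (64 KB at 64 bytes per line):
# 1000 * (100 - 1) ns = 99,000 ns = 99 microseconds of extra latency.
penalty = cold_cache_penalty_us(1000)
```

That single refill already lands at the upper end of the 10-100+ μs range quoted for cache effects.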
Scheduler Decision (0.1-1 μs)
The kernel must pick which process runs next.
Mitigation Strategies Explained
1. CPU Pinning - Cache Locality
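Pinning restricts a process to a fixed CPU so that its cached data stays where it will run next. A minimal sketch using Python's `os.sched_setaffinity`, which is available on Linux only; the helper name `run_pinned` and the choice of the lowest-numbered allowed CPU are arbitrary.

```python
import os


def run_pinned(fn):
    """Run fn() with this process restricted to one CPU, then restore the old mask."""
    original = os.sched_getaffinity(0)     # 0 means "the calling process"
    target = min(original)                 # arbitrary pick for the sketch
    os.sched_setaffinity(0, {target})
    try:
        return fn()
    finally:
        os.sched_setaffinity(0, original)  # always undo the pinning
```

From a shell, the `taskset` utility serves the same purpose.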
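2. Larger Time Slices - Amortize the Cost
Assuming each switch costs about 20 μs in total:
- Small slices (1000 switches/sec): 1000 × 20 μs = 20 ms wasted/sec (2% overhead)
- Large slices (100 switches/sec): 100 × 20 μs = 2 ms wasted/sec (0.2% overhead)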
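The same arithmetic as a reusable function, assuming the 20 μs per-switch cost used in the figures above; the function name is illustrative.

```python
def switch_overhead_percent(switches_per_sec, cost_us=20):
    """Percentage of each second lost to context switching."""
    wasted_us = switches_per_sec * cost_us      # microseconds wasted per second
    return 100 * wasted_us / 1_000_000          # one second = 1,000,000 microseconds
```

`switch_overhead_percent(1000)` gives 2.0 and `switch_overhead_percent(100)` gives 0.2, matching the two cases listed.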
3. User-Space Threading (Green Threads)
Languages like Go use goroutines that switch without kernel involvement:
- No TLB flush (same process)
- No cache flush (same process)
- No kernel involvement (no syscall overhead)
- Just save/restore a tiny bit of state
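The idea of switching tasks entirely in user space can be sketched with Python generators. This is an analogy rather than a description of how the Go runtime works: each `yield` is a voluntary switch point, and the scheduler is an ordinary loop. All names here are made up.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative task that records progress and yields after each step."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield                      # hand control back to the scheduler


def run_round_robin(tasks):
    """Resume each task in turn until all have finished; no kernel involvement."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)             # "switch" = resume a suspended generator
        except StopIteration:
            continue               # task finished, drop it
        queue.append(task)


def demo():
    log = []
    run_round_robin([worker("A", 2, log), worker("B", 2, log)])
    return log
```

`demo()` returns `["A0", "B0", "A1", "B1"]`: the two tasks interleave even though only one kernel thread is involved.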
The Big Picture
Context switches aren’t slow because of one thing; it’s death by a thousand cuts: register save/restore, TLB invalidation, cold caches, and the scheduler decision all add up.
Zombie and Orphan Processes
Zombie Process
A zombie is a terminated process whose parent hasn’t yet called wait().
In ps, zombies show state Z and are marked <defunct>.
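A zombie can be created and observed deliberately. The sketch below assumes Linux (it reads /proc and uses `os.waitid`); the function name `observe_zombie` and the exit code 42 are arbitrary.

```python
import os


def observe_zombie():
    """Create a zombie, read its state letter, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(42)                                   # child terminates at once
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    with open(f"/proc/{pid}/stat") as f:               # Linux-specific
        state = f.read().rsplit(")", 1)[1].split()[0]
    _, status = os.waitpid(pid, 0)                     # reap: the zombie disappears
    return state, os.WEXITSTATUS(status)
```

`observe_zombie()` returns `("Z", 42)`: between exit and reaping, the kernel keeps only the process entry and its exit status.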
Orphan Process
An orphan is a child whose parent terminated first.
Orphans are “adopted” by init (PID 1) or a subreaper process, which will properly reap them when they terminate.
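Re-parenting can be observed with a double fork: a middle process exits right away, and the grandchild reports who its parent is afterwards. A sketch assuming a Unix-like system; `orphan_new_parent` is a hypothetical name, and the adopting PID depends on the environment (init, a subreaper, or a container's PID 1).

```python
import os
import time


def orphan_new_parent():
    """Return (middle_pid, parent PID seen by the grandchild after being orphaned)."""
    r, w = os.pipe()
    middle = os.fork()
    if middle == 0:                               # middle process
        me = os.getpid()
        if os.fork() == 0:                        # grandchild: about to be orphaned
            try:
                while os.getppid() == me:         # wait for the middle process to exit
                    time.sleep(0.01)
                os.write(w, str(os.getppid()).encode())
            finally:
                os._exit(0)
        os._exit(0)                               # middle exits first
    os.close(w)
    os.waitpid(middle, 0)                         # reap the middle process
    new_parent = int(os.read(r, 64).decode())     # blocks until the grandchild reports
    os.close(r)
    return middle, new_parent
```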
Fork Variants
vfork()
vfork() is optimized for the fork-then-exec pattern:
| Aspect | fork() | vfork() |
|---|---|---|
| Address space | Copied (COW) | Shared with parent |
| Parent execution | Continues | Suspended until exec/_exit |
| Safety | Safe for any use | Dangerous — child can corrupt parent |
| Use case | General | fork + immediate exec |
clone(): Linux’s Swiss Army Knife
The clone() system call provides fine-grained control over resource sharing:
| Flag | Effect |
|---|---|
| CLONE_VM | Share virtual memory |
| CLONE_FS | Share filesystem info (cwd, root) |
| CLONE_FILES | Share file descriptor table |
| CLONE_SIGHAND | Share signal handlers |
| CLONE_THREAD | Same thread group (for pthreads) |
| CLONE_NEWPID | New PID namespace (containers) |
| CLONE_NEWNS | New mount namespace |
Interview Deep Dive Questions
Q1: Explain what happens when you type 'ls' in a shell
Complete Answer:
1. Shell (bash) reads input “ls” from stdin
2. Shell parses the command and arguments
3. Shell calls fork() to create child process
   - COW creates lightweight copy
4. Child process calls execvp("ls", args)
   - Kernel loads /bin/ls executable
   - New code, data, heap, stack are set up
   - File descriptors 0,1,2 remain (inherited)
5. Parent shell calls waitpid() and blocks
6. ls process runs, writes to stdout (fd 1)
7. ls calls exit(0), becomes zombie
8. Parent’s waitpid() returns, zombie is reaped
9. Shell displays next prompt
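The fork, exec, and wait steps can be sketched in a few lines. The example redirects the child's stdout into a pipe to show that file descriptors survive the exec. It assumes a Unix system with `echo` on the search path; `run_and_capture` is a hypothetical helper, not part of any shell.

```python
import os


def run_and_capture(argv):
    """Run argv the way a shell would and return (stdout bytes, exit code)."""
    r, w = os.pipe()
    pid = os.fork()                       # step 3: create the child
    if pid == 0:
        try:
            os.close(r)
            os.dup2(w, 1)                 # child's fd 1 now points at the pipe
            os.execvp(argv[0], argv)      # step 4: replace the process image
        finally:
            os._exit(127)                 # only reached if exec failed
    os.close(w)                           # parent keeps only the read end
    chunks = []
    while True:
        chunk = os.read(r, 4096)
        if not chunk:                     # EOF: the child closed its stdout
            break
        chunks.append(chunk)
    os.close(r)
    _, status = os.waitpid(pid, 0)        # steps 5 and 8: block, then reap
    return b"".join(chunks), os.WEXITSTATUS(status)
```

For instance, `run_and_capture(["echo", "hello"])` yields `(b"hello\n", 0)`.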
Q2: Why is fork() before exec() expensive?
Answer: Even with COW, fork() still must:
- Allocate new PID and PCB
- Copy page table entries (not data, but metadata)
- Copy file descriptor table
- Copy signal handlers and other process state
- Set up memory mappings
Alternatives:
- vfork(): Suspends parent, shares address space
- posix_spawn(): Single call that does fork+exec atomically
- clone() with minimal sharing for containers
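A short sketch of the single-call approach, using Python's `os.posix_spawn` wrapper (Python 3.8+ on Unix). It assumes /bin/sh exists; `spawn_and_wait` is an illustrative name.

```python
import os


def spawn_and_wait(path, argv, env=None):
    """Start a program with posix_spawn and return its exit code."""
    environment = dict(os.environ) if env is None else env
    pid = os.posix_spawn(path, argv, environment)   # no explicit fork() in user code
    _, status = os.waitpid(pid, 0)                  # the child must still be reaped
    return os.WEXITSTATUS(status)
```

The caller never runs any code in the child, which is what lets an implementation avoid duplicating the parent's page tables.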
Q3: Can a zombie process be killed with kill -9?
Answer: No. A zombie is already dead; it’s not running any code. kill sends signals to running processes, and a zombie has nothing left to terminate.
A zombie exists only because:
- Its exit status hasn’t been collected by its parent
- Its PCB entry and PID are retained for this purpose
To get rid of zombies:
- Parent calls wait()/waitpid()
- Kill the parent; orphaned zombies are adopted by init and reaped
- Use a SIGCHLD handler to auto-reap
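The SIGCHLD approach can be sketched as follows. This assumes a Unix system and that the code runs in the main thread (a Python requirement for installing signal handlers). The name `reap_with_sigchld` is made up, and `waitpid(-1, ...)` reaps any child of the process, which may be too broad in a larger program.

```python
import os
import signal
import time


def reap_with_sigchld(n_children, timeout=5.0):
    """Fork n children and let a SIGCHLD handler collect their exit codes."""
    reaped = {}

    def on_sigchld(signum, frame):
        # One signal may stand for several exits, so loop until nothing is left.
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                return                      # no children at all
            if pid == 0:
                return                      # children remain, none exited yet
            reaped[pid] = os.WEXITSTATUS(status)

    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        for code in range(n_children):
            if os.fork() == 0:
                os._exit(code)              # child: exit with a distinct code
        deadline = time.monotonic() + timeout
        while len(reaped) < n_children and time.monotonic() < deadline:
            time.sleep(0.01)                # the handler runs between sleeps
    finally:
        restore = previous if previous is not None else signal.SIG_DFL
        signal.signal(signal.SIGCHLD, restore)
    return sorted(reaped.values())
```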
Q4: What's the difference between process and thread context switch?
Answer:
| Aspect | Process Switch | Thread Switch |
|---|---|---|
| Address space | Changes | Same |
| Page table | Switched (TLB flush) | Not changed |
| CPU registers | Saved/restored | Saved/restored |
| Kernel overhead | Higher | Lower |
| Cache effects | Worse (different memory) | Better (shared data) |
| Typical cost | 1-10 μs + cache misses | 0.1-1 μs |
Thread switches within the same process are much cheaper because:
- No page table switch needed
- Shared memory means cached data stays valid
- Only thread-local state needs saving
Q5: Design a system to handle 10,000 concurrent connections
Answer:
Process-per-connection (not recommended):
- 10,000 processes = massive memory overhead
- Context switch overhead kills performance
Thread-per-connection:
- Better but still problematic at 10K
- Stack memory: 10K × 8MB = 80GB virtual memory
- Thread switching overhead
Event-driven I/O (epoll):
- Single thread handles many connections
- Use epoll_wait() to multiplex I/O
- Non-blocking I/O for all sockets
Hybrid (event loop per worker process):
- Multiple worker processes (CPU count)
- Each uses event loop for many connections
- Examples: Nginx, Node.js cluster
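The single-threaded event-loop pattern can be sketched with Python's `selectors` module, which uses epoll on Linux. Socket pairs stand in for network connections so the example is self-contained; `serve_pairs` and the upper-casing "protocol" are invented for illustration.

```python
import selectors
import socket


def serve_pairs(messages):
    """Handle one request per connection from a single thread; return the replies."""
    sel = selectors.DefaultSelector()          # epoll on Linux, kqueue on BSD/macOS
    clients = []
    for msg in messages:
        client, server = socket.socketpair()   # stand-in for an accepted connection
        server.setblocking(False)              # non-blocking I/O on the server side
        sel.register(server, selectors.EVENT_READ)
        client.sendall(msg)
        clients.append(client)

    pending = len(messages)
    while pending:
        # One call reports every connection that is ready to read.
        for key, _ in sel.select(timeout=1):
            conn = key.fileobj
            data = conn.recv(4096)
            conn.sendall(data.upper())         # toy "protocol": echo in upper case
            sel.unregister(conn)
            conn.close()
            pending -= 1

    replies = [c.recv(4096) for c in clients]
    for c in clients:
        c.close()
    sel.close()
    return replies
```

The loop blocks in exactly one place, `sel.select()`, no matter how many connections are registered, which is what lets a single thread scale to thousands of sockets.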
Practice Exercises
1. Fork Chain: Write a program that creates a chain of N processes (each child creates one child of its own). Print the process tree.
2. Zombie Factory: Create a program that generates zombies, then use ps to observe them. Implement proper cleanup.
3. Measure Context Switch: Use pipes between two processes to measure context switch time by rapidly passing a token back and forth.
4. Custom Shell: Implement a simple shell that can run commands, handle pipes, and manage background processes.
Key Takeaways
- Process = Execution Context: The PCB holds everything the kernel needs (state, memory, files, credentials).
- Fork + Exec: The Unix model is copy, then transform. COW makes fork cheap.
- Context Switch Cost: Direct cost plus cache/TLB effects. Minimize switches for performance.
- Zombie/Orphan Handling: Always reap children. Orphans are adopted by init.