Operating Systems Case Studies
Learn from real-world examples of OS concepts applied in production systems. These case studies demonstrate how theory meets practice.
Purpose: Connect theory to real systems
Target: Senior engineers preparing for system design
Approach: Analysis of actual production incidents and design decisions
Case Study 1: Chrome’s Multi-Process Architecture
Background
Chrome runs each tab in a separate process. Why?
Problem
Before (single-process browsers):
One tab crash = entire browser crash
Malicious site can access other tabs’ data
Memory leaks accumulate
No parallelism across cores
Solution
┌─────────────────────────────────────────────────────────────────┐
│ CHROME ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Browser Process (privileged) │ │
│ │ • UI, network, storage, disk access │ │
│ │ • Manages all other processes │ │
│ │ • Single instance │ │
│ └───────────────────────────┬─────────────────────────────┘ │
│ │ IPC (Mojo) │
│ ┌───────────────────┼───────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ Renderer │ │ Renderer │ │ Renderer │ │
│ │ (Tab 1) │ │ (Tab 2) │ │ (Tab 3) │ │
│ │ │ │ │ │ │ │
│ │ • Sandboxed │ │ • Sandboxed │ │ • Sandboxed │ │
│ │ • No disk │ │ • No disk │ │ • No disk │ │
│ │ • No network │ │ • No network │ │ • No network │ │
│ │ (directly) │ │ (directly) │ │ (directly) │ │
│ └───────────────┘ └───────────────┘ └───────────────┘ │
│ │
│ Additional Processes: │
│ • GPU Process: Hardware acceleration │
│ • Plugin Processes: Flash, etc. (sandboxed) │
│ • Utility Processes: Audio, network service │
│ │
└─────────────────────────────────────────────────────────────────┘
OS Concepts Applied
Process Isolation: Each renderer is a separate process
Own address space
Crash doesn’t affect others
Memory limits per tab
Sandboxing: Renderers have minimal privileges
Seccomp filters: ~70 allowed syscalls (out of 300+)
No file system access
No network access (must ask browser process)
Namespaces for isolation
IPC: Mojo framework
Message passing between processes
Shared memory for large data (bitmaps)
File descriptor passing
Tradeoffs
Aspect        Multi-Process                  Single-Process
------        -------------                  --------------
Memory        Higher (duplicate libraries)   Lower
CPU           Context-switch overhead        None
Security      Excellent                      Poor
Stability     Tab crash isolated             Browser crash
Complexity    High                           Low
Lesson
Security and stability often outweigh memory/CPU costs for user-facing applications.
Case Study 2: Mars Pathfinder Priority Inversion
Background
July 1997: Mars Pathfinder landed on Mars. Days later, it started randomly resetting.
Problem
┌─────────────────────────────────────────────────────────────────┐
│ PATHFINDER TASKS │
├─────────────────────────────────────────────────────────────────┤
│ │
│ High Priority: bc_dist │
│ - Bus distribution task │
│ - Must run frequently │
│ - Uses shared bus via mutex │
│ │
│ Medium Priority: Various tasks │
│ - Image processing │
│ - Data logging │
│ │
│ Low Priority: Meteorological data collection │
│ - Takes bus mutex for long time │
│ - Reads sensors │
│ │
└─────────────────────────────────────────────────────────────────┘
Timeline of bug:
┌────────────────────────────────────────────────────────────────┐
│ │
│ Time Action │
│ ───── ─────────────────────────────────────────────────── │
│ T+0 Low priority (L) acquires bus mutex │
│ T+1 High priority (H) wakes up, needs mutex, BLOCKS │
│ T+2 Medium priority (M) wakes up, preempts L │
│ T+3 M runs... and runs... and runs... │
│ T+4 H is still waiting (for L, which can't run) │
│ T+5 Watchdog timer fires → SYSTEM RESET! │
│ │
│ Problem: H is waiting for L, but M (lower than H) runs │
│ This is PRIORITY INVERSION │
│ │
└────────────────────────────────────────────────────────────────┘
Solution
Priority Inheritance Protocol:
1. H blocks on the mutex held by L
2. L temporarily inherits H's priority
3. L runs (no longer preempted by M)
4. L releases the mutex
5. H runs
6. L returns to its original priority
Implementation
// VxWorks RTOS (used on Pathfinder)
// The fix was a configuration flag that was OFF by default!

// Enable priority inheritance on the mutex
semMCreate(SEM_Q_PRIORITY | SEM_INVERSION_SAFE);
//                          ^^^^^^^^^^^^^^^^^^
//                          This flag was missing!
Remote Debug
The amazing part: NASA debugged this from 119 million miles away:
Analyzed telemetry showing reset patterns
Reproduced on ground hardware
Identified priority inversion via traces
Uploaded patch to enable priority inheritance
Problem solved!
Lesson
Test real-time constraints under load
Enable safety features even if they have overhead
Instrument everything for post-mortem analysis
Design for remote debugging
Case Study 3: Cloudflare Outage (Regex Backtracking)
Background
July 2, 2019: Cloudflare experienced a 27-minute global outage.
Problem
A regex in their Web Application Firewall (WAF) caused catastrophic backtracking:
(?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))
When this regex encountered certain input:
CPU usage spiked to 100%
Worker processes became unresponsive
Edge servers stopped responding
Global outage
Why It Happened
┌─────────────────────────────────────────────────────────────────┐
│ REGEX BACKTRACKING │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Regex: .*.*=.* │
│ Input: "xxxxxxxxxxxxxxxxxxxxxxxxxx" │
│ │
│ First .* matches all of "xxx..." │
│ Second .* can't match anything, backtrack │
│ First .* matches one less, try second .* again │
│ Keep backtracking... exponential combinations! │
│ │
│ Complexity: O(2^n) for n characters │
│ │
│ n=10: 1,024 operations │
│ n=20: 1,048,576 operations │
│ n=30: 1,073,741,824 operations → CPU locked │
│ │
└─────────────────────────────────────────────────────────────────┘
OS/Systems Lessons
No timeout on regex execution
Process ran indefinitely
Should have CPU time limits
Insufficient isolation
Bad regex affected all traffic
Should have per-request resource limits
Cascading failures
Retry storms made it worse
Should have circuit breakers
Fixed By
Immediate: Reverted the WAF rule
Short-term: Added a regex execution timeout (Lua)
Long-term:
Moved to RE2 (guaranteed linear-time matching)
Added automated regex complexity analysis
Staged rollouts with monitoring
Implementation
-- Before: no protection; a pathological pattern can backtrack forever
local match = ngx.re.match(input, pattern, "jo")

-- After: bound PCRE backtracking (pcre_extra match limits) via the
-- OpenResty directive in nginx.conf:
--
--   lua_regex_match_limit 100000;
--
-- When the limit is exceeded, ngx.re.match fails fast with an error
-- instead of pinning the CPU.
Lesson
Always bound CPU time for untrusted input processing. Use:
cgroups for CPU limits
Timeouts on operations
Algorithms with guaranteed complexity
Case Study 4: Linux Kernel OOM Killer
Background
When Linux runs out of memory, the OOM (Out of Memory) Killer terminates processes to free memory.
Problem Scenario
┌─────────────────────────────────────────────────────────────────┐
│ OOM SITUATION │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Memory Usage: │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │████████████████████████████████████████████████████████│ │
│ │ Used: 15.8 GB / 16 GB │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ Swap: │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │████████████████████████████████████████████████████████│ │
│ │ Used: 4 GB / 4 GB (FULL!) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ New allocation request comes in... │
│ No memory available! │
│ │
│ Options: │
│ 1. Fail the allocation → Process crashes anyway │
│ 2. Kill a process to free memory → OOM Killer! │
│ │
└─────────────────────────────────────────────────────────────────┘
OOM Killer Algorithm
// Simplified scoring: higher score = more likely to be killed
oom_score = memory_usage / total_memory * 1000;

// Adjustments:
// - Root processes get a small discount (a 3% memory bonus)
// - User adjustment: oom_score_adj (-1000 to +1000), added to the score
// - An oom_score_adj of -1000 makes the process immune
Real Incident
Scenario: Production database server running out of memory.
# dmesg output:
[10854.231] Out of memory: Killed process 8234 (postgres)
            total-vm:7234512kB, anon-rss:6891234kB, file-rss:1234kB
# What happened:
# 1. A runaway query consumed excessive memory
# 2. System couldn't allocate for other processes
# 3. OOM killer chose postgres (highest memory user)
# 4. Database terminated, service outage
Prevention Strategies
# 1. Make critical processes immune
echo -1000 > /proc/$(pgrep -o postgres)/oom_score_adj

# 2. Limit memory at the cgroup level (cgroup v2)
echo 8G > /sys/fs/cgroup/postgres/memory.max

# 3. Disable overcommit (strict accounting)
echo 2 > /proc/sys/vm/overcommit_memory
echo 80 > /proc/sys/vm/overcommit_ratio
# 4. Add swap (buys time)
fallocate -l 4G /swapfile
mkswap /swapfile
swapon /swapfile
# 5. Monitor and alert before OOM
# Set up alerts at 80% memory usage
Better Approach
┌─────────────────────────────────────────────────────────────────┐
│ PROPER MEMORY MANAGEMENT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. Application-level limits │
│ • PostgreSQL: shared_buffers, work_mem limits │
│ • JVM: -Xmx heap limit │
│ • Go: GOMEMLIMIT │
│ │
│ 2. Container/cgroup limits │
│ • Kubernetes: resources.limits.memory │
│ • Docker: --memory flag │
│ │
│ 3. Systemd service limits │
│ • MemoryMax=8G in unit file │
│ │
│ 4. Graceful degradation │
│ • Reject new connections at 80% │
│ • Drop caches at 90% │
│ • Circuit breaker at 95% │
│ │
└─────────────────────────────────────────────────────────────────┘
Lesson
Don’t rely on OOM Killer — it’s a last resort. Instead:
Set appropriate memory limits
Monitor and alert
Design for graceful degradation
Case Study 5: Docker Fork Bomb Prevention
Problem
A container runs a fork bomb, potentially taking down the host:
# Classic fork bomb
:(){ :|:& };:
# This creates exponential processes
# 2^n processes very quickly
# Can exhaust PIDs, file descriptors, memory
Without Protection
┌─────────────────────────────────────────────────────────────────┐
│ FORK BOMB IMPACT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Second 0: 1 process │
│ Second 1: 2 processes │
│ Second 2: 4 processes │
│ Second 3: 8 processes │
│ Second 4: 16 processes │
│ Second 5: 32 processes │
│ ... │
│ Second 15: 32,768 processes → PID limit hit! │
│ │
│ Effects: │
│ • Can't create new processes (even ssh!) │
│ • System becomes unresponsive │
│ • Other containers affected │
│ • May require hard reboot │
│ │
└─────────────────────────────────────────────────────────────────┘
Solution: PID Cgroups
# Limit PIDs per container
docker run --pids-limit 100 myimage
# Manually via cgroups:
echo 100 > /sys/fs/cgroup/pids/docker/<container_id>/pids.max
Complete Container Hardening
# docker-compose.yml
version: "3.9"
services:
  myapp:
    image: myapp:latest
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
          pids: 100
    security_opt:
      - no-new-privileges:true
      - seccomp:custom-profile.json
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
    read_only: true
    tmpfs:
      - /tmp:size=100M
Kubernetes Pod Security
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  containers:
    - name: app
      image: myapp
      resources:
        limits:
          memory: "4Gi"
          cpu: "2"
      # Per-pod PID limits are set via the kubelet's podPidsLimit
      securityContext:
        runAsNonRoot: true
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
Lesson
Defense in depth for containers:
PID limits (fork bombs)
Memory limits (memory bombs)
CPU limits (CPU bombs)
Seccomp (syscall filtering)
Capability dropping
Read-only filesystem
Non-root user
Summary: Key Lessons
Isolation is worth it: Chrome proves process isolation's value despite the memory overhead.
Priority inversion is real: Mars Pathfinder shows subtle concurrency bugs can have catastrophic effects.
Bound all operations: the Cloudflare regex outage shows why CPU time must always be limited for untrusted input.
Don't trust the OOM Killer: set proper limits; the OOM Killer is a last resort, not a strategy.
Practice Exercise
Design a container runtime that:
Isolates processes (namespaces)
Limits resources (cgroups)
Filters syscalls (seccomp)
Survives fork bombs
Handles OOM gracefully
Consider:
What limits would you set by default?
How would you detect resource abuse?
How would you alert operators?
How would you handle cleanup?