Operating Systems Case Studies

Learn from real-world examples of OS concepts applied in production systems. These case studies demonstrate how theory meets practice.
Purpose: Connect theory to real systems
Target: Senior engineers preparing for system design
Approach: Analysis of actual production incidents and design decisions

Case Study 1: Chrome’s Multi-Process Architecture

Background

Chrome runs each tab in a separate process. Why?

Problem

Before (single-process browsers):
  • One tab crash = entire browser crash
  • Malicious site can access other tabs’ data
  • Memory leaks accumulate
  • No parallelism across cores

Solution

┌─────────────────────────────────────────────────────────────────┐
│                    CHROME ARCHITECTURE                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │              Browser Process (privileged)               │   │
│   │  • UI, network, storage, disk access                    │   │
│   │  • Manages all other processes                          │   │
│   │  • Single instance                                       │   │
│   └───────────────────────────┬─────────────────────────────┘   │
│                               │ IPC (Mojo)                       │
│           ┌───────────────────┼───────────────────┐              │
│           │                   │                   │              │
│           ▼                   ▼                   ▼              │
│   ┌───────────────┐   ┌───────────────┐   ┌───────────────┐     │
│   │   Renderer    │   │   Renderer    │   │   Renderer    │     │
│   │   (Tab 1)     │   │   (Tab 2)     │   │   (Tab 3)     │     │
│   │               │   │               │   │               │     │
│   │ • Sandboxed   │   │ • Sandboxed   │   │ • Sandboxed   │     │
│   │ • No disk     │   │ • No disk     │   │ • No disk     │     │
│   │ • No network  │   │ • No network  │   │ • No network  │     │
│   │   (directly)  │   │   (directly)  │   │   (directly)  │     │
│   └───────────────┘   └───────────────┘   └───────────────┘     │
│                                                                  │
│   Additional Processes:                                          │
│   • GPU Process: Hardware acceleration                          │
│   • Plugin Processes: Flash, etc. (sandboxed)                   │
│   • Utility Processes: Audio, network service                   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

OS Concepts Applied

  1. Process Isolation: Each renderer is a separate process
    • Own address space
    • Crash doesn’t affect others
    • Memory limits per tab
  2. Sandboxing: Renderers have minimal privileges
    • Seccomp filters: ~70 allowed syscalls (out of 300+)
    • No file system access
    • No network access (must ask browser process)
    • Namespaces for isolation
  3. IPC: Mojo framework
    • Message passing between processes
    • Shared memory for large data (bitmaps)
    • File descriptor passing
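
The same split can be sketched with nothing but raw OS primitives: a parent "browser" process, a forked "renderer" that locks itself down with strict-mode seccomp, and a socketpair standing in for the Mojo pipe. This is an illustration of the mechanism only, not Chrome code.

/* Sketch only: a "browser" parent and a sandboxed "renderer" child talking
 * over a socketpair. Raw Linux primitives, not Chrome's Mojo. */
#include <stdio.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <linux/seccomp.h>

int main(void) {
    int ipc[2];
    socketpair(AF_UNIX, SOCK_STREAM, 0, ipc);   /* stand-in for the IPC pipe */

    if (fork() == 0) {                          /* renderer */
        close(ipc[0]);
        /* Strict seccomp: from here on only read/write/exit are allowed, so
         * the renderer cannot open files or sockets -- it must ask the parent. */
        prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
        write(ipc[1], "fetch https://example.com", 25);
        syscall(SYS_exit, 0);                   /* plain exit() would be killed */
    }

    close(ipc[1]);                              /* browser */
    char req[64] = {0};
    read(ipc[0], req, sizeof req - 1);          /* renderer asks, browser acts */
    printf("browser received: %s\n", req);
    wait(NULL);
    return 0;
}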

Tradeoffs

Aspect        Multi-Process                   Single-Process
───────────   ─────────────────────────────   ──────────────
Memory        Higher (duplicate libraries)    Lower
CPU           Context switch overhead         None
Security      Excellent                       Poor
Stability     Tab crash isolated              Browser crash
Complexity    High                            Low

Lesson

Security and stability often outweigh memory/CPU costs for user-facing applications.

Case Study 2: Mars Pathfinder Priority Inversion

Background

July 1997: Mars Pathfinder landed on Mars. Days later, it started randomly resetting.

Problem

┌─────────────────────────────────────────────────────────────────┐
│                    PATHFINDER TASKS                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   High Priority: bc_dist                                        │
│   - Bus distribution task                                        │
│   - Must run frequently                                          │
│   - Uses shared bus via mutex                                   │
│                                                                  │
│   Medium Priority: Various tasks                                │
│   - Image processing                                             │
│   - Data logging                                                 │
│                                                                  │
│   Low Priority: Meteorological data collection                  │
│   - Takes bus mutex for long time                               │
│   - Reads sensors                                                │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Timeline of the bug:
┌────────────────────────────────────────────────────────────────┐
│                                                                 │
│  Time   Action                                                  │
│  ─────  ───────────────────────────────────────────────────    │
│  T+0    Low priority (L) acquires bus mutex                    │
│  T+1    High priority (H) wakes up, needs mutex, BLOCKS        │
│  T+2    Medium priority (M) wakes up, preempts L               │
│  T+3    M runs... and runs... and runs...                      │
│  T+4    H is still waiting (for L, which can't run)            │
│  T+5    Watchdog timer fires → SYSTEM RESET!                   │
│                                                                 │
│  Problem: H is waiting for L, but M (lower than H) runs        │
│  This is PRIORITY INVERSION                                     │
│                                                                 │
└────────────────────────────────────────────────────────────────┘

Solution

Priority Inheritance Protocol:
  • When H blocks on mutex held by L
  • L temporarily inherits H’s priority
  • L runs (not preempted by M)
  • L releases mutex
  • H runs
  • L returns to original priority

Implementation

// VxWorks RTOS (used on Pathfinder)
// The fix was a configuration flag that was OFF by default!

// Enable priority inheritance on mutex
semMCreate(SEM_Q_PRIORITY | SEM_INVERSION_SAFE);
//                          ^^^^^^^^^^^^^^^^
//                          This was missing!
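
POSIX systems make the same protection opt-in as well. A minimal sketch of the equivalent pthread mutex setup (names are illustrative, not Pathfinder code):

/* Priority inheritance is opt-in for POSIX mutexes too -- the rough
 * equivalent of the VxWorks flag above. */
#include <pthread.h>

pthread_mutex_t bus_mutex;

void init_bus_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* While a low-priority task holds the mutex and a high-priority task is
     * blocked on it, the holder runs at the waiter's priority, so a
     * medium-priority task can no longer starve it. */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    pthread_mutex_init(&bus_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
}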

Remote Debugging

The amazing part: NASA debugged this from 119 million miles away:
  1. Analyzed telemetry showing reset patterns
  2. Reproduced on ground hardware
  3. Identified priority inversion via traces
  4. Uploaded patch to enable priority inheritance
  5. Problem solved!

Lesson

  1. Test real-time constraints under load
  2. Enable safety features even if they have overhead
  3. Instrument everything for post-mortem analysis
  4. Design for remote debugging

Case Study 3: Cloudflare Outage (Regex Backtracking)

Background

July 2, 2019: Cloudflare experienced a 27-minute global outage.

Problem

A regex in their Web Application Firewall (WAF) caused catastrophic backtracking:
(?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))
When this regex encountered certain input:
  • CPU usage spiked to 100%
  • Worker processes became unresponsive
  • Edge servers stopped responding
  • Global outage

Why It Happened

┌─────────────────────────────────────────────────────────────────┐
│                    REGEX BACKTRACKING                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Regex: .*.*=.*                                                │
│   Input: "xxxxxxxxxxxxxxxxxxxxxxxxxx"                           │
│                                                                  │
│   First .* matches all of "xxx..."                              │
│   Second .* can't match anything, backtrack                     │
│   First .* matches one less, try second .* again                │
│   Keep backtracking... exponential combinations!                │
│                                                                  │
│   Complexity: O(2^n) for n characters                           │
│                                                                  │
│   n=10:  1,024 operations                                       │
│   n=20:  1,048,576 operations                                   │
│   n=30:  1,073,741,824 operations → CPU locked                  │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

OS/Systems Lessons

  1. No timeout on regex execution
    • Process ran indefinitely
    • Should have CPU time limits
  2. Insufficient isolation
    • Bad regex affected all traffic
    • Should have per-request resource limits
  3. Cascading failures
    • Retry storms made it worse
    • Should have circuit breakers

Fixed By

  1. Immediate: Reverted the WAF rule
  2. Short-term: Added regex timeout (Lua)
  3. Long-term:
    • Moved to re2 (guaranteed linear time)
    • Added automated regex complexity analysis
    • Staged rollouts with monitoring

Implementation

-- Before: no protection; a pathological pattern can spin the CPU forever
local match = ngx.re.match(input, pattern)

-- After: cap PCRE backtracking (pcre_extra match_limit) via the
-- lua_regex_match_limit directive in nginx.conf:
--     lua_regex_match_limit 100000;
-- A runaway match now fails fast with an error instead of running forever.
local match, err = ngx.re.match(input, pattern, "jo")
if err then
    ngx.log(ngx.ERR, "regex aborted: ", err)
end

Lesson

Always bound CPU time for untrusted input processing. Use:
  • cgroups for CPU limits
  • Timeouts on operations
  • Algorithms with guaranteed complexity
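
One generic way to apply the "bound the CPU time" rule (a sketch of the pattern, not Cloudflare's actual fix): run the scan of untrusted input in a forked worker capped with RLIMIT_CPU, so a pathological input kills the worker instead of wedging the service.

/* Run the untrusted-input scan in a worker with a hard CPU cap. */
#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>
#include <sys/wait.h>

int scan_with_cpu_cap(const char *input, unsigned seconds)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit lim = { seconds, seconds + 1 };
        setrlimit(RLIMIT_CPU, &lim);   /* SIGXCPU at soft limit, SIGKILL at hard */
        /* ... run the regex / parser over `input` here ... */
        (void)input;
        _exit(0);
    }
    int status;
    waitpid(pid, &status, 0);
    return WIFSIGNALED(status) ? -1 : 0;   /* -1: input was hostile, rule skipped */
}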

Case Study 4: Linux Kernel OOM Killer

Background

When Linux runs out of memory, the OOM (Out of Memory) Killer terminates processes to free memory.

Problem Scenario

┌─────────────────────────────────────────────────────────────────┐
│                    OOM SITUATION                                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Memory Usage:                                                  │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │████████████████████████████████████████████████████████│   │
│   │            Used: 15.8 GB / 16 GB                       │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│   Swap:                                                          │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │████████████████████████████████████████████████████████│   │
│   │            Used: 4 GB / 4 GB (FULL!)                   │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│   New allocation request comes in...                            │
│   No memory available!                                           │
│                                                                  │
│   Options:                                                       │
│   1. Fail the allocation → Process crashes anyway               │
│   2. Kill a process to free memory → OOM Killer!               │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

OOM Killer Algorithm

// Simplified scoring: higher score = more likely to be killed

oom_score = memory_usage / total_memory * 1000;

// Adjustments:
// - Processes with CAP_SYS_ADMIN (typically root): small discount (~3%)
// - User adjustment: oom_score_adj (-1000 to +1000) is added to the score
// - oom_score_adj of -1000 makes a process immune to the OOM killer
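
The oom_score_adj knob can also be set by the process itself (or its supervisor) from code; it is the same /proc interface the shell commands under Prevention Strategies below poke. A minimal sketch:

/* Write the process's own OOM score adjustment (-1000..1000).
 * Lowering it below the current value requires CAP_SYS_RESOURCE. */
#include <stdio.h>

int set_oom_score_adj(int adj)
{
    FILE *f = fopen("/proc/self/oom_score_adj", "w");
    if (!f)
        return -1;
    fprintf(f, "%d\n", adj);
    return fclose(f);
}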

Real Incident

Scenario: Production database server running out of memory.
# dmesg output:
[10854.231] Out of memory: Killed process 8234 (postgres) 
            total-vm:7234512kB, anon-rss:6891234kB, file-rss:1234kB

# What happened:
# 1. A runaway query consumed excessive memory
# 2. System couldn't allocate for other processes
# 3. OOM killer chose postgres (highest memory user)
# 4. Database terminated, service outage

Prevention Strategies

# 1. Make the critical process immune (postmaster only; pgrep -o = oldest match)
echo -1000 > /proc/$(pgrep -o postgres)/oom_score_adj

# 2. Limit memory at the cgroup level (cgroup v2; v1 uses memory.limit_in_bytes)
echo 8G > /sys/fs/cgroup/postgres/memory.max

# 3. Disable overcommit (strict mode)
echo 2 > /proc/sys/vm/overcommit_memory
echo 80 > /proc/sys/vm/overcommit_ratio

# 4. Add swap (buys time)
fallocate -l 4G /swapfile
mkswap /swapfile
swapon /swapfile

# 5. Monitor and alert before OOM
# Set up alerts at 80% memory usage

Better Approach

┌─────────────────────────────────────────────────────────────────┐
│                    PROPER MEMORY MANAGEMENT                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   1. Application-level limits                                    │
│      • PostgreSQL: shared_buffers, work_mem limits              │
│      • JVM: -Xmx heap limit                                     │
│      • Go: GOMEMLIMIT                                           │
│                                                                  │
│   2. Container/cgroup limits                                     │
│      • Kubernetes: resources.limits.memory                      │
│      • Docker: --memory flag                                    │
│                                                                  │
│   3. Systemd service limits                                      │
│      • MemoryMax=8G in unit file                               │
│                                                                  │
│   4. Graceful degradation                                       │
│      • Reject new connections at 80%                            │
│      • Drop caches at 90%                                       │
│      • Circuit breaker at 95%                                   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
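
A sketch of the graceful-degradation idea from the box above: check the service's own cgroup v2 memory usage and shed new work past a threshold. The paths assume cgroup v2 as seen from inside a container, and the 80% cutoff is an arbitrary example, not a fixed rule.

/* Shed load before the OOM killer has to get involved. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static long long read_cgroup_value(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char buf[64];
    long long v = -1;
    if (fgets(buf, sizeof buf, f) && strncmp(buf, "max", 3) != 0)
        v = strtoll(buf, NULL, 10);
    fclose(f);
    return v;            /* -1: unreadable or "max" (no limit set) */
}

int should_accept_new_connection(void)
{
    long long cur = read_cgroup_value("/sys/fs/cgroup/memory.current");
    long long max = read_cgroup_value("/sys/fs/cgroup/memory.max");
    if (cur < 0 || max <= 0)
        return 1;                            /* no limit info: accept */
    return (double)cur / (double)max < 0.80; /* reject new work above 80% */
}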

Lesson

Don’t rely on OOM Killer — it’s a last resort. Instead:
  • Set appropriate memory limits
  • Monitor and alert
  • Design for graceful degradation

Case Study 5: Docker Fork Bomb Prevention

Problem

A container runs a fork bomb, potentially taking down the host:
# Classic fork bomb
:(){ :|:& };:

# This creates exponential processes
# 2^n processes very quickly
# Can exhaust PIDs, file descriptors, memory

Without Protection

┌─────────────────────────────────────────────────────────────────┐
│                    FORK BOMB IMPACT                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Second 0:   1 process                                         │
│   Second 1:   2 processes                                       │
│   Second 2:   4 processes                                       │
│   Second 3:   8 processes                                       │
│   Second 4:   16 processes                                      │
│   Second 5:   32 processes                                      │
│   ...                                                           │
│   Second 15:  32,768 processes → PID limit hit!                │
│                                                                  │
│   Effects:                                                       │
│   • Can't create new processes (even ssh!)                      │
│   • System becomes unresponsive                                 │
│   • Other containers affected                                    │
│   • May require hard reboot                                      │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Solution: PID Cgroups

# Limit PIDs per container
docker run --pids-limit 100 myimage

# Manually via cgroups:
echo 100 > /sys/fs/cgroup/pids/docker/<container_id>/pids.max
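
Seen from inside a container with a pids limit, a fork bomb simply hits a wall: fork() starts failing with EAGAIN instead of exhausting the host's PID table. A small illustration (run it only inside a limited container):

/* Demonstrates the pids cgroup limit: fork() fails with EAGAIN at the cap. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    for (int i = 0; ; i++) {
        pid_t pid = fork();
        if (pid < 0) {                 /* pids cgroup limit reached */
            printf("fork #%d failed: %s\n", i, strerror(errno));
            return 0;
        }
        if (pid == 0)
            pause();                   /* children just park; parent keeps forking */
    }
}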

Complete Container Hardening

# docker-compose.yml
version: "3.9"
services:
  myapp:
    image: myapp:latest
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
          pids: 100
    security_opt:
      - no-new-privileges:true
      - seccomp:custom-profile.json
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
    read_only: true
    tmpfs:
      - /tmp:size=100M

Kubernetes Pod Security

apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  containers:
  - name: app
    image: myapp
    resources:
      limits:
        memory: "4Gi"
        cpu: "2"
        # pod PID limits come from the kubelet's podPidsLimit, not the pod spec
    securityContext:
      runAsNonRoot: true
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL

Lesson

Defense in depth for containers:
  1. PID limits (fork bombs)
  2. Memory limits (memory bombs)
  3. CPU limits (CPU bombs)
  4. Seccomp (syscall filtering)
  5. Capability dropping
  6. Read-only filesystem
  7. Non-root user

Summary: Key Lessons

Isolation is Worth It

Chrome proves process isolation’s value despite memory overhead.

Priority Inversion is Real

Mars Pathfinder shows subtle bugs can have catastrophic effects.

Bound All Operations

Cloudflare regex outage: always limit CPU time for untrusted input.

Don't Trust OOM Killer

Set proper limits; OOM Killer is a last resort, not a strategy.

Practice Exercise

Design a container runtime that:
  1. Isolates processes (namespaces)
  2. Limits resources (cgroups)
  3. Filters syscalls (seccomp)
  4. Survives fork bombs
  5. Handles OOM gracefully
Consider:
  • What limits would you set by default?
  • How would you detect resource abuse?
  • How would you alert operators?
  • How would you handle cleanup?
