Operating Systems Interview Preparation
This guide covers the most frequently asked OS interview questions at top tech companies, organized by topic with detailed answers and common follow-ups.
Companies: FAANG, startups, systems companies
Preparation Time: 20-30 hours across all topics
Interview Question Patterns
Learning Tracks and Progression
Use this section as a roadmap for practicing OS questions based on your goals.
Track 1: Generalist / Application Engineer
Focus on being dangerously good at fundamentals:
- Read: Processes & Threads, Virtual Memory, Synchronization, Scheduling.
- Practice:
  - Explain process vs thread vs coroutine in your own words.
  - Walk through malloc → page tables → page faults.
  - Debug simple deadlock and starvation examples.
- Goal: Comfortably answer most “Top 20” OS questions in this file.
Track 2: Systems / Infra / Backend Engineer
You own services in production and need to debug OS-level issues:
- Read: Everything in Track 1 plus File Systems, I/O Systems, Networking, Deadlocks, Linux Internals.
- Practice:
  - Trace a slow web request through CPU, memory, and I/O using tools (strace, perf, iostat, ss).
  - Design a thread pool and explain scheduler interactions.
  - Reason about cgroups, namespaces, and container isolation.
- Goal: Confidently debug “server is slow / high CPU / high IO wait” incidents.
Track 3: Kernel / Low-level Engineer
You want to work on kernels, drivers, or high-performance runtimes:
- Read: All OS chapters, especially CPU Architectures, Kernel Memory, Linux Internals, Device Drivers, Storage Stack, Security, eBPF.
- Practice:
  - Read and explain small sections of real kernel code (e.g., do_page_fault, tcp_recvmsg).
  - Implement simple kernel modules, experiment with scheduling and memory policies.
  - Design lock hierarchies and reason about RCU and lock-free algorithms.
- Goal: Handle deep-dive interviews where you whiteboard OS internals and read code live.
Top 20 OS Interview Questions
Process & Threading
1. What's the difference between a process and a thread?
| Aspect | Process | Thread |
|---|---|---|
| Memory | Separate address space | Shared address space |
| Creation | Expensive (fork) | Cheap |
| Context switch | Expensive (TLB flush) | Cheap |
| Communication | IPC needed | Shared memory |
| Crash impact | Isolated | Affects all threads |
- Processes: Isolation needed (security), different languages, crash isolation
- Threads: Shared state, low latency communication, same codebase
- fork() creates a copy of the parent’s address space (copy-on-write)
- In the child, fork() returns 0; the child gets its own new PID
- In the parent, fork() returns the child’s PID
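The two return values of fork(), and the isolation between the two address spaces, can be seen in a few lines. This is a minimal sketch assuming a Unix-like system and Python 3; `fork_demo` is an illustrative name, not something from this guide.

```python
import os

def fork_demo():
    """Fork once; return (child_pid, child_exit_code, value_seen_by_parent)."""
    value = 1                      # lives in the parent's address space
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. This write lands in the child's own
        # (copy-on-write) memory, so the parent's `value` is untouched.
        value = 42
        os._exit(value)            # leave immediately, skipping interpreter cleanup
    # Parent: fork() returned the child's PID.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status), value

child_pid, exit_code, parent_value = fork_demo()
print(child_pid > 0, exit_code, parent_value)
```

The child reports its modified value through its exit status, while the parent still sees its original value.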
2. What happens when you run a program?
- Shell parses command, finds executable in PATH
- fork(): Create child process
  - Copy page tables (COW)
  - Copy file descriptors
  - New PID, same code
- execve(): Replace child with new program
  - Load ELF headers
  - Set up new address space
  - Map code, data sections
  - Set up stack with args/env
  - Jump to _start (entry point)
- Dynamic linking: ld.so loads shared libraries
- main(): C runtime calls your main()
- exit(): Cleanup, return status to parent
- wait(): Parent reaps child, gets exit status
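The fork, exec, wait sequence above can be sketched from user space. This assumes a Unix-like system; `run_program` is a hypothetical helper, and the exit status 127 for a failed exec follows the usual shell convention rather than anything the kernel mandates.

```python
import os
import sys

def run_program(argv):
    """Spawn argv the way a shell does: fork, exec in the child, wait in the parent."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)   # replaces the child's image; returns only on error
        finally:
            os._exit(127)              # reached only if exec failed
    _, status = os.waitpid(pid, 0)     # reap the child so it does not linger as a zombie
    return os.WEXITSTATUS(status)      # assumes the child exited normally (no signal)

code = run_program([sys.executable, "-c", "import sys; sys.exit(7)"])
print(code)
```

A real shell would also check whether the child was killed by a signal before reading the exit status.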
3. Explain context switching
- CPU registers (general purpose, PC, SP)
- Floating point/SIMD registers
- Kernel stack pointer
- Page table pointer (CR3 on x86)
- Thread switch: ~1-2 μs
- Process switch: ~5-10 μs (TLB flush)
Memory Management
4. Explain virtual memory
- Process isolation
- Memory overcommit
- Demand paging (don’t load until needed)
- Shared libraries (one physical copy, many mappings)
- Easy relocation
- Page tables (software)
- MMU (hardware)
- TLB (cache for page table lookups)
5. What happens during a page fault?
- Minor fault: Page in memory, just needs mapping
- Major fault: Page on disk, I/O required
- Invalid: Segmentation fault
- Demand paging (first access)
- Copy-on-write
- Stack growth
- Swapped out page
- Null pointer dereference (→ SIGSEGV)
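Demand paging can be observed from user space by counting minor faults while touching freshly mapped memory. This is a rough sketch assuming Linux or another Unix where `getrusage` reports `ru_minflt`; the helper name is illustrative, and the exact count depends on the kernel (for example, transparent huge pages reduce it).

```python
import mmap
import resource

def minor_faults_from_touching(num_pages):
    """Map anonymous memory, write one byte per page, return the minor-fault delta."""
    page = mmap.PAGESIZE
    before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    buf = mmap.mmap(-1, num_pages * page,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    for i in range(num_pages):
        buf[i * page] = 1          # first write to a page makes the kernel back it
    after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    buf.close()
    return after - before

delta = minor_faults_from_touching(1024)
print("minor faults while touching 1024 pages:", delta)
```

No disk I/O is involved here, so these count as minor faults rather than major ones.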
6. How does malloc work?
- Small allocations come from the pre-allocated heap
- Free list management
- May use sbrk() to extend heap
- Large allocations use mmap() directly
- Return to OS on free
- ptmalloc: Per-thread arenas, reduces contention
- jemalloc: Better for multi-threaded, used by Firefox
- tcmalloc: Thread-caching, developed at Google (Go’s runtime allocator was originally based on its design)
- Small: Returns to free list (not to OS)
- Large (mmap’d): munmap() returns to OS
- May trigger coalescing of free blocks
Synchronization
7. What is a deadlock? How do you prevent it?
- Mutual exclusion: Resource can’t be shared
- Hold and wait: Holding one, waiting for another
- No preemption: Can’t forcibly take resource
- Circular wait: A→B→C→A waiting cycle
| Strategy | Method | Tradeoff |
|---|---|---|
| Lock ordering | Always acquire in same order | Requires discipline |
| Lock timeout | Give up after timeout | May fail needlessly |
| Try-lock | Non-blocking acquire | Retry logic needed |
| Single lock | One lock for all | Poor concurrency |
| Lock-free | Atomic operations only | Complex to implement |
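The lock-ordering row in the table can be illustrated with a classic two-account transfer. This is a sketch, not a production pattern: `Account` and `transfer` are made-up names, and ordering by `id()` is just one convenient total order.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always lock the account with the smaller id() first, so two opposite
    # transfers can never each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

def repeat_transfer(src, dst, times):
    for _ in range(times):
        transfer(src, dst, 1)

a, b = Account(100), Account(100)
threads = [threading.Thread(target=repeat_transfer, args=(a, b, 1000)),
           threading.Thread(target=repeat_transfer, args=(b, a, 1000))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(a.balance, b.balance)
```

Without the sorted acquisition order, the two threads could each take their source lock and block forever on the destination lock, which is the circular-wait condition.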
8. Explain mutex vs semaphore vs condition variable
| Primitive | Purpose | Count | Use Case |
|---|---|---|---|
| Mutex | Mutual exclusion | 0/1 | Protect critical section |
| Semaphore | Resource counting | 0-N | Limit concurrent access |
| Cond Var | Wait for condition | N/A | Producer-consumer |
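The producer-consumer use case for a condition variable can be sketched as a bounded buffer. This is a minimal illustration; `BoundedBuffer` is a hypothetical class, and Python's `threading.Condition` bundles the mutex and the wait queue together.

```python
import threading
from collections import deque

class BoundedBuffer:
    def __init__(self, capacity):
        self.items = deque()
        self.capacity = capacity
        self.cond = threading.Condition()   # a mutex plus a wait queue

    def put(self, item):
        with self.cond:
            while len(self.items) >= self.capacity:   # re-check after every wakeup
                self.cond.wait()
            self.items.append(item)
            self.cond.notify_all()

    def get(self):
        with self.cond:
            while not self.items:
                self.cond.wait()
            item = self.items.popleft()
            self.cond.notify_all()
            return item

def consume(buf, count, out):
    for _ in range(count):
        out.append(buf.get())

buf = BoundedBuffer(capacity=2)
received = []
consumer = threading.Thread(target=consume, args=(buf, 5, received))
consumer.start()
for i in range(5):
    buf.put(i)
consumer.join()
print(received)
```

The `while` loops (rather than `if`) guard against spurious wakeups and against another thread consuming the condition first.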
9. What is a spinlock? When to use it?
- Very short critical sections (less than 1μs)
- Interrupt handlers (can’t sleep)
- Lock held time shorter than context switch time
- Known low contention
- Long critical sections
- High contention
- User space (usually)
- Single CPU system
| Aspect | Spinlock | Mutex |
|---|---|---|
| Waiting | Busy-wait | Sleep |
| CPU use | 100% while waiting | 0% while waiting |
| Context switch | None | Yes |
| Best for | Very short holds | Longer holds |
| Kernel use | Very common | Less common |
Scheduling
10. Explain CPU scheduling algorithms
| Algorithm | Description | Pros | Cons |
|---|---|---|---|
| FCFS | First come first served | Simple | Convoy effect |
| SJF | Shortest job first | Optimal avg wait | Need to know times |
| Round Robin | Time slices | Fair | Context switch overhead |
| Priority | Based on priority | Important tasks first | Starvation |
| Multilevel | Multiple queues | Flexible | Complex |
| CFS | Fair share of CPU | Fair, no starvation | Overhead |
- Each task tracks “virtual runtime” (vruntime)
- Lower vruntime = hasn’t had fair share = run it next
- Red-black tree for O(log n) task selection
- Nice values adjust a task’s weight (its share of CPU time), not a strict priority
- Throughput: Jobs completed per time
- Turnaround: Submit to completion
- Wait time: Time in ready queue
- Response time: Submit to first run
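The vruntime idea can be made concrete with a toy simulation. This is a deliberately simplified sketch, not how the kernel implements CFS: it uses a binary heap instead of a red-black tree, fixed one-tick slices, and made-up weights.

```python
import heapq

def simulate_fair_scheduler(weights, ticks):
    """Toy vruntime scheduler: weights maps task name -> weight; returns ticks run per task."""
    # Heap ordered by (vruntime, name); the real CFS keeps tasks in a red-black tree.
    queue = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(queue)
    ran = {name: 0 for name in weights}
    for _ in range(ticks):
        vruntime, name = heapq.heappop(queue)      # lowest vruntime runs next
        ran[name] += 1
        # Heavier tasks accumulate vruntime more slowly, so they get picked more often.
        heapq.heappush(queue, (vruntime + 1.0 / weights[name], name))
    return ran

shares = simulate_fair_scheduler({"A": 2, "B": 1}, ticks=300)
print(shares)
```

With weights 2 and 1, task A ends up with twice as many ticks as task B, which mirrors how nice values scale CPU share rather than imposing strict priority.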
11. What is priority inversion? How do you solve it?
- Priority Inheritance:
  - L (the low-priority lock holder) temporarily gets the priority of H (the blocked high-priority task) while holding the lock
  - L runs, releases lock, H continues
  - Used in Linux (rt_mutex)
- Priority Ceiling:
  - Lock has ceiling = max priority of any user
  - Acquiring task gets ceiling priority
  - Prevents other tasks from preempting
- Low-priority task held bus mutex
- High-priority task blocked
- Watchdog timer triggered reset
- Fixed by enabling priority inheritance
File Systems & I/O
12. What happens when you read a file?
- Page cache (avoid disk for repeated reads)
- Read-ahead (prefetch next blocks)
- I/O merging (combine adjacent requests)
13. Explain the difference between sync and async I/O
| Aspect | Sync | Async (AIO) | io_uring |
|---|---|---|---|
| Blocking | Yes | No | No |
| System calls | 1 per I/O | 2 per I/O | Batched |
| Complexity | Simple | Complex | Moderate |
| Performance | OK | Better | Best |
Linux Specific
14. What is the difference between select, poll, and epoll?
| Aspect | select | poll | epoll |
|---|---|---|---|
| Max FDs | 1024 (FD_SETSIZE) | Unlimited | Unlimited |
| Passing FDs | Copy each call | Copy each call | Register once |
| Checking | O(n) scan | O(n) scan | O(1) ready list |
| Edge trigger | No | No | Yes |
- High-performance servers
- Many connections
- Production systems
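The "register once, then ask for the ready set" model of epoll can be sketched with Python's `selectors` module, which picks epoll on Linux and kqueue on BSD/macOS. The socket pair and the `"right-end"` label are illustrative only.

```python
import selectors
import socket

sel = selectors.DefaultSelector()        # epoll on Linux, kqueue on BSD/macOS
left, right = socket.socketpair()
right.setblocking(False)
sel.register(right, selectors.EVENT_READ, data="right-end")   # register once

left.sendall(b"ping")
events = sel.select(timeout=1.0)         # returns only the descriptors that are ready
ready = [(key.data, right.recv(16))
         for key, mask in events if mask & selectors.EVENT_READ]
print(ready)

sel.unregister(right)
sel.close()
left.close()
right.close()
```

A server would keep the selector alive across many `select()` calls, which is where registering once pays off compared with rebuilding the descriptor set on every call.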
15. Explain containers (namespaces + cgroups)
| Namespace | Isolates |
|---|---|
| PID | Process IDs (PID 1 inside container) |
| NET | Network stack, interfaces |
| MNT | Filesystem mounts |
| UTS | Hostname |
| IPC | Shared memory, semaphores |
| USER | UID/GID mapping |
| CGROUP | Cgroup root |
| Controller | Limits |
|---|---|
| memory | RAM usage |
| cpu | CPU time |
| blkio (io in cgroup v2) | Disk I/O |
| pids | Number of processes |
Common Design Questions
16. Design a thread pool
- Fixed number of worker threads
- Task queue
- Graceful shutdown
- How to handle task priorities? Use priority queue
- How to handle task cancellation? Cancellation tokens
- How to tune thread count? CPU cores, task type (I/O vs CPU)
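The three core pieces listed above (fixed workers, a task queue, graceful shutdown) fit in a short sketch. This is one possible shape, assuming sentinel-based shutdown; the class and method names are illustrative, and a real pool would report task failures instead of swallowing them.

```python
import queue
import threading

class ThreadPool:
    _STOP = object()                       # sentinel telling a worker to exit

    def __init__(self, num_workers):
        self.tasks = queue.Queue()
        self.workers = [threading.Thread(target=self._run) for _ in range(num_workers)]
        for w in self.workers:
            w.start()

    def _run(self):
        while True:
            task = self.tasks.get()        # blocks until work or a sentinel arrives
            if task is self._STOP:
                break
            func, args = task
            try:
                func(*args)
            except Exception:
                pass                       # a failing task must not kill the worker

    def submit(self, func, *args):
        self.tasks.put((func, args))

    def shutdown(self):
        # Graceful: sentinels queue up behind pending tasks, so those finish first.
        for _ in self.workers:
            self.tasks.put(self._STOP)
        for w in self.workers:
            w.join()

results = []
results_lock = threading.Lock()

def record_square(n):
    with results_lock:
        results.append(n * n)

pool = ThreadPool(num_workers=4)
for n in range(10):
    pool.submit(record_square, n)
pool.shutdown()
print(sorted(results))
```

Swapping `queue.Queue` for `queue.PriorityQueue` is one way to approach the task-priority follow-up.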
17. Implement a simple memory allocator
- Coalescing free blocks
- Size classes (slab-like)
- Per-thread caches
- Best fit or other strategies
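A first-fit free-list allocator with coalescing can be modeled over a simulated heap. This sketch works with offsets rather than real pointers and ignores alignment and headers; `FreeListAllocator` and its method names are hypothetical.

```python
class FreeListAllocator:
    """First-fit allocator over a simulated heap of `size` bytes (offsets, not pointers)."""

    def __init__(self, size):
        self.free = [(0, size)]            # sorted list of (offset, length) free blocks
        self.used = {}                     # offset -> length of live allocations

    def malloc(self, n):
        for i, (off, length) in enumerate(self.free):
            if length >= n:                # first fit
                if length == n:
                    self.free.pop(i)
                else:
                    self.free[i] = (off + n, length - n)   # split the block
                self.used[off] = n
                return off
        return None                        # no block large enough

    def free_block(self, off):
        length = self.used.pop(off)
        self.free.append((off, length))
        self.free.sort()
        merged = [self.free[0]]
        for o, l in self.free[1:]:
            last_o, last_l = merged[-1]
            if last_o + last_l == o:       # adjacent blocks: coalesce into one
                merged[-1] = (last_o, last_l + l)
            else:
                merged.append((o, l))
        self.free = merged

heap = FreeListAllocator(64)
a, b, c = heap.malloc(16), heap.malloc(16), heap.malloc(16)
heap.free_block(a)
heap.free_block(b)                         # merges with the block freed from a
print(heap.free)
```

Because the two freed 16-byte blocks are merged, a later 32-byte request can be satisfied from them, which would fail without coalescing.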
18. Design a rate limiter
- Use Redis for shared state
- Approximate algorithms (e.g., sliding window counter)
- Accept some inconsistency for performance
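Before distributing a rate limiter, it helps to have the single-process version clear. The sketch below uses a token bucket, which is one common choice and not necessarily the algorithm this guide had in mind; `TokenBucket` and the injectable `clock` parameter are illustrative, and a multi-threaded version would need a lock around `allow`.

```python
import time

class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                   # tokens added per second
        self.capacity = capacity           # maximum burst size
        self.clock = clock                 # injectable so tests can control time
        self.tokens = capacity
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill lazily based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=3, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(4)]   # bucket starts full with 3 tokens
fake_now[0] += 1.0                           # one second later: 2 tokens refilled
later = [bucket.allow() for _ in range(3)]
print(burst, later)
```

In a distributed setup the `tokens` and `last` fields would live in the shared store, with the refill-and-take step made atomic.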
Debugging Scenarios
19. This program hangs. Diagnose it.
- Attach debugger / get stack traces
- Check for deadlock
- Check for infinite loop
- Check for I/O block
- Check for network
- Deadlock (lock order violation)
- Network timeout too high
- Database connection exhausted
- Full disk / slow I/O
- Signal handling issue
20. This server is slow. Diagnose it.
- CPU check
- Memory check
- Disk I/O check
- Network check
- Application profiling
- System calls
- Lock contention
- Database queries
- Network latency
- GC pauses (Java, Go)
- Disk I/O (logging, temp files)
- Connection exhaustion
Quick Reference Cheat Sheet
Key Numbers to Know
| Metric | Approximate Value |
|---|---|
| L1 cache access | 1 ns |
| L2 cache access | 4 ns |
| L3 cache access | 12 ns |
| RAM access | 100 ns |
| SSD read | 100 μs |
| HDD seek | 10 ms |
| Context switch | 1-10 μs |
| Page fault (minor) | 5-10 μs |
| Page fault (major) | 1-10 ms |
| System call | 100-200 ns |
System Call Quick Reference
Common File Paths
Study Plan (Original 5-Week Outline)
How to Study OS Interviews — Detailed 30-Day Plan
The 5-week outline above is the skeleton. The 30-day plan below is the muscle: what to actually do each day so that on interview day, you have not just read the material, you have practiced it. The structure mirrors how staff engineers at FAANG actually learn this material: in layers, with deliberate practice, not by passively re-reading textbooks.
Week 1: Foundations (Days 1-7)
- Day 1-2: Read the OS Fundamentals and Processes chapters of this course. Re-read the xv6 book chapters on processes (Ch 2-3). Draw the user/kernel boundary diagram from memory three times.
- Day 3: Practice explaining “what happens on fork()” out loud, to a wall, in 3 minutes. Record yourself. Listen back. Note where you stumble.
- Day 4-5: Read Threads and Synchronization chapters. Implement a thread pool from scratch in C or your language of choice. Do not look up reference implementations until you have a working version.
- Day 6: Pick 5 questions from the “Top 20” list above. Answer each one out loud, in writing, and on a whiteboard. Compare your answers to the references.
- Day 7: Rest day or catch-up day. Review what felt shaky.
task_struct, clone() flags), do NOT proceed to week 2. Go back.
Week 2: Memory and Processes (Days 8-14)
malloc() end-to-end.
- Day 8: Virtual Memory chapter, plus the Kernel Memory chapter. Draw the four-level page table walk (PML4 -> PDPT -> PD -> PT -> page) on paper. Memorize the entry sizes (8 bytes per entry, 512 entries per table).
- Day 9: Memory Management chapter. Implement a simple bump allocator, then a free-list allocator. Hand-trace the heap state through 5 mallocs and 3 frees.
- Day 10: Read Brendan Gregg’s chapter on memory in “Systems Performance” (or watch his free YouTube talks). Run pmap on a real process. Decode every section.
- Day 11: Page faults: minor, major, COW, demand paging. Trace what the kernel does for each. Implement a userfaultfd toy to feel the mechanism.
- Day 12: Containers and cgroups: read the Containers/Virtualization chapter. Set up a Docker container manually using unshare, mount, and cgcreate. Do not skip this, because knowing namespaces and cgroups by hand impresses interviewers immediately.
- Day 13-14: Practice all “memory” questions from this guide. Record yourself answering. Watch back, identify weak spots.
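For the Day 8 page table exercise, it can help to check hand-drawn walks against a small calculator. The sketch below assumes x86-64 with 4 KiB pages and 48-bit virtual addresses; the function name and sample address are illustrative.

```python
def split_virtual_address(vaddr):
    """Return the x86-64 4-level table indices and page offset for a 48-bit address."""
    offset = vaddr & 0xFFF                 # low 12 bits: byte within the 4 KiB page
    pt = (vaddr >> 12) & 0x1FF             # 9 bits per level: 512 entries per table
    pd = (vaddr >> 21) & 0x1FF
    pdpt = (vaddr >> 30) & 0x1FF
    pml4 = (vaddr >> 39) & 0x1FF
    return {"pml4": pml4, "pdpt": pdpt, "pd": pd, "pt": pt, "offset": offset}

indices = split_virtual_address(0x00007FFFFFFFE123)
print(indices)
```

Each table holds 512 entries of 8 bytes, so every level occupies exactly one 4 KiB page, which is why 9 index bits per level fit so neatly.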
Week 3: I/O, Storage, and Filesystems (Days 15-21)
read() to disk. By end of week, you should be able to compare epoll, io_uring, and AIO; explain what a filesystem journal does; and diagnose disk-bound performance.
- Day 15: I/O Systems chapter and Storage Stack chapter. Trace a read() syscall from user space through VFS to the device driver to the block layer.
- Day 16: epoll, select, poll, io_uring: the four major Linux I/O multiplexing mechanisms. Compare in writing: when each is appropriate, what their syscall signatures look like, what edge-vs-level-triggered means.
- Day 17: File Systems chapter. Read about ext4 (extents, journaling), XFS (allocation groups), and at least one COW filesystem (btrfs or ZFS). Compare design choices.
- Day 18: Networking chapter. Trace a TCP connection establishment from connect() syscall through the network stack to the wire. Memorize the TCP state diagram.
- Day 19: Performance and debugging. Practice using iostat, iotop, strace, ftrace, and at least one eBPF tool (biolatency). On a real system, generate disk load and watch the metrics.
- Day 20: Practice 5 “I/O is slow” type questions. Walk through diagnosis methodology out loud.
- Day 21: Rest or catch-up. Schedule a mock interview with a friend who knows OS material.
Week 4: Distributed, Security, and Linux Internals (Days 22-30)
- Day 22-23: Linux Internals chapter (this course). Practice the “what happens when you boot Linux” question and “trace a packet through the kernel” question.
- Day 24: Security chapter — capabilities, namespaces, seccomp, SELinux/AppArmor. Read the Spectre/Meltdown papers (or LWN summaries). Understand KPTI.
- Day 25: Synchronization deep-dive: futexes, RCU, lock-free programming. Read Paul McKenney’s “What is RCU, Fundamentally?” series. Implement a SPSC ring buffer in your language of choice.
- Day 26: Case studies: read the OS Case Studies chapter. Pick three (Chrome, Pathfinder, Cloudflare) and rehearse summarizing each in 90 seconds with the key OS lesson.
- Day 27: Modern features (eBPF, io_uring, kernel TLS, etc.). Skim the Modern Features chapter. Try writing a simple bpftrace one-liner.
- Day 28: Mock interview day 1. Have a friend ask 5 random questions from this guide. Time-box answers to 5 minutes each. Review weaknesses.
- Day 29: Mock interview day 2. Practice the “system design” angle: “design a thread-safe rate limiter,” “design a userspace TCP stack,” “design an OS scheduler for a real-time workload.”
- Day 30: Light review. Rest before interview. Re-read your own notes and the cheat sheet. Do NOT cram new material.
Caveats and Common Pitfalls (Interview Strategy)
The biggest reason strong engineers bomb OS interviews is not lack of knowledge; it is bad strategy. The pitfalls below have killed more candidates than weak technical depth.
Interview Deep-Dive (Strategy and Approach)
Walk me through how you approach an OS systems-design question you have never seen before. What is your framework?
- Clarify the scope first (1-2 minutes). Do not start designing until you have asked: what is the workload (read-heavy, write-heavy, mixed)? What is the scale (10 QPS, 100K QPS)? What are the failure modes you care about (durability, availability, latency)? What are the constraints (single machine, distributed, embedded)? Many candidates jump straight to a solution and design for the wrong problem.
- State your assumptions out loud. “I will assume we are designing for a single Linux machine, multi-core, with NVMe storage, optimizing for low p99 latency on reads.” Now the interviewer can correct you if you have the wrong picture.
- Start with the high-level architecture. Draw boxes: client, kernel boundary, your component, dependencies. Label data flow. Do not optimize yet — get the structure right first.
- Identify the OS primitives you will use. Threads vs processes? epoll vs io_uring? Shared memory or message passing? Mutex or RCU? State the choice and the reason for each. This is where senior judgment shows.
- Walk through the happy path end-to-end. “A request arrives. It hits epoll, the worker thread reads from the socket, processes it, and returns the response.” Trace state changes, syscalls, context switches.
- Identify the bottleneck and discuss trade-offs. “The bottleneck will be lock contention on the shared data. I would shard by key, paying memory cost for parallelism.” Always state what you are giving up to gain something.
- Failure modes and recovery. What happens if a worker crashes? If memory fills up? If disk fails? An OS systems-design answer that ignores failure modes is incomplete.
- Wrap up with the trade-offs. “I optimized for read latency at the cost of write amplification. If the workload were write-heavy I would change to design B.” This shows judgment.
- “Let me just sketch a rough design, then we can refine.” Sounds humble but actually means “I am not going to do the structured thinking.” Senior interviewers want to see the framework.
- “I would use Kafka and Kubernetes.” Reaching for off-the-shelf tools without justifying primitives is a junior tell. Even if those tools are right, explain why — which OS-level primitives they provide that you need.
- Brendan Gregg’s “Systems Performance” — not a design book per se but the framework chapters are exemplary.
- “Designing Data-Intensive Applications” by Martin Kleppmann — has explicit design frameworks.
- Google SRE book, “Postmortem Culture” chapter — shows how seniors think about failure modes.
How do you handle a follow-up question you do not know the answer to?
- Buy a moment without panicking. Phrases that work: “That is a good question, let me think for a second.” “I want to make sure I get this right — can I think out loud?” “I have not encountered this exactly, but let me reason about it.” All of these are honest, professional, and buy 5-15 seconds of thinking time. Saying nothing while you think looks like a freeze.
- Reason from primitives. Even if you do not know the specific answer, you almost always know related concepts. Start there: “I do not know the exact answer, but I know X is true and Y is true; that suggests Z…” Reasoning out loud is itself the interview signal. Many interviewers prefer “I do not know the answer but here is how I would derive it” over a memorized correct answer.
- Bound your uncertainty. “I think this is true, but I would want to verify against the kernel source / documentation.” Showing you know what you do not know is a senior trait. Pretending to know what you do not is a junior trait that interviewers detect and disqualify on.
- Redirect to adjacent strength when honest. “I have not used X in production, but in the related case of Y, I have done Z, and I would expect similar trade-offs.” This is honest redirection, not evasion — you are showing relevant experience.
- Admit and offer to learn live. “I do not know — can you share the answer? I would love to understand.” This works occasionally; over-using it makes you look unprepared. Use sparingly for genuinely novel questions.
- “Hmm, I am not sure but I think it is X…” (then a confident wrong answer). Interviewer detects bluff. Trust destroyed.
- “I do not know.” (full stop). True but missed opportunity. Always pair with reasoning or a next-step.
- Cal Newport, “Deep Work” — the mental discipline of staying calm under question pressure.
- “Cracking the Coding Interview” by Gayle McDowell — has a chapter on handling difficult questions.
- “How Will You Measure Your Life?” by Clayton Christensen — broader, but the principle of intellectual honesty applies directly.
What are the top 5 OS topics most likely to come up at FAANG-level interviews, and how should I prepare for each?
- Process / Thread / Concurrency model. Almost every interview touches it. Prepare: clone() and what its flags do, the fork+exec sequence, thread vs process trade-offs, the GIL or its equivalent in your language, mutex vs spinlock vs RCU. Practice: implement a thread pool from scratch.
- Virtual memory and page tables. The classic deep-dive question. Prepare: 4-level page table walk on x86_64, TLB and TLB shootdowns, COW, demand paging, swap vs RAM, mmap vs read/write. Practice: trace malloc(8MB) end-to-end with all the kernel actions.
- System calls and the syscall path. Bread-and-butter for systems roles. Prepare: SYSCALL instruction mechanics, ring transitions, vDSO, syscall table, context switching cost. Practice: walk through read() from user code to disk and back.
- Synchronization primitives and their trade-offs. Always asked because concurrency bugs are universal. Prepare: spinlock vs mutex vs futex, RCU read-side vs write-side cost, lock-free structures (CAS, atomic), memory ordering (acquire/release/seqcst). Practice: explain why a double-checked lock fails without the right memory barrier.
- Performance debugging and observability. Increasingly common, especially for SRE/infra roles. Prepare: USE method, RED method, perf, ftrace, eBPF, flame graphs, off-CPU analysis. Practice: walk through a “p99 latency spike” investigation from scratch.
- “Just read ‘Operating Systems: Three Easy Pieces’ cover to cover.” Good book, but reading is not preparation. You must practice answering questions out loud, on whiteboards, under time pressure. Reading alone is necessary but not sufficient.
- “Memorize a list of 100 questions and answers.” Memorization without first-principles fails on follow-ups. Use questions as practice prompts, not scripts.
- “Operating Systems: Three Easy Pieces” (free online) — the textbook that gets the foundations right.
- “Cracking the Coding Interview” by Gayle McDowell — not OS-specific but the strategy chapters apply.
- Interview Cake and System Design Primer (free GitHub repo) — have explicit OS sections.
Key Takeaways
Know the Fundamentals
Practice Debugging
Design Questions
Know Linux Specifics
Interview Deep-Dive
A process and a thread both appear in 'ps' output on Linux. Under the hood, how does Linux actually implement threads, and what is the real difference at the kernel level?
- In Linux, there is no fundamental distinction between a process and a thread at the kernel level. Both are represented by a task_struct. The difference is in what they share. When you call fork(), the kernel creates a new task_struct with a new memory space (new page tables via copy-on-write), a new file descriptor table, and new signal handlers: a full copy. When you call pthread_create() (which internally uses clone() with specific flags), the kernel creates a new task_struct that SHARES the parent’s memory space, file descriptor table, signal handlers, and more.
- The clone() syscall is the unified primitive. The flags control what is shared: CLONE_VM (share memory), CLONE_FILES (share file descriptors), CLONE_SIGHAND (share signal handlers), CLONE_THREAD (same thread group). A “process” is clone() with no sharing flags; a “thread” is clone() with all sharing flags.
- From the scheduler’s perspective, every task_struct is a schedulable entity. The scheduler does not distinguish between processes and threads. Context switching between threads of the same process is faster because the page tables (CR3 register on x86) do not change, avoiding a TLB flush.
- The practical implication: a crash (segfault) in one thread takes down all threads in the process because they share the same address space. A crash in a child process leaves the parent unaffected because they have separate address spaces. This is the core trade-off: threads give you cheap communication (shared memory) at the cost of fault isolation.
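The "one task per thread, shared thread group" model shows up directly in the IDs a program can query. This sketch assumes Python 3.8+ for `threading.get_native_id()`; `collect_ids` is an illustrative helper.

```python
import os
import threading

def collect_ids(n_threads=2):
    """Return (pid, main_tid, [(pid_seen_in_thread, tid), ...])."""
    seen = []

    def worker():
        # getpid() reports the thread group ID; get_native_id() reports the
        # kernel's per-task ID for this particular thread.
        seen.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return os.getpid(), threading.get_native_id(), seen

pid, main_tid, seen = collect_ids()
same_pid = all(p == pid for p, _ in seen)
distinct_tid = all(tid != main_tid for _, tid in seen)
print(same_pid, distinct_tid)
```

Every worker reports the same PID as the main thread but a different native thread ID, matching the picture of separate schedulable tasks inside one thread group.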
Compare CFS, EEVDF, and a real-time scheduling policy like SCHED_FIFO. When would you choose each in a production system?
- CFS (Completely Fair Scheduler) was the default Linux scheduler for over 15 years. It tracks “virtual runtime” (vruntime) for each task — how much CPU time the task has received relative to its fair share. Tasks with the lowest vruntime are scheduled next, stored in a red-black tree for O(log n) selection. Nice values adjust the weight (not priority) so higher-nice tasks accumulate vruntime faster and get less CPU. CFS provides good fairness and interactive responsiveness for general-purpose workloads.
- EEVDF (Earliest Eligible Virtual Deadline First) replaced CFS in Linux 6.6. It adds a “virtual deadline” concept: each task has a deadline based on its requested time slice. Among eligible tasks (those whose virtual time allows them to run), the one with the earliest deadline runs next. EEVDF provides better latency guarantees than CFS — it reduces tail latency for interactive tasks by preventing “sleeper bonus” gaming and providing more predictable scheduling.
- SCHED_FIFO is a real-time policy. A SCHED_FIFO task runs until it blocks, voluntarily yields, or is preempted by a higher-priority real-time task. It is not fair: a SCHED_FIFO task at priority 99 that never blocks can starve everything else on its CPU (by default, Linux real-time throttling still reserves about 5% of CPU time for non-real-time tasks). It is used when you need deterministic, bounded latency: audio processing (JACK), industrial control, low-latency trading.
- In production: use the default (EEVDF/CFS) for web servers, databases, and general applications. Use SCHED_FIFO for latency-critical real-time components (but be careful — a runaway SCHED_FIFO task can hang the system). Use cgroups with cpu.max to limit how much CPU a group of tasks can consume, regardless of scheduler policy.
Walk me through what happens at the OS level when you type 'ls' in a bash terminal and press Enter. Be as detailed as possible.
- The terminal emulator captures the keystroke and writes the character to the pseudoterminal master (ptmx). The kernel’s tty layer passes it to the slave pty, which is bash’s stdin. Bash’s read() returns when it gets the newline.
- Bash parses “ls”, searches $PATH for the executable, and finds /usr/bin/ls (likely cached from a previous hash). Bash calls fork() which creates a child process via clone(). The kernel creates a new task_struct, copies the page tables with COW (copy-on-write), copies the file descriptor table, and assigns a new PID.
- In the child, bash calls execve("/usr/bin/ls", ...). The kernel loads the ELF binary: parses the ELF header, creates new VMAs (virtual memory areas) for the text, data, and BSS segments, sets up the stack with argv and envp, and points the instruction pointer to the ELF entry point (or to the dynamic linker ld-linux.so if dynamically linked).
- The dynamic linker runs first: it loads shared libraries (libc.so, libpthread.so, etc.) using mmap(), resolves symbol references, and jumps to the main program.
- ls calls opendir() -> getdents() syscall, which reads directory entries from the filesystem. The kernel does a path walk through the dentry cache (fast if cached) to the target directory’s inode, reads the directory data blocks, and returns the entries to user space.
- When the requested output needs metadata (for example with -l), ls calls stat() on each entry to get the file type, permissions, and size. It then formats the output and calls write() on stdout (which is the slave pty); the tty layer passes the data to the master pty, and the terminal emulator renders it.
- ls calls exit(). The kernel sends SIGCHLD to bash (the parent). Bash’s wait4() reaps the child, reads the exit status, and prints a new prompt.