Operating System Interfaces - Complete Study Guide
Learn operating system fundamentals by understanding the xv6 teaching operating system - a simple, Unix-like OS that demonstrates core OS concepts in ~9,000 lines of readable code.
Core Concept: What is an Operating System?
An operating system has three primary jobs:
Abstract Hardware
Programs don’t need to know specific hardware details (e.g., which disk type, GPU model)
Share Resources
Multiple programs run simultaneously (or appear to) by multiplexing CPU, memory, and I/O
Enable Interaction
Programs can communicate and share data safely through controlled mechanisms
Key Design Tension
[!IMPORTANT] The Interface Dilemma
- Simple/narrow interface = easier to implement correctly, but limited features
- Complex interface = more features but harder to maintain and secure
- Solution: Few powerful mechanisms that combine for generality
The xv6 Operating System
What is xv6?
xv6 is a teaching OS based on the Unix design by Ken Thompson and Dennis Ritchie. It provides the basic Unix interfaces, and understanding it helps you understand modern operating systems such as BSD, Linux, macOS, Solaris, and, to a lesser extent, Windows.
Key Features:
- Small enough to understand completely (~9,000 lines)
- Real enough to be useful (actual working OS)
- Provides essential Unix interfaces
- Runs on RISC-V architecture
Architecture Overview
Key Terms:
- Process: Running program with memory (instructions, data, stack)
- Kernel: Special program providing services to processes
- System Call: Process requests kernel service (e.g., fork(), open(), read())
- User Space: Normal programs run here (limited privileges)
- Kernel Space: Kernel runs here (full hardware access)
System Calls Reference
Process Management
| System Call | Description | Return Value |
|---|---|---|
| fork() | Create new process | Child PID to parent, 0 to child |
| exit(status) | Terminate process | No return (0=success, 1=failure) |
| wait(*status) | Wait for child exit | Child PID or -1 |
| kill(pid) | Terminate process | 0 or -1 |
| getpid() | Get current process PID | PID |
| sleep(n) | Pause for n clock ticks | 0 |
| exec(file, argv[]) | Replace process with new program | No return on success, -1 on error |
| sbrk(n) | Grow memory by n bytes | Address of new memory |
File Operations
| System Call | Description | Return Value |
|---|---|---|
| open(file, flags) | Open file | File descriptor (fd) or -1 |
| read(fd, buf, n) | Read n bytes | Bytes read or 0 at EOF |
| write(fd, buf, n) | Write n bytes | Bytes written |
| close(fd) | Release file descriptor | 0 or -1 |
| dup(fd) | Duplicate fd | New fd to same file |
| pipe(p[]) | Create pipe | 0 (p[0]=read, p[1]=write) |
File System
| System Call | Description | Return Value |
|---|---|---|
| chdir(dir) | Change current directory | 0 or -1 |
| mkdir(dir) | Create directory | 0 or -1 |
| mknod(file, major, minor) | Create device file | 0 or -1 |
| fstat(fd, *st) | Get file info from fd | 0 or -1 |
| stat(file, *st) | Get file info from path | 0 or -1 |
| link(file1, file2) | Create new name for file | 0 or -1 |
| unlink(file) | Remove file name | 0 or -1 |
[!NOTE] Convention: Unless stated otherwise, system calls return 0 for success and -1 for error.
Processes and Memory
Process Structure
Each process has:
- User-space memory: instructions + data + stack
- Kernel-private state: CPU registers, PID, etc.
- PID: Process identifier (unique number)
fork() - Creating Processes
The fork() system call creates a new process, the child, whose memory is an exact copy of the parent’s.
Behavior:
- Creates exact copy of parent process memory
- Both processes continue execution after fork()
- Only difference: return value
  - Parent gets: child’s PID (positive number)
  - Child gets: 0
  - Error: -1 (returned to the caller; no child is created)
Example:
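A minimal sketch of the behavior above, written against POSIX headers so that it compiles on Linux or macOS; xv6's own user library differs in details such as header names. The helper name fork_and_wait is made up for this illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork once, let the child exit with child_status,
// and return the status the parent collects (or -1 on failure).
int fork_and_wait(int child_status) {
    fflush(stdout);                 // keep buffered output from being copied into the child
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");             // fork() returned -1: no child was created
        return -1;
    }
    if (pid == 0) {
        // Child: fork() returned 0.
        printf("child: exiting\n");
        fflush(stdout);
        _exit(child_status);
    }
    // Parent: fork() returned the child's PID.
    printf("parent: child=%d\n", (int)pid);
    int status;
    if (wait(&status) != pid || !WIFEXITED(status))
        return -1;
    printf("child %d is done\n", (int)pid);
    return WEXITSTATUS(status);
}
```

The "parent:" and "child:" lines can appear in either order, because both processes run independently after fork(). The "is done" line always comes last, since the parent prints it only after wait() returns.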
[!IMPORTANT] Critical Points:
- Parent and child have SEPARATE memory
- Changing variable in one doesn’t affect other
- Both start with identical memory contents
- Each has separate registers
wait() - Synchronizing Processes
Purpose: Parent waits for child to finish
Behavior:
- Returns PID of exited child
- Copies child’s exit status to provided address
- If no children exited yet, blocks until one does
- If no children exist, returns -1 immediately
exec() - Running Programs
Purpose: Replace current process with new program
[!WARNING] Critical characteristic: Does NOT create new process, replaces current one!
Arguments:
- Path to executable file
- Array of string arguments (NULL-terminated)
Behavior:
- Loads new program from file
- Replaces process memory completely
- Preserves: PID and the file descriptor table (open files)
- If successful: NEVER returns (new program runs)
- If error: returns -1 to caller
[!NOTE] argv[0] convention: First argument is program name (mostly ignored by program)
How Shell Uses These Calls
Shell main loop, shown for the command echo hello:
- Shell forks
- Parent waits
- Child calls exec("/bin/echo", ["echo", "hello", 0])
- echo runs and calls exit()
- Parent’s wait() returns
- Shell ready for next command
Why fork() and exec() are separate calls:
- Allows I/O redirection between fork() and exec()
- Shell can modify child’s file descriptors before exec()
- Parent’s I/O remains unchanged
Alternatives, all less attractive:
- Combined forkexec() call - awkward for I/O redirection
- Shell modifies own I/O, then undoes - error-prone
- Every program handles own redirection - duplicated work
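The fork, exec, wait sequence can be sketched as one function. This is an illustration under assumptions, not xv6's sh.c: it uses POSIX execvp() (which searches PATH) and waitpid() in place of xv6's exec() and wait(), and the name run_command is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run one simple command the way a shell does.
// argv is NULL-terminated and argv[0] names the program.
// Returns the command's exit status, or -1 if fork/wait failed.
int run_command(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        // Child: a shell would adjust file descriptors here, before exec.
        execvp(argv[0], argv);
        _exit(127);                 // reached only if exec failed
    }
    // Parent: its own descriptors are untouched; it just waits.
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The gap between fork() and exec() in the child is the design point: anything done there, such as closing and reopening descriptors, affects only the program about to run.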
Memory Management
Implicit allocation:
- fork() - Allocates memory for child’s copy
- exec() - Allocates memory for new executable
Explicit allocation:
- sbrk(n) - Grow data memory by n bytes
- Returns location of new memory
- Used by malloc() implementation
Optimization:
- Many kernels implement fork() with copy-on-write (COW); stock xv6 copies the parent’s memory eagerly
- COW doesn’t actually copy a page until one process modifies it
- Avoids waste when exec() immediately follows fork()
I/O and File Descriptors
File Descriptor Concept
Definition: Small integer representing kernel-managed I/O object
Can refer to:
- Regular files
- Directories
- Devices (keyboard, screen, etc.)
- Pipes
Obtained by:
- Open file/directory/device
- Create pipe
- Duplicate existing descriptor
File Descriptor Table
Per-process table:
- Each process has private fd space
- FDs start at 0
- Kernel uses fd as index into table
By convention:
- FD 0 = standard input (stdin)
- FD 1 = standard output (stdout)
- FD 2 = standard error (stderr)
read() and write()
read(fd, buf, n):
- Reads UP TO n bytes from fd
- Copies into buf
- Returns number of bytes actually read
- Returns 0 at end of file
- Each fd has file offset that advances automatically
write(fd, buf, n):
- Writes n bytes from buf to fd
- Returns number of bytes written
- Less than n only on error
- Offset advances automatically
cat Example - The Power of Abstraction
- cat doesn’t know if reading from file, console, or pipe
- cat doesn’t know if writing to file, console, or pipe
- Same code works for all cases
- Shell controls actual I/O sources/destinations
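The core of cat is a loop over read() and write(). The sketch below assumes POSIX headers, and copy_fd is a hypothetical name; it takes the two descriptors as parameters so the same loop can be exercised on files, pipes, or the console.

```c
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: copy everything from descriptor `in` to descriptor
// `out`. Returns 0 once read() reports end of file, -1 on any error.
int copy_fd(int in, int out) {
    char buf[512];
    ssize_t n;
    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {           // on POSIX systems a write may be partial
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
    }
    return n < 0 ? -1 : 0;          // n == 0 means end of file
}
```

Calling copy_fd(0, 1) gives the behavior of cat with no arguments. Nothing in the loop depends on what kind of object the descriptors refer to.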
File Descriptor Allocation
close(fd):
- Releases file descriptor
- Makes fd available for reuse
New descriptors (from open(), pipe(), dup()):
- Always use the LOWEST unused number
- This is critical for I/O redirection
I/O Redirection Mechanism
Example: cat < input.txt
- Shell forks a child
- Child closes stdin (fd 0)
- open() gets fd 0 (lowest available)
- exec() preserves file descriptor table
- cat reads from fd 0, which now refers to input.txt
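The steps above can be sketched as follows. This is an illustration rather than xv6's shell code: it assumes POSIX execvp() and waitpid(), and run_with_stdin plus the exit codes 126 and 127 are choices made for this example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run argv with standard input taken from `path`,
// like the shell command `argv... < path`.
// Returns the program's exit status, or -1 if fork/wait failed.
int run_with_stdin(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(0);                           // free fd 0 in the child only
        if (open(path, O_RDONLY) != 0)      // lowest unused fd is now 0
            _exit(126);
        execvp(argv[0], argv);              // the fd table survives exec
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With argv set to {"cat", NULL} and path set to "input.txt", the child behaves like cat < input.txt, while the parent's own fd 0 is never touched.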
open() Flags
Defined in fcntl.h:
- O_RDONLY - Read only
- O_WRONLY - Write only
- O_RDWR - Read and write
- O_CREATE - Create if doesn’t exist
- O_TRUNC - Truncate to zero length
fork() and File Descriptors
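Behavior: fork() copies the file descriptor table, so the child starts with exactly the same open files as the parent.
Shared file offset example: the child writes "hello " to fd 1, then the parent writes "world\n" to fd 1. Output:
"hello world"
Why: Parent and child share same underlying file offset
- Child writes "hello " at position 0
- Offset advances to 6
- Parent writes "world\n" at position 6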
dup() - Duplicating Descriptors
Purpose: Create new fd referring to same file
Behavior:
- Returns new fd (lowest available)
- Both fds share offset (like fork())
Example: writing "hello " through fd 1 and then "world\n" through dup(1) produces:
"hello world" (sequential writes)
When offsets are shared:
- Derived from same original fd by fork()/dup()
When offsets are NOT shared:
- Separate open() calls, even for same file
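The three cases can be compared side by side by writing to a regular file instead of the console. The sketch assumes POSIX (note O_CREAT where xv6 spells it O_CREATE), and the three function names are hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Child and parent write through the same inherited fd: one shared offset.
int write_via_fork(const char *path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    pid_t pid = fork();
    if (pid < 0) { close(fd); return -1; }
    if (pid == 0)
        _exit(write(fd, "hello ", 6) == 6 ? 0 : 1);     // offset 0 -> 6
    int status;
    int ok = waitpid(pid, &status, 0) == pid            // let the child write first
             && WIFEXITED(status) && WEXITSTATUS(status) == 0
             && write(fd, "world\n", 6) == 6;           // continues at offset 6
    close(fd);
    return ok ? 0 : -1;
}

// Two descriptors related by dup(): still one shared offset.
int write_via_dup(const char *path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    int copy = dup(fd);
    if (copy < 0) { close(fd); return -1; }
    int ok = write(fd, "hello ", 6) == 6 && write(copy, "world\n", 6) == 6;
    close(fd);
    close(copy);
    return ok ? 0 : -1;
}

// Two separate open() calls: independent offsets, so the second write
// also starts at offset 0 and overwrites the first.
int write_via_two_opens(const char *path) {
    int a = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (a < 0) return -1;
    int b = open(path, O_WRONLY);
    if (b < 0) { close(a); return -1; }
    int ok = write(a, "hello ", 6) == 6 && write(b, "world\n", 6) == 6;
    close(a);
    close(b);
    return ok ? 0 : -1;
}
```

After either of the first two functions the file holds "hello world\n". After the third it holds only "world\n", because each open() created its own offset.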
Error Redirection
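Command: ls existing non-existing > tmp1 2>&1
Meaning:
- > tmp1 - Redirect stdout (fd 1) to tmp1
- 2>&1 - Redirect stderr (fd 2) to duplicate of fd 1
Result: both the listing of existing and the error message for non-existing end up in tmp1.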
Pipes
Pipe Concept
Definition: Kernel buffer with two file descriptors
- One fd for reading
- One fd for writing
Creation:
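A minimal sketch of creating and using a pipe inside a single process, assuming POSIX headers. The name pipe_round_trip is hypothetical, and the message is assumed to be small enough to fit in the kernel's pipe buffer; a larger write would block with no reader running.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: send msg through a fresh pipe and read it back.
// Returns the number of bytes read (0 means EOF), or -1 on error.
ssize_t pipe_round_trip(const char *msg, char *out, size_t cap) {
    int p[2];
    if (pipe(p) < 0)                // p[0] = read end, p[1] = write end
        return -1;
    size_t len = strlen(msg);
    ssize_t n = -1;
    if (write(p[1], msg, len) == (ssize_t)len) {
        close(p[1]);                // no writers left: EOF follows the data
        p[1] = -1;
        n = read(p[0], out, cap);
    }
    if (p[1] >= 0)
        close(p[1]);
    close(p[0]);
    return n;
}
```

Passing an empty message makes read() return 0 immediately, because the only write end has already been closed.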
Pipe Example - Running wc
- Parent creates pipe
- Fork creates child with copy of pipe fds
- Child redirects stdin to pipe read end
- Child closes both pipe fds (already has copy at fd 0)
- Parent closes read end (won’t read)
- Parent writes data to pipe
- Parent closes write end (signals EOF)
- Child reads from stdin (pipe), processes, exits
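These steps might look like the sketch below. It assumes POSIX execvp() and waitpid() rather than xv6's calls, takes the program as a parameter instead of hard-coding wc, and assumes the data is small enough to fit in the pipe buffer. The name feed_child is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run argv with `data` arriving on its standard input
// through a pipe. Returns the program's exit status, or -1 on failure.
int feed_child(const char *data, char *const argv[]) {
    int p[2];
    if (pipe(p) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        close(0);
        if (dup(p[0]) != 0)         // lowest free fd is 0: stdin is now the pipe
            _exit(126);
        close(p[0]);                // fd 0 already refers to the read end
        close(p[1]);                // otherwise the child never sees EOF
        execvp(argv[0], argv);
        _exit(127);
    }
    close(p[0]);                    // parent will not read
    size_t len = strlen(data);
    ssize_t w = write(p[1], data, len);
    close(p[1]);                    // signals EOF to the child
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return w == (ssize_t)len ? WEXITSTATUS(status) : -1;
}
```

With argv set to {"wc", NULL}, the child runs wc on the text the parent wrote, matching the example this section describes.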
[!IMPORTANT] Critical: Why close pipe fds?
- Child must close write end before exec()
- Otherwise wc would never see EOF (it has write end open)
- If ANY process has write end open, read won’t return EOF
Pipe Blocking Behavior
When reading from pipe:
- If data available: returns data
- If no data AND write end open: blocks (waits for data)
- If no data AND all write ends closed: returns 0 (EOF)
Shell Pipeline Implementation
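Command: grep fork sh.c | wc -l
Process tree: the shell forks one child for each side of the pipe.
- Create pipe
- Fork for grep
  - Redirect stdout to pipe write end
  - exec grep
- Fork for wc
  - Redirect stdin to pipe read end
  - exec wc
- Close both pipe ends in the shell
- Wait for both children
Longer pipelines, such as a | b | c, form a tree of processes:
- Leaves = commands
- Interior nodes = processes managing pipes (each waits for its left and right children)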
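A two-command pipeline can be sketched as below. This is not xv6's sh.c: it assumes POSIX, uses dup2() as a shortcut for the close-then-dup idiom (xv6 has no dup2), and the names spawn_on_pipe and run_pipeline are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child whose descriptor std_fd (0 or 1) is replaced by pipe_fd.
static pid_t spawn_on_pipe(int pipe_fd, int std_fd, const int p[2],
                           char *const argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        if (dup2(pipe_fd, std_fd) < 0)
            _exit(126);
        if (p[0] != std_fd) close(p[0]);    // drop the original pipe fds so
        if (p[1] != std_fd) close(p[1]);    // the reader can reach EOF
        execvp(argv[0], argv);
        _exit(127);
    }
    return pid;
}

// Hypothetical helper: run `left | right`.
// Returns the exit status of `right`, or -1 on failure.
int run_pipeline(char *const left[], char *const right[]) {
    int p[2];
    if (pipe(p) < 0)
        return -1;
    pid_t l = spawn_on_pipe(p[1], 1, p, left);      // stdout -> write end
    pid_t r = spawn_on_pipe(p[0], 0, p, right);     // stdin  <- read end
    close(p[0]);                                    // the shell must close both
    close(p[1]);                                    // ends as well
    int status, result = -1;
    if (l > 0)
        waitpid(l, &status, 0);
    if (r > 0 && waitpid(r, &status, 0) == r && WIFEXITED(status))
        result = WEXITSTATUS(status);
    return result;
}
```

If the parent skipped its two close() calls, the right-hand command would block forever waiting for EOF, because a write end would still be open.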
Pipes vs Temporary Files
Comparison of a pipeline with the same job done through a temporary file.
Pipe advantages:
- Auto cleanup - No temp files to delete
- Arbitrarily long streams - Data need not fit on disk
- Parallel execution - Both programs run simultaneously
Temporary file drawbacks:
- Must manually clean up
- Need disk space for all data
- Sequential execution (first program must finish)
File System
Structure
Hierarchical organization:
- Tree of directories
- Root directory: /
- Data files: uninterpreted byte arrays
- Directories: named references to files/directories
Paths:
- /a/b/c - Absolute path from root
- a/b/c - Relative to current directory
Directory Navigation
Both open same file:
Method 1 (chdir, then a relative path):
- Changes process current directory to /a, then /a/b
- Opens c relative to /a/b
Method 2 (absolute path):
- No directory change
- Opens /a/b/c directly by absolute path
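One way to check that both methods reach the same file is to compare inode numbers. The sketch assumes POSIX; opens_same_file is a hypothetical name, and it changes directory in one step instead of two.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: open dir/name by absolute path and again by
// chdir(dir) plus a relative path, then compare the two results.
// Returns 1 if they are the same file, 0 if not, -1 on error.
// Side effect: the process's current directory becomes dir.
int opens_same_file(const char *dir, const char *name) {
    char abs_path[4096];
    struct stat s1, s2;
    int n = snprintf(abs_path, sizeof abs_path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof abs_path)
        return -1;

    int fd1 = open(abs_path, O_RDONLY);     // absolute path, no directory change
    if (fd1 < 0)
        return -1;
    int fd2 = -1;
    if (chdir(dir) == 0)
        fd2 = open(name, O_RDONLY);         // relative to the new current directory
    int result = -1;
    if (fd2 >= 0 && fstat(fd1, &s1) == 0 && fstat(fd2, &s2) == 0)
        result = s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino;
    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    return result;
}
```

The current directory is per-process state, so the chdir() here affects every later relative path used by this process.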
Creating Files and Directories
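mkdir(dir) creates a directory, open(file, O_CREATE) creates a data file, and mknod(file, major, minor) creates a device file:
- First number = major device number
- Second number = minor device number
- Together uniquely identify kernel device
- read()/write() calls diverted to device driver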
Inodes and Links
Inode = actual file data structure
Contains:
- File type (file, directory, device)
- File length
- Location of content on disk
- Number of links to file
[!IMPORTANT] Key insight: File name ≠ file itself
- One inode can have multiple names (links)
- Each link is directory entry: name + inode reference
Example: creating a file "a" and then calling link("a", "b") gives:
- One inode
- Two names: "a" and "b"
- Both refer to same content
- Reading/writing "a" = reading/writing "b"
stat Structure
Continuing the example, stat() or fstat() on either name reports:
- Both "a" and "b" have same ino
- nlink = 2
unlink() - Removing Names
Behavior:
- Removes name from file system
- Decrements link count
- Inode and disk space freed ONLY when:
  - Link count = 0 AND
  - No file descriptors refer to it
Example: after unlink("a"), with "b" still linked:
- Name "a" removed
- File still accessible as "b"
- Inode and content unchanged, apart from nlink dropping to 1
Built-in vs External Commands
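External (user-level programs): mkdir, ln, rm, ls, cat, and most other commands
Benefits:
- Anyone can add new commands
- No need to modify kernel/shell
- Easier to extend system
Built-in (implemented inside the shell): cd
[!WARNING]
Problem: cd must change SHELL’s directory, not child’s. If cd ran as an external program, the shell would fork a child, the child would change its own directory and exit, and the shell’s directory would stay the same.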
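The problem is easy to reproduce: a chdir() made in a forked child never reaches the parent. The sketch assumes POSIX, and cwd_changes_after_child_cd is a hypothetical name.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: let a child process chdir(dir), then report whether
// the parent's current directory changed.
// Returns 0 if unchanged, 1 if changed, -1 on error.
int cwd_changes_after_child_cd(const char *dir) {
    char before[4096], after[4096];
    if (getcwd(before, sizeof before) == NULL)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(chdir(dir) == 0 ? 0 : 1);     // changes only the child's directory
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)
        || WEXITSTATUS(status) != 0)
        return -1;
    if (getcwd(after, sizeof after) == NULL)
        return -1;
    return strcmp(before, after) != 0;
}
```

The result is 0 for any valid directory, which is why a shell has to call chdir() in its own process when it sees cd.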
System Call Flow
Understanding how system calls work is crucial for OS internals.
Tracing fork() from User to Kernel
Step 1: User Program Calls fork()
Real World Context
Unix Philosophy
“Software tools” culture:
- Small programs doing one thing well
- Combine via pipes
- Shell as “scripting language”
Enabled by:
- Standard file descriptors (0, 1, 2)
- Pipes for composition
- Simple but powerful shell syntax
POSIX Standard
Purpose: Standardize Unix system call interface
xv6 vs POSIX:
- xv6 NOT POSIX compliant
- Missing: lseek(), many other calls
- Different implementations of existing calls
- Goal: Simplicity for teaching
Modern Unix systems:
- Many more system calls
- Networking, windowing, threads
- Many device drivers
- Continuous evolution
Alternative Designs
Plan 9:
- Extended “everything is a file” concept
- Applied to networks, graphics
- Most Unix systems didn’t follow
Multics:
- Predecessor of Unix
- Files looked like memory
- Very different interface
- Complexity influenced Unix designers to simplify
Lesson:
- Different abstractions possible
- File descriptors not only solution
- Unix model proven very successful
xv6 Limitations
No user protection:
- All processes run as root
- No isolation between users
- Teaching-focused tradeoff
Limited features:
- Minimal system calls
- No networking
- No windowing
- Basic functionality only
Key Takeaways
Design Principles
Simple Interfaces
Few powerful mechanisms that combine for generality
Hardware Abstraction
Hide hardware details behind clean interfaces
Process Isolation
Processes isolated from each other for safety
Controlled Communication
Explicit mechanisms for safe interaction
Core Abstractions
- Processes - Unit of execution with isolated memory
- File descriptors - Unified I/O interface for files, pipes, devices
- Pipes - Communication channels between processes
- File system - Persistent storage hierarchy with inodes and links
Why These Abstractions Work
File descriptors:
- Hide differences (files, pipes, devices)
- Enable I/O redirection
- Simple but powerful
Separate fork() and exec():
- Enables I/O redirection
- Shell controls child environment
- Parent unaffected
Pipes:
- Clean inter-process communication
- Better than temp files
- Enable parallel execution
Inodes and links:
- Separate names from content
- Multiple names for same file
- Automatic cleanup when unused
Getting Started with xv6
Installation (Ubuntu/Debian)
First Commands to Try
Interview Relevance
System Design Questions
Understanding OS interfaces helps you:
- Design better APIs and abstractions
- Understand performance implications
- Make informed technology choices
- Debug production issues
Common Interview Topics
- How does fork() work? What’s copy-on-write?
- Explain file descriptors and I/O redirection
- How do pipes work? When to use them?
- What’s the difference between hard links and symbolic links?
- How does the shell implement pipelines?
- Explain the system call mechanism
Senior-Level Expectations
- Understand trade-offs in OS design
- Explain why Unix chose these abstractions
- Compare with alternative designs
- Discuss performance implications
- Debug issues at the system call level
Resources
Official Materials
- xv6 Book - Complete guide
- xv6 Source Code - GitHub repository
- MIT 6.S081 - OS Engineering course
Next Steps
Process Management
Deep dive into process lifecycle, scheduling, and context switching
Virtual Memory
Understand paging, page tables, and memory management
File Systems
Learn about inodes, directories, and file system implementation
Linux Internals
Explore how these concepts scale in production systems
[!TIP] Learning Path: Start by reading the xv6 book alongside the source code. Try modifying xv6 to add features or change behavior. This hands-on experience is invaluable for truly understanding operating systems.