Skip to main content

Documentation Index

Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt

Use this file to discover all available pages before exploring further.

Overview

Understanding computer science fundamentals helps you write better code, debug efficiently, and make informed architectural decisions. These concepts form the foundation for system design, performance optimization, and debugging complex issues. When a production database query suddenly takes 30 seconds instead of 30 milliseconds, it is your understanding of indexing, memory hierarchy, and I/O that lets you diagnose the problem in minutes rather than hours. When a distributed system silently drops writes under load, it is your grasp of CAP theorem and consistency models that tells you where to look.
Why This Matters: In interviews at top tech companies, 40-60% of system design questions require deep understanding of these fundamentals. More importantly, these concepts are the difference between an engineer who builds systems that survive contact with production and one whose systems collapse under real-world conditions.

Operating Systems

Process vs Thread

Process

  • Independent execution unit
  • Own memory space (isolated)
  • Higher overhead to create (~10ms)
  • Inter-process communication needed
  • Crash isolation (one crash doesn’t affect others)

Thread

  • Lightweight execution unit
  • Shared memory with parent process
  • Lower overhead to create (~1ms)
  • Direct memory sharing
  • One thread crash can crash entire process

When to Use Processes vs Threads

Think of processes as separate apartments in a building — each has its own kitchen, bathroom, and walls. Residents (data) are fully isolated. Threads are like roommates sharing one apartment — they share the kitchen and bathroom (memory), which is efficient but requires coordination so nobody walks in on someone else.
ScenarioUse ProcessUse Thread
Isolation needed✅ Browser tabs (Chrome uses one process per tab so a crashing tab does not take down the whole browser)
Shared state✅ Web server workers sharing a connection pool
CPU-bound tasks✅ Parallel processing (bypasses Python’s GIL)✅ (with limitations in languages with a GIL)
I/O-bound tasks✅ Database connections, API calls, file reads
Fault tolerance✅ Microservices, worker processes
# Process example (Python) -- use for CPU-heavy work.
# Each process gets its own Python interpreter and memory space,
# completely bypassing the Global Interpreter Lock (GIL).
from multiprocessing import Process

def cpu_intensive_task(n):
    """Squaring 10M numbers -- pure computation, no I/O waiting."""
    return sum(i * i for i in range(n))

# Creates a separate OS-level process with isolated memory.
# More expensive to create (~10ms) but achieves true parallelism.
p = Process(target=cpu_intensive_task, args=(10000000,))
p.start()
p.join()  # Block until the child process finishes

# Thread example -- use for I/O-heavy work (network calls, disk reads).
# While one thread waits for a network response, other threads can
# execute -- this is called "concurrency" rather than "parallelism."
from threading import Thread

def io_bound_task(url):
    """Network call -- the thread mostly waits for the server to respond."""
    response = requests.get(url)
    return response.text

# Shares memory with the main thread -- cheaper to create (~1ms)
# but be careful: shared memory means potential race conditions.
t = Thread(target=io_bound_task, args=("https://api.example.com",))
t.start()
t.join()
Practical tip: In Python specifically, use multiprocessing for CPU-bound work (math, image processing, data crunching) and threading or asyncio for I/O-bound work (API calls, database queries, file operations). Getting this wrong is a common source of confusing performance results.

Memory Management

┌─────────────────────────────────────┐
│           Virtual Memory            │
├─────────────────────────────────────┤
│  Stack (grows down)                 │
│    ↓  Local variables, function    │
│       calls, return addresses       │
│  ... (free space) ...               │
│    ↑  Dynamically allocated        │
│       memory (malloc, new)          │
│  Heap (grows up)                    │
├─────────────────────────────────────┤
│  BSS (uninitialized data)           │
│    Global/static vars (zero-init)   │
├─────────────────────────────────────┤
│  Data (initialized data)            │
│    Global/static vars with values   │
├─────────────────────────────────────┤
│  Text (code)                        │
│    Executable instructions          │
└─────────────────────────────────────┘

Stack vs Heap Memory

Think of stack memory like a stack of cafeteria trays — you can only add to the top and remove from the top (LIFO), but it is blazingly fast. Heap memory is like a warehouse — you can store anything anywhere, but finding and managing free space takes more effort.
AspectStackHeap
AllocationAutomatic (LIFO) — allocated when function is called, freed on returnManual (malloc/new) — you ask for memory, you free it
SpeedVery fast (just move a pointer)Slower (must search for free blocks, causes fragmentation)
SizeLimited (~1-8 MB per thread)Limited by available RAM (GBs)
ScopeFunction-local — dies when function returnsGlobal access via pointers — lives until explicitly freed
Thread SafetyEach thread has its own stack — inherently safeShared across threads — needs synchronization (locks, atomics)
Memory ErrorsStack overflow (infinite recursion, huge arrays)Memory leaks, dangling pointers, double-free, use-after-free
// Stack allocation - fast, automatic cleanup.
// The compiler knows exactly how much space to reserve at compile time.
void stackExample() {
    int x = 10;           // Stack: 4 bytes, allocated instantly
    int arr[100];         // Stack: 400 bytes, fixed at compile time
}  // ALL memory freed automatically when function returns -- no cleanup needed

// Heap allocation - flexible but you own the lifecycle.
// Use when: size unknown at compile time, data must outlive the function,
// or the allocation is too large for the stack (>1MB).
void heapExample() {
    int* ptr = malloc(sizeof(int) * 1000);  // Heap: ~4KB, allocated at runtime
    if (ptr == NULL) return;                 // Always check -- malloc can fail!
    // ... use ptr ...
    free(ptr);  // MUST manually free! Forgetting = memory leak.
    ptr = NULL; // Good practice: prevent use-after-free bugs
}
Why this matters in interviews: Languages like Java, Python, and Go have garbage collectors that handle heap cleanup automatically, but understanding manual memory management helps you reason about performance, understand why certain data structures are faster than others, and debug memory-related issues in production (memory leaks in long-running services are a top operational headache).

Concurrency Fundamentals

Deadlock Conditions (All 4 must be present)

A deadlock is like two people meeting in a narrow hallway — each insists the other go first, and neither moves. In software, this happens when threads hold resources and wait for each other’s resources in a cycle. The four Coffman conditions must ALL be present simultaneously for a deadlock to occur — break any one condition and deadlock becomes impossible.
  1. Mutual Exclusion: Resource can’t be shared (only one thread can hold a lock at a time)
  2. Hold and Wait: Process holds resource while waiting for another (holding lock A, waiting for lock B)
  3. No Preemption: Can’t force process to release resource (no timeout, no forced unlock)
  4. Circular Wait: Circular chain of processes waiting (A waits for B, B waits for A)
# Deadlock example -- this WILL hang indefinitely under the right timing.
# Thread 1 grabs lock_a, Thread 2 grabs lock_b, then both wait forever.
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def thread_1():
    lock_a.acquire()        # Thread 1 holds lock_a
    time.sleep(0.1)         # Context switch happens here...
    lock_b.acquire()        # Thread 1 waits for lock_b (held by thread_2) -- STUCK
    # ...
    
def thread_2():
    lock_b.acquire()        # Thread 2 holds lock_b
    time.sleep(0.1)         # Context switch happens here...
    lock_a.acquire()        # Thread 2 waits for lock_a (held by thread_1) -- STUCK
    # ...

# Prevention strategy 1: Lock ordering -- always acquire locks in the SAME
# global order across ALL threads. This breaks the "circular wait" condition.
def thread_safe():
    lock_a.acquire()  # Always acquire A first, everywhere in the codebase
    lock_b.acquire()  # Then B -- no thread can hold B without already holding A
    # ...
    lock_b.release()  # Release in reverse order (good practice, not strictly required)
    lock_a.release()

# Prevention strategy 2: Use context managers with timeouts
# so a thread gives up instead of waiting forever.
def thread_with_timeout():
    if lock_a.acquire(timeout=5):
        if lock_b.acquire(timeout=5):
            try:
                pass  # Do work
            finally:
                lock_b.release()
        lock_a.release()

Race Condition

A race condition occurs when the correctness of your program depends on the timing of thread execution — like two people editing the same document simultaneously without seeing each other’s changes. The result depends on who saves last.
# Race condition example -- the classic "lost update" problem.
counter = 0

def increment():
    global counter
    for _ in range(100000):
        # counter += 1 looks atomic but is actually THREE operations:
        #   1. READ current value of counter into a register
        #   2. ADD 1 to the register
        #   3. WRITE the new value back to counter
        # If two threads both READ the same value (say, 42), they both
        # write 43 -- and you lose one increment entirely.
        counter += 1  # NOT atomic! Read-modify-write

# Two threads, expected: 200000, actual: some random lower number
# Run it 10 times and you will get a different wrong answer each time.

# Fix with Lock -- guarantees mutual exclusion around the critical section.
lock = threading.Lock()

def safe_increment():
    global counter
    for _ in range(100000):
        with lock:           # Only one thread can execute this block at a time
            counter += 1     # Now the read-modify-write is protected
Real-world example: Double-spending in payment systems is a race condition. Two near-simultaneous purchase requests both read the same balance (100),bothapprove(100), both approve (80 purchase), and both write the new balance (20)buttheuserspent20) -- but the user spent 160 total. This is why financial systems use database transactions with proper isolation levels.

Key Concepts

ConceptDescriptionInterview Relevance
DeadlockCircular wait for resourcesHigh
Race ConditionUnpredictable behavior from timingVery High
MutexBinary lock (0 or 1)High
SemaphoreCounter-based lock (0 to N)High
Virtual MemoryAbstraction over physical memoryMedium
Context SwitchingSaving/restoring process stateMedium
Page FaultAccessing memory not in RAMMedium
ThrashingExcessive paging, system slowdownMedium

Process Scheduling Algorithms

AlgorithmDescriptionProsCons
FCFSFirst Come First ServedSimpleConvoy effect
SJFShortest Job FirstOptimal avg waitStarvation possible
Round RobinTime quantum rotationFairHigh context switching
PriorityHigher priority firstImportant tasks fastStarvation
Multilevel QueueMultiple queues with priorityFlexibleComplex

Networking

OSI Model (Simplified)

The OSI model is like the postal system for the internet. Layer 1 is the road the mail truck drives on. Layer 2 is the local post office that knows about addresses on its street. Layer 3 is the routing system that figures out which city to send the mail to. Layer 4 ensures the package arrives intact (or decides speed matters more than guaranteed delivery). Layer 7 is the content of the letter itself.
Layer 7: Application  (HTTP, FTP, DNS, SMTP)
        └── What the user interacts with -- "I want this web page"
        
Layer 4: Transport    (TCP, UDP)
        └── End-to-end communication, port numbers, reliability decisions
        
Layer 3: Network      (IP, ICMP, Routing)
        └── Logical addressing (IP addresses), routing across networks
        
Layer 2: Data Link    (Ethernet, MAC, Switches)
        └── Physical addressing (MAC addresses), local network delivery
        
Layer 1: Physical     (Cables, Signals, Hubs)
        └── Raw bit transmission over copper, fiber, or radio waves
Practical tip: In interviews and debugging, you mostly care about Layers 4 and 7. “Is this a TCP connection issue or an HTTP-level error?” is the first diagnostic question you should ask when a network call fails.

TCP Three-Way Handshake

Client                    Server
  │                         │
  │───── SYN (seq=x) ──────►│  1. Client initiates
  │                         │
  │◄── SYN-ACK (seq=y,      │  2. Server acknowledges
  │     ack=x+1) ───────────│     and sends its own SYN
  │                         │
  │───── ACK (ack=y+1) ────►│  3. Client acknowledges
  │                         │
  │◄═══ Connection ════════►│  4. Ready to transfer data

TCP vs UDP

TCP

  • Connection-oriented (handshake)
  • Reliable delivery (ACKs, retransmission)
  • Ordered packets (sequence numbers)
  • Flow control (sliding window)
  • Congestion control
  • Use: Web, Email, File transfer, SSH

UDP

  • Connectionless (no handshake)
  • Best-effort delivery (no guarantees)
  • No ordering guarantee
  • No flow/congestion control
  • Lower latency, less overhead
  • Use: Gaming, Streaming, DNS, VoIP

DNS Resolution Process

DNS is the internet’s phone book — it translates human-readable names (google.com) into machine-readable IP addresses (142.250.80.46). The resolution process works like asking for directions: you ask increasingly specific authorities until someone knows the exact answer.
1. Browser Cache      → Check local browser cache (fastest, ~0ms)
2. OS Cache           → Check operating system DNS cache (~1ms)
3. Resolver Cache     → ISP's recursive resolver cache (~5ms)
4. Root Server        → "Who handles .com?" (13 root server clusters worldwide)
5. TLD Server         → "Who handles example.com?" (Verisign for .com)
6. Authoritative NS   → "example.com = 93.184.216.34" (the definitive answer)
Most lookups resolve at step 1-3 (caches). The full 6-step resolution only happens for domains never visited before or after TTL expiry. A typical full resolution takes 50-200ms — which is why DNS prefetching and caching are critical performance optimizations.
# Trace DNS resolution
nslookup example.com
dig example.com +trace

HTTP/1.1 vs HTTP/2 vs HTTP/3

FeatureHTTP/1.1HTTP/2HTTP/3
ConnectionMultiple TCPSingle TCP, multiplexedQUIC (UDP-based)
Head-of-line blockingYesAt TCP levelNo
Header compressionNoHPACKQPACK
Server pushNoYesYes
EncryptionOptionalPractically requiredRequired (TLS 1.3)

HTTP Methods & Status Codes

# Common HTTP Methods
GET     # Retrieve resource (idempotent, cacheable)
POST    # Create resource (not idempotent)
PUT     # Update/Replace resource (idempotent)
PATCH   # Partial update (not necessarily idempotent)
DELETE  # Remove resource (idempotent)
HEAD    # GET without body (check if resource exists)
OPTIONS # Supported methods (CORS preflight)

# Status Code Categories
1xx  # Informational (100 Continue)
2xx  # Success (200 OK, 201 Created, 204 No Content)
3xx  # Redirection (301 Moved Permanently, 302 Found, 304 Not Modified)
4xx  # Client Error (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests)
5xx  # Server Error (500 Internal, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout)

WebSocket vs HTTP

HTTP: Request-Response (Half-duplex)
Client ───Request──► Server
Client ◄──Response── Server
Client ───Request──► Server
Client ◄──Response── Server

WebSocket: Full-duplex, persistent connection
Client ◄═════════════► Server
       │ Real-time    │
       │ bidirectional│
       │ messages     │

Database Fundamentals

ACID Properties

ACID is the set of guarantees that makes databases trustworthy for critical operations. Without ACID, a power failure during a bank transfer could debit your account without crediting the recipient — money literally vanishes. Every relational database provides these guarantees; the question is which trade-offs you accept (see isolation levels below).
PropertyDescriptionExample
AtomicityAll or nothing — partial work is rolled backBank transfer: debit AND credit succeed or both fail. Never “money disappeared.”
ConsistencyDatabase moves from one valid state to anotherBalance can’t go negative, foreign keys always reference existing rows
IsolationConcurrent transactions behave as if run sequentiallyTwo users buying the last item: only one succeeds, the other sees “out of stock”
DurabilityCommitted data survives crashes, power failuresWrite-ahead logging (WAL) ensures data on disk before confirming commit

Transaction Isolation Levels

LevelDirty ReadNon-Repeatable ReadPhantom ReadPerformance
Read Uncommitted✅ Possible✅ Possible✅ PossibleFastest
Read Committed❌ Prevented✅ Possible✅ PossibleFast
Repeatable Read❌ Prevented❌ Prevented✅ PossibleMedium
Serializable❌ Prevented❌ Prevented❌ PreventedSlowest
-- Set isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;
-- Your queries here
COMMIT;

SQL vs NoSQL

SQL (Relational)

  • Structured schema (tables, rows, columns)
  • ACID compliant
  • Complex queries (JOINs, aggregations)
  • Vertical scaling (scale up)
  • Examples: PostgreSQL, MySQL, Oracle
  • Use: Transactions, Analytics, Complex relationships

NoSQL

  • Flexible schema (documents, key-value, graphs)
  • BASE (Eventually consistent)
  • Simple queries (by key/ID)
  • Horizontal scaling (scale out)
  • Examples: MongoDB, Redis, Cassandra, Neo4j
  • Use: Real-time, Big Data, Caching, Social graphs

NoSQL Types

TypeData ModelUse CaseExample
Key-ValueKey → ValueCaching, SessionsRedis, DynamoDB
DocumentJSON documentsContent management, CatalogsMongoDB, Couchbase
Column-FamilyWide columnsTime-series, AnalyticsCassandra, HBase
GraphNodes + EdgesSocial networks, RecommendationsNeo4j, Amazon Neptune

Indexing Deep Dive

A database index works like a book’s index: instead of reading every page to find “PostgreSQL,” you look up “P” in the index and jump straight to page 247. Without an index, the database performs a full table scan — reading every single row. On a table with 10 million rows, this is the difference between 5ms and 5 seconds.
-- B-Tree Index (default for most DBs)
-- Stores data in a balanced tree structure -- O(log n) lookups.
-- Good for: range queries, sorting, equality, prefix matching.
-- Think of it as a phone book: sorted by name, fast to find "Smith."
CREATE INDEX idx_user_email ON users(email);

-- Hash Index
-- Stores a hash of the key for O(1) lookups -- blazing fast for exact matches.
-- Good for: equality comparisons ONLY (not range queries, not sorting).
-- Think of it as a dictionary: instant lookup, but you cannot browse alphabetically.
CREATE INDEX idx_user_id ON users USING HASH(id);

-- Composite Index (column order matters enormously!)
-- Follows the "leftmost prefix" rule -- the index is usable for queries that
-- filter on the first column, or the first two columns, or all three.
-- Think of it like a phone book sorted by (last_name, first_name, city).
CREATE INDEX idx_orders ON orders(user_id, status, created_at);
-- ✅ Uses index: WHERE user_id = 1                     (leftmost column)
-- ✅ Uses index: WHERE user_id = 1 AND status = 'pending'     (left two)
-- ✅ Uses index: WHERE user_id = 1 AND status = 'pending' AND created_at > '2024-01-01'
-- ❌ Skips index: WHERE status = 'pending'              (missing leftmost column!)
-- ❌ Skips index: WHERE user_id = 1 AND created_at > '2024-01-01' (gap in middle!)

-- Covering Index (includes all needed columns)
CREATE INDEX idx_users_covering ON users(email) INCLUDE (name, created_at);
-- Query can be answered entirely from index, no table lookup needed

-- Partial Index (index subset of rows)
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';

When NOT to Use Indexes

Indexes are not free — every index slows down writes because the database must update both the table and the index on every INSERT, UPDATE, or DELETE. A table with 10 indexes means 11 writes for every data change. Know when to skip them:
  • High write tables: Each index adds write overhead. A logging table with millions of inserts/second should have minimal indexes.
  • Small tables (under ~1,000 rows): Full table scan may be faster than the overhead of reading the index then jumping to the table.
  • Low cardinality columns: A boolean is_active column has only 2 values — an index barely narrows the search. Exception: if 99% of rows are false and you query for true, a partial index is useful.
  • Frequently updated columns: Every update rewrites the index entry. A last_seen_at timestamp updated on every request is a poor indexing candidate.

Database Normalization

Normal FormRuleExample Violation
1NFNo repeating groups, atomic valuesphone: "123,456,789"
2NFNo partial dependenciesNon-key depends on part of composite key
3NFNo transitive dependencieszipcitystate
BCNFEvery determinant is a candidate keyMore strict 3NF
-- Denormalized (violation of 1NF)
CREATE TABLE orders (
    id INT,
    products VARCHAR(255)  -- "laptop,mouse,keyboard"
);

-- Normalized
CREATE TABLE orders (id INT PRIMARY KEY);
CREATE TABLE order_items (
    order_id INT REFERENCES orders(id),
    product_id INT REFERENCES products(id),
    quantity INT
);

Computer Architecture

CPU Cache Hierarchy

┌─────────────────────────────────────────────┐
│                   CPU                       │
│  ┌─────────────────────────────────────┐   │
│  │           Core                      │   │
│  │  ┌─────────────────────────────┐   │   │
│  │  │   L1 Cache (32-64KB)       │   │   │  ~1ns
│  │  │   Fastest, per-core        │   │   │
│  │  └─────────────────────────────┘   │   │
│  │  ┌─────────────────────────────┐   │   │
│  │  │   L2 Cache (256KB-1MB)     │   │   │  ~3-10ns
│  │  │   Per-core                 │   │   │
│  │  └─────────────────────────────┘   │   │
│  └─────────────────────────────────────┘   │
│  ┌─────────────────────────────────────┐   │
│  │   L3 Cache (8-64MB)                │   │  ~10-40ns
│  │   Shared across cores              │   │
│  └─────────────────────────────────────┘   │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   RAM (8-128GB+)                           │  ~100ns
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   SSD (~1TB)                               │  ~100μs
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   HDD (~10TB)                              │  ~10ms
└─────────────────────────────────────────────┘

Latency Numbers Every Programmer Should Know

OperationTime
L1 cache reference0.5 ns
Branch mispredict5 ns
L2 cache reference7 ns
Mutex lock/unlock25 ns
Main memory reference100 ns
Compress 1KB with Zippy3,000 ns (3 μs)
Send 1KB over 1 Gbps network10,000 ns (10 μs)
Read 4KB randomly from SSD150,000 ns (150 μs)
Round trip within datacenter500,000 ns (0.5 ms)
Read 1 MB sequentially from SSD1,000,000 ns (1 ms)
Disk seek10,000,000 ns (10 ms)
Read 1 MB sequentially from disk20,000,000 ns (20 ms)
Send packet CA → Netherlands → CA150,000,000 ns (150 ms)

Distributed Systems Basics

CAP Theorem

In a distributed system, you can only guarantee 2 of 3: Consistency, Availability, Partition Tolerance
In practice, network partitions WILL happen (cables get cut, switches fail, cloud regions go down), so you are really choosing between Consistency and Availability during a partition. The question is not “CP or AP?” but rather “when the network splits, should the system refuse requests (consistent but unavailable) or serve potentially stale data (available but inconsistent)?” Banks choose consistency (better to reject a transaction than process it twice). Social media feeds choose availability (showing a slightly stale feed is better than showing an error page).
         Consistency
            /\
           /  \
          /    \
         /  CA  \
        /________\
       /\        /\
      /  \  CP  /  \
     / AP \    /    \
    /______\  /______\
Availability  Partition
              Tolerance
ChoiceTrade-offExamples
CPSacrifice availability for consistencyMongoDB, HBase, Redis Cluster
APSacrifice consistency for availabilityCassandra, DynamoDB, CouchDB
CANot practical (network partitions happen)Traditional RDBMS (single node)

Consistency Models

Understanding consistency models is critical because they define what “correct” means in your distributed system. Choosing the wrong model leads to bugs that are nearly impossible to reproduce in development but happen constantly in production.
ModelDescriptionUse CaseReal-World Analogy
StrongAll reads see the latest write, everywhere, immediatelyBanking, Inventory countsA shared whiteboard where everyone instantly sees every change
EventualReads will eventually see the latest write (seconds or minutes of lag)Social media feeds, DNSA memo distributed to all offices — eventually everyone gets it
CausalRespects cause-effect ordering (if A caused B, everyone sees A before B)Messaging apps, commentsReply always appears after the message it replies to
Read-your-writesA user always sees their own writes immediately (others may lag)User profile updatesYou see your own post instantly, friends see it seconds later

Practice Questions

  1. Browser checks caches: Browser cache, OS cache, router cache
  2. DNS resolution: Resolve domain to IP address
    • Check local DNS cache
    • Query recursive DNS resolver
    • Query root → TLD → authoritative nameserver
  3. TCP connection: 3-way handshake (SYN → SYN-ACK → ACK)
  4. TLS handshake (if HTTPS):
    • Client Hello (supported ciphers)
    • Server Hello + Certificate
    • Key exchange
    • Encrypted connection established
  5. HTTP request sent: GET / HTTP/1.1 with headers
  6. Server processes request:
    • Load balancer routes to server
    • Server processes and queries database
    • Generates response
  7. HTTP response received: HTML content with status code
  8. Browser renders:
    • Parse HTML → DOM tree
    • Parse CSS → CSSOM
    • Execute JavaScript
    • Render pixels to screen
Indexes are data structures (usually B-trees) that maintain sorted references to rows.B-Tree Structure:
  • Balanced tree with sorted keys
  • Each node can have multiple children
  • Leaf nodes contain pointers to actual rows
How lookup works:
  • Instead of scanning all rows O(n)
  • Binary search through tree O(log n)
  • Follow pointers to find matching rows
Trade-offs:
  • Faster reads (logarithmic lookup)
  • Slower writes (must update index)
  • Additional storage space
Deadlock occurs when: Four conditions are ALL met (Coffman conditions):
  1. Mutual exclusion - resource can’t be shared
  2. Hold and wait - holding one resource, waiting for another
  3. No preemption - can’t force release
  4. Circular wait - A waits for B, B waits for A
Prevention strategies:
  • Lock ordering: Always acquire locks in same global order
  • Lock timeout: Give up after waiting too long
  • Deadlock detection: Detect and kill one transaction
  • Try-lock: Non-blocking attempt, back off if fails
  • Avoid nested locks: Minimize lock scope
Process:
  • Independent execution unit with own memory space
  • Isolated - crash doesn’t affect other processes
  • Higher creation/switching overhead (~10ms)
  • Communication via IPC (pipes, sockets, shared memory)
Thread:
  • Lightweight execution unit within a process
  • Shared memory with parent and sibling threads
  • Lower creation/switching overhead (~1ms)
  • Direct memory sharing (but needs synchronization)
When to use which:
  • Processes: Isolation needed, fault tolerance, security
  • Threads: Shared state, I/O-bound tasks, lower overhead needed
TCP (Transmission Control Protocol):
  • Connection-oriented (handshake required)
  • Reliable (acknowledgments, retransmission)
  • Ordered (sequence numbers)
  • Flow control (sliding window)
  • Congestion control
  • Use: Web, email, file transfer, SSH
UDP (User Datagram Protocol):
  • Connectionless (no handshake)
  • Unreliable (no guarantees)
  • Unordered (packets may arrive out of order)
  • No flow/congestion control
  • Lower latency, less overhead
  • Use: Gaming, video streaming, DNS, VoIP
Trade-off: Reliability vs Speed
Read Uncommitted: Can see uncommitted changes (dirty reads)Read Committed: Only see committed changes, but same query may return different resultsRepeatable Read: Same query returns same results within transaction, but new rows may appear (phantoms)Serializable: Transactions execute as if sequential - no anomalies, but slowestProblems prevented by level:
  • Dirty read: See uncommitted data
  • Non-repeatable read: Same query, different results
  • Phantom read: New rows appear matching query

Quick Reference Card

Numbers to Remember

MetricValue
L1 cache~1 ns
L2 cache~10 ns
RAM~100 ns
SSD random read~150 μs
HDD seek~10 ms
Same datacenter RTT~0.5 ms
Cross-region RTT~100-150 ms

Common Ports

PortService
22SSH
80HTTP
443HTTPS
3306MySQL
5432PostgreSQL
6379Redis
27017MongoDB
Interview Tip: Be ready to explain these concepts with real-world examples. Interviewers love when you can relate theory to practical scenarios — “A deadlock is like two trains on a single track approaching each other: neither can move forward, and neither will back up.” The ability to use precise analogies demonstrates that you truly understand the concept, not just memorized the definition. Use the “explain like I’m 5” technique to show deep understanding, then layer in technical depth.

Interview Deep-Dive

Strong Answer:
  • First, I run EXPLAIN ANALYZE on the slow query to see the actual execution plan. I am looking for a Seq Scan (full table scan) where I expect an Index Scan. A table growing from 500K to 50M means a missing or ineffective index is the most likely cause — a full scan that was “fast enough” on 500K rows becomes catastrophic at 50M.
  • Second, I check if the query uses functions on indexed columns. A common trap: WHERE YEAR(created_at) = 2024 cannot use a B-tree index on created_at because the database must evaluate the function on every row. Rewriting it as WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01' lets the B-tree do its job.
  • Third, I check for composite index order issues. If the query filters on user_id and status but the composite index is (status, user_id), and most queries only provide user_id, the index is useless due to the leftmost prefix rule. Column order in composite indexes must match query patterns.
  • Fourth, I look at table bloat and vacuum status. In PostgreSQL, dead tuples from updates and deletes accumulate if autovacuum is not keeping up. A table with 50M live rows but 200M dead tuples means the database is scanning 250M rows worth of pages. Running VACUUM ANALYZE reclaims space and updates statistics so the query planner makes better decisions.
  • Fifth, I check if the query planner’s statistics are stale. After a large data load, the planner may estimate 1,000 rows when the actual result is 10 million, causing it to choose a nested loop join instead of a hash join. Running ANALYZE on the table updates the statistics.
Follow-up: The EXPLAIN output shows the index is being used, but the query is still slow. What else could be going on?Several possibilities: (1) The index scan returns too many rows — if the query matches 40 million of the 50 million rows, an index scan is actually slower than a sequential scan because of the random I/O pattern (jumping around the disk to fetch each row). The planner should detect this, but stale statistics can mislead it. (2) The query is doing a massive sort or aggregation after the index lookup. Check for “Sort Method: external merge Disk” in the EXPLAIN output — this means the sort spilled to disk because work_mem is too small. Increasing work_mem for this query can fix it. (3) Lock contention — the query is fast in isolation but slow in production because it is waiting for a lock held by a long-running transaction. Check pg_stat_activity for blocked queries. (4) The data is not in PostgreSQL’s shared buffer cache and every page fetch is a disk read. On a cold cache, 50M rows means significant I/O. If this query runs frequently, it will eventually be cached. If it runs rarely, consider a covering index so the query can be answered entirely from the index without touching the heap.
Strong Answer:
  • The CAP theorem states that in a distributed system, during a network partition, you must choose between consistency (every read returns the most recent write) and availability (every request receives a response). You cannot have both simultaneously when the network is partitioned.
  • The key nuance that most people miss: CAP only applies during a partition. When the network is healthy, you can have both consistency and availability. The real question is: “When the network inevitably partitions, which property do you sacrifice?”
  • What I would tell the junior engineer: “You can absolutely have strong consistency and high availability — as long as you run on a single node. The moment you distribute across nodes for fault tolerance, network partitions become inevitable, and you must choose.” Then I would ask them what their system does. If it is a banking ledger, they need consistency (CP) — better to reject a transaction than process it twice. If it is a social media feed, they need availability (AP) — showing a slightly stale feed is better than showing an error page.
  • In practice, most modern systems do not make a binary CP/AP choice. They use tunable consistency — strong consistency for critical paths (checkout, payment) and eventual consistency for less critical paths (recommendation updates, analytics). DynamoDB lets you choose per-read: strongly consistent reads cost 2x the capacity but guarantee freshness. Most teams use eventually consistent reads for 95% of operations and pay for strong reads only where correctness demands it.
  • I would also mention that the PACELC theorem extends CAP: even when the network is operating normally (no Partition), you still face a trade-off between Latency and Consistency. Strong consistency requires coordination (consensus protocols, quorum writes), which adds latency. This is the real everyday trade-off — partitions are rare, but the latency-consistency trade-off affects every single request.
Follow-up: Your system uses eventual consistency, and a customer sees their order as “confirmed” on one page but “not found” on another. How do you handle this?This is a classic read-your-writes consistency violation. The write went to the primary, but the read hit a replica that has not caught up yet. Several solutions: (1) Read-your-writes consistency — after a user performs a write, pin their subsequent reads to the primary for a short window (5-10 seconds). This is the simplest fix and works for most cases. (2) Include a version token or timestamp in the write response and pass it with subsequent reads. The read handler checks if the replica has caught up to that version; if not, it falls back to the primary. DynamoDB and Cassandra support this pattern. (3) Client-side optimistic update — the frontend shows the confirmed order immediately from local state without waiting for the backend round-trip, while the backend processes asynchronously. Even if the backend read is stale, the user sees the correct state. (4) For truly critical flows like order confirmation, bypass the replica entirely and always read from the primary. The latency cost is minimal for operations that happen a few times per user session.
Strong Answer:
  • This is almost certainly a race condition. The symptoms — intermittent, load-dependent, non-reproducible in isolation — are the classic fingerprint. Under heavy load, the thread scheduler runs faster context switches, which makes the timing window for the race condition more likely to be hit.
  • Step 1: I would identify the shared mutable state. Race conditions require shared state that at least one thread is writing. I would audit the code for global variables, class-level variables, shared caches, or mutable objects passed between threads. In Python, even a simple counter += 1 is not atomic — it is a read-modify-write sequence that can be interleaved.
  • Step 2: I would add thread-safety annotations or use a race condition detector. For Java, ThreadSanitizer (TSan) can instrument the code to detect data races at runtime. For Python, I would use the threading module’s logging to trace lock acquisition order and identify unprotected critical sections.
  • Step 3: If the race condition is hard to reproduce, I would add deliberate delays (strategic time.sleep(0.001) calls) around suspected critical sections to widen the timing window and make the bug more reproducible. This is counterintuitive — making the code slower to make the bug more frequent — but it is an effective technique.
  • Step 4: Fix by protecting the critical section with the appropriate synchronization primitive. A Lock/Mutex for simple mutual exclusion. A ReadWriteLock if reads vastly outnumber writes and you want to allow concurrent reads. An atomic operation or CAS (Compare-And-Swap) if the shared state is a single value and you want lock-free performance.
  • Step 5: Verify the fix under the same heavy load conditions that triggered the original bug. Run the load test for 10x longer than the typical reproduction time to increase confidence. Race conditions are probabilistic — a short test passing does not prove the fix works.
Follow-up: You suspect a deadlock rather than a race condition. How would you confirm this and what is your fix?Deadlocks have a different signature: instead of incorrect results, threads hang indefinitely. In Java, I would take a thread dump (jstack <pid> or kill -3 <pid>) which explicitly shows deadlocks — it will say “Found one Java-level deadlock” and show the lock cycle. In Python, I would use faulthandler.dump_traceback() or attach a debugger. The thread dump shows which locks each thread holds and which lock it is waiting for, revealing the circular dependency. The fix depends on the cause: if two threads acquire locks A and B in opposite order, I enforce a global lock ordering — every thread acquires A before B, everywhere in the codebase. This breaks the circular-wait Coffman condition. If lock ordering is impractical (locks are dynamically determined), I switch to try-lock with timeout: attempt to acquire the lock for N seconds, and if it fails, release all held locks, back off with a randomized delay, and retry. This breaks the no-preemption condition.
Strong Answer:
  • TCP guarantees reliability through four mechanisms working together: (1) Sequence numbers — every byte in the stream gets a sequence number, so the receiver can detect missing or out-of-order data and reassemble it correctly. (2) Acknowledgments — the receiver sends ACKs confirming which bytes it has received. If the sender does not receive an ACK within a timeout, it retransmits the data. (3) Checksums — every segment includes a checksum that detects corruption in transit. Corrupted segments are silently discarded and the retransmission mechanism handles recovery. (4) Flow control via the sliding window — the receiver advertises how much buffer space it has, preventing the sender from overwhelming a slow receiver.
  • TCP also handles congestion control: algorithms like TCP Reno, CUBIC, and BBR detect network congestion (packet loss, increasing RTT) and reduce the sending rate to avoid making the congestion worse. This is why a large file download starts slow and speeds up — TCP is probing the network capacity.
  • I would choose UDP for real-time multiplayer gaming. In a game running at 60 FPS, you send player position updates 60 times per second. If packet 45 is lost, by the time TCP retransmits it, packets 46-50 have already arrived with newer position data. The retransmitted packet 45 is stale and useless — the player has moved since then. UDP lets you simply drop the lost packet and process the next one. You implement your own “good enough” reliability: you send the current state with every packet so any single packet contains enough information to reconstruct the game state.
  • Other strong UDP use cases: live video streaming (a dropped frame is better than a frozen stream), DNS lookups (a single request-response that fits in one packet — TCP handshake overhead is wasteful), and VoIP (a brief audio glitch is better than a pause while TCP retransmits). In all these cases, the application can tolerate data loss but cannot tolerate the latency that TCP’s retransmission adds.
Follow-up: HTTP/3 uses QUIC, which is built on UDP. Does that mean UDP is becoming the new standard for reliable communication?Not exactly. QUIC reimplements reliability ON TOP of UDP — it has its own sequence numbers, ACKs, and retransmission logic. The reason QUIC uses UDP instead of building on TCP is to solve TCP’s head-of-line blocking problem. In HTTP/2 over TCP, if one packet in the TCP stream is lost, ALL multiplexed HTTP streams on that connection are blocked while TCP retransmits that single packet — even if the lost packet belongs to stream 3 and streams 1, 2, and 4 are complete. QUIC implements independent streams at the transport layer, so a lost packet for stream 3 only blocks stream 3. QUIC also has faster connection establishment — 0-RTT handshake in the common case versus TCP+TLS’s 2-3 round trips. So QUIC is not “UDP replacing TCP” — it is a new reliable protocol that happens to use UDP as its transport because deploying a new transport protocol through the internet’s existing middleboxes (NATs, firewalls) is nearly impossible, but UDP passes through everything.