
Overview

Understanding computer science fundamentals helps you write better code, debug efficiently, and make informed architectural decisions. These concepts form the foundation for system design, performance optimization, and debugging complex issues.
Why This Matters: In interviews at top tech companies, 40-60% of system design questions require deep understanding of these fundamentals.

Operating Systems

Process vs Thread

Process

  • Independent execution unit
  • Own memory space (isolated)
  • Higher overhead to create (~10ms)
  • Inter-process communication needed
  • Crash isolation (one crash doesn’t affect others)

Thread

  • Lightweight execution unit
  • Shared memory with parent process
  • Lower overhead to create (~1ms)
  • Direct memory sharing
  • One thread crash can crash entire process

When to Use Processes vs Threads

| Scenario         | Use Process            | Use Thread               |
|------------------|------------------------|--------------------------|
| Isolation needed | ✅ Browser tabs        |                          |
| Shared state     |                        | ✅ Web server workers    |
| CPU-bound tasks  | ✅ Parallel processing | ✅ (with limitations)    |
| I/O-bound tasks  |                        | ✅ Database connections  |
| Fault tolerance  | ✅ Microservices       |                          |

# Process example (Python)
from multiprocessing import Process

def cpu_intensive_task(n):
    return sum(i * i for i in range(n))

# Creates a separate memory space.
# The __main__ guard is required on platforms that spawn new processes (Windows, macOS).
if __name__ == "__main__":
    p = Process(target=cpu_intensive_task, args=(10000000,))
    p.start()
    p.join()

# Thread example
import requests  # third-party, needed by the I/O-bound task below
from threading import Thread

def io_bound_task(url):
    response = requests.get(url)
    return response.text

# Shares memory with main thread
t = Thread(target=io_bound_task, args=("https://api.example.com",))
t.start()
t.join()

Memory Management

┌─────────────────────────────────────┐
│           Virtual Memory            │
├─────────────────────────────────────┤
│  Stack (grows down)                 │
│    ↓  Local variables, function     │
│       calls, return addresses       │
│                                     │
│  ... (free space) ...               │
│                                     │
│  Heap (grows up)                    │
│    ↑  Dynamically allocated         │
│       memory (malloc, new)          │
├─────────────────────────────────────┤
│  BSS (uninitialized data)           │
│    Global/static vars (zero-init)   │
├─────────────────────────────────────┤
│  Data (initialized data)            │
│    Global/static vars with values   │
├─────────────────────────────────────┤
│  Text (code)                        │
│    Executable instructions          │
└─────────────────────────────────────┘

Stack vs Heap Memory

| Aspect        | Stack                     | Heap                            |
|---------------|---------------------------|---------------------------------|
| Allocation    | Automatic (LIFO)          | Manual (malloc/new)             |
| Speed         | Very fast                 | Slower (fragmentation)          |
| Size          | Limited (~1-8 MB)         | Limited by RAM                  |
| Scope         | Function-local            | Global access via pointers      |
| Thread Safety | Each thread has own stack | Shared, needs synchronization   |
| Memory Errors | Stack overflow            | Memory leaks, dangling pointers |

#include <stdlib.h>  // malloc, free

// Stack allocation - fast, automatic cleanup
void stackExample() {
    int x = 10;           // Stack
    int arr[100];         // Stack (fixed size)
}  // Memory freed when function returns

// Heap allocation - flexible, manual cleanup
void heapExample() {
    int* ptr = malloc(sizeof(int) * 1000);  // Heap
    // ... use ptr ...
    free(ptr);  // Must manually free!
}

Concurrency Fundamentals

Deadlock Conditions (All 4 must be present)

  1. Mutual Exclusion: Resource can’t be shared
  2. Hold and Wait: Process holds resource while waiting for another
  3. No Preemption: Can’t force process to release resource
  4. Circular Wait: Circular chain of processes waiting
# Deadlock example
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def thread_1():
    lock_a.acquire()
    time.sleep(0.1)  # Increases chance of deadlock
    lock_b.acquire()  # Waits forever if thread_2 has lock_b
    # ...
    
def thread_2():
    lock_b.acquire()
    time.sleep(0.1)
    lock_a.acquire()  # Waits forever if thread_1 has lock_a
    # ...

# Prevention: Always acquire locks in same order
def thread_safe():
    lock_a.acquire()  # Always acquire A first
    lock_b.acquire()  # Then B
    # ...
    lock_b.release()
    lock_a.release()
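
Lock ordering is the simplest fix. Another common strategy is a timed try-lock with back-off: if the second lock cannot be taken quickly, release everything and retry. A minimal sketch (the function name, timeout, and jitter values are illustrative):

# Alternative prevention: timed try-lock with back-off
import random
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def careful_worker():
    while True:
        with lock_a:
            # Try the second lock, but refuse to wait forever
            if lock_b.acquire(timeout=0.05):
                try:
                    pass  # ... critical section using both resources ...
                finally:
                    lock_b.release()
                return
        # Could not get lock_b: the 'with' block released lock_a; retry
        time.sleep(random.uniform(0, 0.01))  # jitter reduces livelock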

Race Condition

# Race condition example
import threading

counter = 0

def increment():
    global counter
    for _ in range(100000):
        counter += 1  # NOT atomic! Read-modify-write

# Two threads, expected: 200000, actual: random lower number

# Fix with Lock
lock = threading.Lock()

def safe_increment():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1  # Now atomic

Key Concepts

| Concept           | Description                        | Interview Relevance |
|-------------------|------------------------------------|---------------------|
| Deadlock          | Circular wait for resources        | High                |
| Race Condition    | Unpredictable behavior from timing | Very High           |
| Mutex             | Binary lock (0 or 1)               | High                |
| Semaphore         | Counter-based lock (0 to N)        | High                |
| Virtual Memory    | Abstraction over physical memory   | Medium              |
| Context Switching | Saving/restoring process state     | Medium              |
| Page Fault        | Accessing memory not in RAM        | Medium              |
| Thrashing         | Excessive paging, system slowdown  | Medium              |
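
The mutex and semaphore rows map directly onto threading.Lock and threading.Semaphore. A minimal sketch of a semaphore capping how many threads use a limited resource at once (the worker function and the limit of 3 are illustrative):

# Semaphore: at most N threads inside the guarded block at a time
import threading
import time

pool = threading.Semaphore(3)   # e.g. only 3 "connections" allowed

def worker(n):
    with pool:                  # acquire; blocks if 3 threads are already inside
        time.sleep(0.1)         # simulate using the limited resource
        print(f"worker {n} done")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()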

Process Scheduling Algorithms

| Algorithm        | Description                   | Pros                 | Cons                   |
|------------------|-------------------------------|----------------------|------------------------|
| FCFS             | First Come First Served       | Simple               | Convoy effect          |
| SJF              | Shortest Job First            | Optimal avg wait     | Starvation possible    |
| Round Robin      | Time quantum rotation         | Fair                 | High context switching |
| Priority         | Higher priority first         | Important tasks fast | Starvation             |
| Multilevel Queue | Multiple queues with priority | Flexible             | Complex                |
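
To make the round-robin trade-off concrete, here is a toy simulation (job names, burst times, and the 2-unit quantum are made up): the rotation is fair, but every quantum boundary costs a context switch.

# Toy round-robin scheduler: each job gets a fixed time quantum per turn
from collections import deque

QUANTUM = 2
ready = deque([("A", 5), ("B", 3), ("C", 8)])   # (job, remaining time units)

clock = 0
while ready:
    job, remaining = ready.popleft()
    run = min(QUANTUM, remaining)
    clock += run                                # a context switch happens here
    if remaining - run > 0:
        ready.append((job, remaining - run))    # unfinished job goes to the back
    else:
        print(f"{job} finished at t={clock}")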

Networking

OSI Model (Simplified)

Layer 7: Application  (HTTP, FTP, DNS, SMTP)
        └── What the user interacts with
        
Layer 4: Transport    (TCP, UDP)
        └── End-to-end communication, ports
        
Layer 3: Network      (IP, ICMP, Routing)
        └── Logical addressing, routing between networks
        
Layer 2: Data Link    (Ethernet, MAC, Switches)
        └── Physical addressing, frame transmission
        
Layer 1: Physical     (Cables, Signals, Hubs)
        └── Raw bit transmission

TCP Three-Way Handshake

Client                    Server
  │                         │
  │───── SYN (seq=x) ──────►│  1. Client initiates
  │                         │
  │◄── SYN-ACK (seq=y,      │  2. Server acknowledges
  │     ack=x+1) ───────────│     and sends its own SYN
  │                         │
  │───── ACK (ack=y+1) ────►│  3. Client acknowledges
  │                         │
  │◄═══ Connection ════════►│  4. Ready to transfer data
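
You never send SYN or ACK yourself; the OS completes the handshake inside connect() and accept(). A minimal client sketch with Python's standard socket module (example.com and port 80 are placeholders):

# TCP client: the three-way handshake happens inside create_connection()
import socket

with socket.create_connection(("example.com", 80), timeout=5) as sock:
    sock.sendall(b"HEAD / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
    print(sock.recv(1024).decode(errors="replace"))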

TCP vs UDP

TCP

  • Connection-oriented (handshake)
  • Reliable delivery (ACKs, retransmission)
  • Ordered packets (sequence numbers)
  • Flow control (sliding window)
  • Congestion control
  • Use: Web, Email, File transfer, SSH

UDP

  • Connectionless (no handshake)
  • Best-effort delivery (no guarantees)
  • No ordering guarantee
  • No flow/congestion control
  • Lower latency, less overhead
  • Use: Gaming, Streaming, DNS, VoIP
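
The contrast shows up directly in the socket API: UDP needs no connection setup, you just send datagrams and hope they arrive. A small self-contained sketch using the loopback interface:

# UDP: connectionless - no handshake, no delivery guarantee
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # OS picks a free port
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", addr)             # fire and forget

data, src = receiver.recvfrom(1024)
print(data, "from", src)

sender.close()
receiver.close()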

DNS Resolution Process

1. Browser Cache      → Check local browser cache
2. OS Cache           → Check operating system DNS cache
3. Resolver Cache     → ISP's recursive resolver cache
4. Root Server        → "Who handles .com?"
5. TLD Server         → "Who handles example.com?"
6. Authoritative NS   → "example.com = 93.184.216.34"
# Trace DNS resolution
nslookup example.com
dig example.com +trace

HTTP/1.1 vs HTTP/2 vs HTTP/3

| Feature               | HTTP/1.1     | HTTP/2                  | HTTP/3             |
|-----------------------|--------------|-------------------------|--------------------|
| Connection            | Multiple TCP | Single TCP, multiplexed | QUIC (UDP-based)   |
| Head-of-line blocking | Yes          | At TCP level            | No                 |
| Header compression    | No           | HPACK                   | QPACK              |
| Server push           | No           | Yes                     | Yes                |
| Encryption            | Optional     | Practically required    | Required (TLS 1.3) |

HTTP Methods & Status Codes

# Common HTTP Methods
GET     # Retrieve resource (idempotent, cacheable)
POST    # Create resource (not idempotent)
PUT     # Update/Replace resource (idempotent)
PATCH   # Partial update (not necessarily idempotent)
DELETE  # Remove resource (idempotent)
HEAD    # GET without body (check if resource exists)
OPTIONS # Supported methods (CORS preflight)

# Status Code Categories
1xx  # Informational (100 Continue)
2xx  # Success (200 OK, 201 Created, 204 No Content)
3xx  # Redirection (301 Moved Permanently, 302 Found, 304 Not Modified)
4xx  # Client Error (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests)
5xx  # Server Error (500 Internal, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout)
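
A quick way to see methods and status codes in practice, using the standard-library http.client (example.com is a placeholder host):

# Issue a HEAD and a GET and inspect the status codes
import http.client

conn = http.client.HTTPSConnection("example.com", timeout=5)

conn.request("HEAD", "/")            # headers only - cheap existence check
resp = conn.getresponse()
print(resp.status, resp.reason)      # e.g. 200 OK
resp.read()                          # drain so the connection can be reused

conn.request("GET", "/")
resp = conn.getresponse()
print(resp.status, len(resp.read()), "bytes")

conn.close()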

WebSocket vs HTTP

HTTP: Request-Response (Half-duplex)
Client ───Request──► Server
Client ◄──Response── Server
Client ───Request──► Server
Client ◄──Response── Server

WebSocket: Full-duplex, persistent connection
Client ◄═════════════► Server
       │ Real-time    │
       │ bidirectional│
       │ messages     │
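
With the third-party websockets package (assumed installed; the URL below is a placeholder, not a real endpoint), the persistent full-duplex connection looks like this:

# WebSocket sketch: one connection, messages flow both ways at any time
# Requires: pip install websockets
import asyncio
import websockets

async def main():
    async with websockets.connect("wss://example.com/socket") as ws:
        await ws.send("hello")      # client can push at any time...
        reply = await ws.recv()     # ...and receive server pushes too
        print(reply)

asyncio.run(main())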

Database Fundamentals

ACID Properties

| Property    | Description                             | Example                                              |
|-------------|-----------------------------------------|------------------------------------------------------|
| Atomicity   | All or nothing                          | Bank transfer: debit AND credit succeed or both fail |
| Consistency | Valid state transitions                 | Balance can’t be negative, referential integrity     |
| Isolation   | Concurrent transactions don’t interfere | Read committed, serializable                         |
| Durability  | Committed data persists                 | Write-ahead logging, survives crashes                |
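
Atomicity is easy to demonstrate with the built-in sqlite3 module (the table and amounts are invented): either both UPDATEs commit, or an error rolls both back.

# Atomicity: the two UPDATEs succeed together or not at all
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on exception
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'bob'")
except sqlite3.Error:
    print("transfer failed - nothing was applied")

print(conn.execute("SELECT * FROM accounts ORDER BY name").fetchall())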

Transaction Isolation Levels

| Level            | Dirty Read   | Non-Repeatable Read | Phantom Read | Performance |
|------------------|--------------|---------------------|--------------|-------------|
| Read Uncommitted | ✅ Possible  | ✅ Possible         | ✅ Possible  | Fastest     |
| Read Committed   | ❌ Prevented | ✅ Possible         | ✅ Possible  | Fast        |
| Repeatable Read  | ❌ Prevented | ❌ Prevented        | ✅ Possible  | Medium      |
| Serializable     | ❌ Prevented | ❌ Prevented        | ❌ Prevented | Slowest     |

-- Set isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;
-- Your queries here
COMMIT;

SQL vs NoSQL

SQL (Relational)

  • Structured schema (tables, rows, columns)
  • ACID compliant
  • Complex queries (JOINs, aggregations)
  • Vertical scaling (scale up)
  • Examples: PostgreSQL, MySQL, Oracle
  • Use: Transactions, Analytics, Complex relationships

NoSQL

  • Flexible schema (documents, key-value, graphs)
  • BASE (Eventually consistent)
  • Simple queries (by key/ID)
  • Horizontal scaling (scale out)
  • Examples: MongoDB, Redis, Cassandra, Neo4j
  • Use: Real-time, Big Data, Caching, Social graphs

NoSQL Types

| Type          | Data Model     | Use Case                         | Example               |
|---------------|----------------|----------------------------------|-----------------------|
| Key-Value     | Key → Value    | Caching, Sessions                | Redis, DynamoDB       |
| Document      | JSON documents | Content management, Catalogs     | MongoDB, Couchbase    |
| Column-Family | Wide columns   | Time-series, Analytics           | Cassandra, HBase      |
| Graph         | Nodes + Edges  | Social networks, Recommendations | Neo4j, Amazon Neptune |
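
The key-value model is simple enough to sketch in a few lines. This in-memory cache with per-entry TTL mimics Redis-style SET/GET with expiry (all names here are illustrative):

# Tiny key-value store with per-entry TTL - Redis in miniature
import time

class KVCache:
    def __init__(self):
        self._data = {}                       # key -> (value, expires_at)

    def set(self, key, value, ttl=60):
        self._data[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:     # lazily drop stale entries
            del self._data[key]
            return None
        return value

cache = KVCache()
cache.set("session:42", {"user": "alice"}, ttl=5)
print(cache.get("session:42"))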

Indexing Deep Dive

-- B-Tree Index (default for most DBs)
-- Good for: range queries, sorting, equality, prefix matching
CREATE INDEX idx_user_email ON users(email);

-- Hash Index
-- Good for: equality comparisons ONLY (not range queries)
CREATE INDEX idx_user_id ON users USING HASH(id);

-- Composite Index (order matters!)
-- Follows "leftmost prefix" rule
CREATE INDEX idx_orders ON orders(user_id, status, created_at);
-- ✅ Uses index: WHERE user_id = 1
-- ✅ Uses index: WHERE user_id = 1 AND status = 'pending'
-- ✅ Uses index: WHERE user_id = 1 AND status = 'pending' AND created_at > '2024-01-01'
-- ❌ Skips index: WHERE status = 'pending' (missing leftmost column)
-- ❌ Skips index: WHERE user_id = 1 AND created_at > '2024-01-01' (gap in middle)

-- Covering Index (includes all needed columns)
CREATE INDEX idx_users_covering ON users(email) INCLUDE (name, created_at);
-- Query can be answered entirely from index, no table lookup needed

-- Partial Index (index subset of rows)
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';

When NOT to Use Indexes

  • High write tables: Each index slows down INSERT/UPDATE/DELETE
  • Small tables: Full table scan may be faster
  • Low cardinality columns: boolean, status with few values
  • Frequently updated columns: Index maintenance overhead

Database Normalization

| Normal Form | Rule                                 | Example Violation                        |
|-------------|--------------------------------------|------------------------------------------|
| 1NF         | No repeating groups, atomic values   | phone: "123,456,789"                     |
| 2NF         | No partial dependencies              | Non-key depends on part of composite key |
| 3NF         | No transitive dependencies           | zip → city, state                        |
| BCNF        | Every determinant is a candidate key | Stricter form of 3NF                     |

-- Denormalized (violation of 1NF)
CREATE TABLE orders (
    id INT,
    products VARCHAR(255)  -- "laptop,mouse,keyboard"
);

-- Normalized
CREATE TABLE orders (id INT PRIMARY KEY);
CREATE TABLE order_items (
    order_id INT REFERENCES orders(id),
    product_id INT REFERENCES products(id),
    quantity INT
);

Computer Architecture

CPU Cache Hierarchy

┌─────────────────────────────────────────────┐
│                   CPU                       │
│  ┌─────────────────────────────────────┐   │
│  │           Core                      │   │
│  │  ┌─────────────────────────────┐   │   │
│  │  │   L1 Cache (32-64KB)       │   │   │  ~1ns
│  │  │   Fastest, per-core        │   │   │
│  │  └─────────────────────────────┘   │   │
│  │  ┌─────────────────────────────┐   │   │
│  │  │   L2 Cache (256KB-1MB)     │   │   │  ~3-10ns
│  │  │   Per-core                 │   │   │
│  │  └─────────────────────────────┘   │   │
│  └─────────────────────────────────────┘   │
│  ┌─────────────────────────────────────┐   │
│  │   L3 Cache (8-64MB)                │   │  ~10-40ns
│  │   Shared across cores              │   │
│  └─────────────────────────────────────┘   │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   RAM (8-128GB+)                           │  ~100ns
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   SSD (~1TB)                               │  ~100μs
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   HDD (~10TB)                              │  ~10ms
└─────────────────────────────────────────────┘
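
You can get a rough feel for locality even from Python: summing a large 2-D list row by row (sequential access) versus column by column (strided access). The effect is far starker in C or NumPy, but plain lists usually show it, and absolute times depend on the machine.

# Cache locality: row-major traversal vs column-major traversal
import time

N = 2000
grid = [[1] * N for _ in range(N)]

start = time.perf_counter()
total = sum(grid[i][j] for i in range(N) for j in range(N))   # row-major
row_major = time.perf_counter() - start

start = time.perf_counter()
total = sum(grid[i][j] for j in range(N) for i in range(N))   # column-major
col_major = time.perf_counter() - start

print(f"row-major:    {row_major:.3f}s")
print(f"column-major: {col_major:.3f}s  (usually slower - worse locality)")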

Latency Numbers Every Programmer Should Know

| Operation                         | Time                    |
|-----------------------------------|-------------------------|
| L1 cache reference                | 0.5 ns                  |
| Branch mispredict                 | 5 ns                    |
| L2 cache reference                | 7 ns                    |
| Mutex lock/unlock                 | 25 ns                   |
| Main memory reference             | 100 ns                  |
| Compress 1KB with Zippy           | 3,000 ns (3 μs)         |
| Send 1KB over 1 Gbps network      | 10,000 ns (10 μs)       |
| Read 4KB randomly from SSD        | 150,000 ns (150 μs)     |
| Round trip within datacenter      | 500,000 ns (0.5 ms)     |
| Read 1 MB sequentially from SSD   | 1,000,000 ns (1 ms)     |
| Disk seek                         | 10,000,000 ns (10 ms)   |
| Read 1 MB sequentially from disk  | 20,000,000 ns (20 ms)   |
| Send packet CA → Netherlands → CA | 150,000,000 ns (150 ms) |

Distributed Systems Basics

CAP Theorem

In a distributed system, you can only guarantee 2 of 3: Consistency, Availability, Partition Tolerance
              Consistency
                 /\
                /  \
           CA  /    \  CP
              /      \
             /________\
    Availability  AP  Partition Tolerance

| Choice | Trade-off                                 | Examples                        |
|--------|-------------------------------------------|---------------------------------|
| CP     | Sacrifice availability for consistency    | MongoDB, HBase, Redis Cluster   |
| AP     | Sacrifice consistency for availability    | Cassandra, DynamoDB, CouchDB    |
| CA     | Not practical (network partitions happen) | Traditional RDBMS (single node) |

Consistency Models

| Model            | Description                            | Use Case             |
|------------------|----------------------------------------|----------------------|
| Strong           | All reads see latest write             | Banking, Inventory   |
| Eventual         | Reads eventually see latest write      | Social media feeds   |
| Causal           | Respects cause-effect ordering         | Messaging apps       |
| Read-your-writes | User sees their own writes immediately | User profile updates |

Practice Questions

Q: What happens when you type a URL into the browser and press Enter?

  1. Browser checks caches: Browser cache, OS cache, router cache
  2. DNS resolution: Resolve domain to IP address
    • Check local DNS cache
    • Query recursive DNS resolver
    • Query root → TLD → authoritative nameserver
  3. TCP connection: 3-way handshake (SYN → SYN-ACK → ACK)
  4. TLS handshake (if HTTPS):
    • Client Hello (supported ciphers)
    • Server Hello + Certificate
    • Key exchange
    • Encrypted connection established
  5. HTTP request sent: GET / HTTP/1.1 with headers
  6. Server processes request:
    • Load balancer routes to server
    • Server processes and queries database
    • Generates response
  7. HTTP response received: HTML content with status code
  8. Browser renders:
    • Parse HTML → DOM tree
    • Parse CSS → CSSOM
    • Execute JavaScript
    • Render pixels to screen

Q: How does a database index speed up queries?

Indexes are data structures (usually B-trees) that maintain sorted references to rows.
B-Tree Structure:
  • Balanced tree with sorted keys
  • Each node can have multiple children
  • Leaf nodes contain pointers to actual rows
How lookup works:
  • Instead of scanning all rows O(n)
  • Binary search through tree O(log n)
  • Follow pointers to find matching rows
Trade-offs:
  • Faster reads (logarithmic lookup)
  • Slower writes (must update index)
  • Additional storage space
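
That logarithmic lookup is easy to see with Python's bisect module, which does the same binary search an index does (the data here is synthetic):

# Index lookup as binary search: ~20 comparisons instead of scanning a million rows
import bisect

emails = sorted(f"user{i}@example.com" for i in range(1_000_000))   # the "index"
target = "user742000@example.com"

found_scan = target in emails                      # full scan: O(n)

pos = bisect.bisect_left(emails, target)           # binary search: O(log n)
found_index = pos < len(emails) and emails[pos] == target

print(found_scan, found_index)                     # True True
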
Q: What is a deadlock, and how do you prevent one?

Deadlock occurs when all four Coffman conditions are met:
  1. Mutual exclusion - resource can’t be shared
  2. Hold and wait - holding one resource, waiting for another
  3. No preemption - can’t force release
  4. Circular wait - A waits for B, B waits for A
Prevention strategies:
  • Lock ordering: Always acquire locks in same global order
  • Lock timeout: Give up after waiting too long
  • Deadlock detection: Detect and kill one transaction
  • Try-lock: Non-blocking attempt, back off if fails
  • Avoid nested locks: Minimize lock scope

Q: What is the difference between a process and a thread?

Process:
  • Independent execution unit with own memory space
  • Isolated - crash doesn’t affect other processes
  • Higher creation/switching overhead (~10ms)
  • Communication via IPC (pipes, sockets, shared memory)
Thread:
  • Lightweight execution unit within a process
  • Shared memory with parent and sibling threads
  • Lower creation/switching overhead (~1ms)
  • Direct memory sharing (but needs synchronization)
When to use which:
  • Processes: Isolation needed, fault tolerance, security
  • Threads: Shared state, I/O-bound tasks, lower overhead needed

Q: How do TCP and UDP differ, and when would you use each?

TCP (Transmission Control Protocol):
  • Connection-oriented (handshake required)
  • Reliable (acknowledgments, retransmission)
  • Ordered (sequence numbers)
  • Flow control (sliding window)
  • Congestion control
  • Use: Web, email, file transfer, SSH
UDP (User Datagram Protocol):
  • Connectionless (no handshake)
  • Unreliable (no guarantees)
  • Unordered (packets may arrive out of order)
  • No flow/congestion control
  • Lower latency, less overhead
  • Use: Gaming, video streaming, DNS, VoIP
Trade-off: Reliability vs Speed

Q: Explain the transaction isolation levels.

Read Uncommitted: Can see uncommitted changes (dirty reads)
Read Committed: Only see committed changes, but the same query may return different results
Repeatable Read: Same query returns the same results within a transaction, but new rows may appear (phantoms)
Serializable: Transactions execute as if sequential - no anomalies, but slowest
Problems prevented by level:
  • Dirty read: See uncommitted data
  • Non-repeatable read: Same query, different results
  • Phantom read: New rows appear matching query

Quick Reference Card

Numbers to Remember

| Metric              | Value       |
|---------------------|-------------|
| L1 cache            | ~1 ns       |
| L2 cache            | ~10 ns      |
| RAM                 | ~100 ns     |
| SSD random read     | ~150 μs     |
| HDD seek            | ~10 ms      |
| Same datacenter RTT | ~0.5 ms     |
| Cross-region RTT    | ~100-150 ms |

Common Ports

| Port  | Service    |
|-------|------------|
| 22    | SSH        |
| 80    | HTTP       |
| 443   | HTTPS      |
| 3306  | MySQL      |
| 5432  | PostgreSQL |
| 6379  | Redis      |
| 27017 | MongoDB    |

Interview Tip: Be ready to explain these concepts with real-world examples. Interviewers love it when you can relate theory to practical scenarios. Use the “explain like I’m 5” technique to show deep understanding.