Computer Science Fundamentals

Overview

Understanding computer science fundamentals helps you write better code, debug efficiently, and make informed architectural decisions. These concepts form the foundation for system design, performance optimization, and debugging complex issues.

Why This Matters: In interviews at top tech companies, 40-60% of system design questions require deep understanding of these fundamentals.

Operating Systems

Process vs Thread

Process

Independent execution unit
Own memory space (isolated)
Higher overhead to create (up to milliseconds, depending on the OS and process size)
Inter-process communication needed
Crash isolation (one crash doesn’t affect others)

Thread

Lightweight execution unit
Shared memory with parent process
Lower overhead to create (typically tens of microseconds on modern systems)
Direct memory sharing
One thread crash can crash entire process

When to Use Processes vs Threads

Scenario	Use Process	Use Thread
Isolation needed	✅ Browser tabs	❌
Shared state	❌	✅ Web server workers
CPU-bound tasks	✅ Parallel processing	✅ (with limitations)
I/O-bound tasks	❌	✅ Database connections
Fault tolerance	✅ Microservices	❌

# Process example (Python)
from multiprocessing import Process

def cpu_intensive_task(n):
    # Process discards the return value; use a Queue or Pool to get it back
    return sum(i * i for i in range(n))

if __name__ == "__main__":  # Required where child processes are spawned (Windows, macOS)
    # Creates separate memory space
    p = Process(target=cpu_intensive_task, args=(10000000,))
    p.start()
    p.join()

# Thread example
from threading import Thread

import requests  # Third-party package: pip install requests

def io_bound_task(url):
    response = requests.get(url, timeout=10)
    return response.text

# Shares memory with main thread
t = Thread(target=io_bound_task, args=("https://api.example.com",))
t.start()
t.join()
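
A `Thread` discards whatever its target returns, so the text fetched by `io_bound_task` above is lost. One common way to collect results is a thread pool. The following is a minimal sketch, not the only approach: `fake_fetch` and the URLs are hypothetical, and the network call is simulated with `time.sleep` so the example runs offline.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_fetch(url):
    """Hypothetical stand-in for an HTTP request: sleeps instead of doing I/O."""
    time.sleep(0.05)  # While one thread waits, the others can run
    return f"response from {url}"

# Illustrative URLs, not real endpoints
urls = [f"https://api.example.com/items/{i}" for i in range(5)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    # map() yields results in the same order as the inputs
    results = list(pool.map(fake_fetch, urls))
elapsed = time.perf_counter() - start

print(results[0])
print(f"{len(results)} simulated requests in {elapsed:.2f}s")
```

Because the waits overlap, the total time is usually close to one request's latency rather than the sum of all five.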

Memory Management

┌─────────────────────────────────────┐
│           Virtual Memory            │
├─────────────────────────────────────┤
│  Stack (grows down)                 │
│    ↓  Local variables, function    │
│       calls, return addresses       │
│  ... (free space) ...               │
│    ↑  Dynamically allocated        │
│       memory (malloc, new)          │
│  Heap (grows up)                    │
├─────────────────────────────────────┤
│  BSS (uninitialized data)           │
│    Global/static vars (zero-init)   │
├─────────────────────────────────────┤
│  Data (initialized data)            │
│    Global/static vars with values   │
├─────────────────────────────────────┤
│  Text (code)                        │
│    Executable instructions          │
└─────────────────────────────────────┘

Stack vs Heap Memory

Aspect	Stack	Heap
Allocation	Automatic (LIFO)	Manual (malloc/new)
Speed	Very fast	Slower (fragmentation)
Size	Limited (~1-8 MB)	Limited by RAM
Scope	Function-local	Global access via pointers
Thread Safety	Each thread has own stack	Shared, needs synchronization
Memory Errors	Stack overflow	Memory leaks, dangling pointers

#include <stdlib.h>  // malloc, free

// Stack allocation - fast, automatic cleanup
void stackExample() {
    int x = 10;           // Stack
    int arr[100];         // Stack (fixed size)
}  // Memory freed when function returns

// Heap allocation - flexible, manual cleanup
void heapExample() {
    int* ptr = malloc(sizeof(int) * 1000);  // Heap
    if (ptr == NULL) return;                // malloc can fail
    // ... use ptr ...
    free(ptr);  // Must manually free!
}

Concurrency Fundamentals

Deadlock Conditions (All 4 must be present)

Mutual Exclusion: Resource can’t be shared
Hold and Wait: Process holds resource while waiting for another
No Preemption: Can’t force process to release resource
Circular Wait: Circular chain of processes waiting

# Deadlock example
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def thread_1():
    lock_a.acquire()
    time.sleep(0.1)  # Increases chance of deadlock
    lock_b.acquire()  # Waits forever if thread_2 has lock_b
    # ...
    
def thread_2():
    lock_b.acquire()
    time.sleep(0.1)
    lock_a.acquire()  # Waits forever if thread_1 has lock_a
    # ...

# Prevention: Always acquire locks in same order
def thread_safe():
    lock_a.acquire()  # Always acquire A first
    lock_b.acquire()  # Then B
    # ...
    lock_b.release()
    lock_a.release()

Race Condition

# Race condition example
import threading

counter = 0

def increment():
    global counter
    for _ in range(100000):
        counter += 1  # NOT atomic! Read-modify-write

# Two threads running increment(): expected 200000, actual may be lower
# (lost updates depend on timing and on the interpreter version)

# Fix with Lock
lock = threading.Lock()

def safe_increment():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1  # Only one thread at a time runs this line

Key Concepts

Concept	Description	Interview Relevance
Deadlock	Circular wait for resources	High
Race Condition	Unpredictable behavior from timing	Very High
Mutex	Binary lock (0 or 1)	High
Semaphore	Counter-based lock (0 to N)	High
Virtual Memory	Abstraction over physical memory	Medium
Context Switching	Saving/restoring process state	Medium
Page Fault	Accessing memory not in RAM	Medium
Thrashing	Excessive paging, system slowdown	Medium
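
The mutex and semaphore rows can be illustrated together. This is a minimal sketch under assumed names: `MAX_CONCURRENT` is a hypothetical limit (think of a connection pool size), the semaphore admits at most that many workers at once, and a separate lock acts as the mutex that protects the shared counters.

```python
import threading
import time

MAX_CONCURRENT = 3  # Hypothetical limit, e.g. a connection pool size
semaphore = threading.Semaphore(MAX_CONCURRENT)
state_lock = threading.Lock()  # Mutex protecting the counters below
active = 0
peak = 0

def worker():
    global active, peak
    with semaphore:            # At most MAX_CONCURRENT threads pass this point
        with state_lock:       # Only one thread at a time updates the counters
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)       # Simulated work
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"peak concurrency: {peak} (limit {MAX_CONCURRENT})")
```

The observed peak never exceeds the semaphore's initial value, even though ten threads were started.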

Process Scheduling Algorithms

Algorithm	Description	Pros	Cons
FCFS	First Come First Served	Simple	Convoy effect
SJF	Shortest Job First	Optimal avg wait	Starvation possible
Round Robin	Time quantum rotation	Fair	High context switching
Priority	Higher priority first	Important tasks fast	Starvation
Multilevel Queue	Multiple queues with priority	Flexible	Complex
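
The trade-offs in the table show up clearly in average waiting time. Below is a small simulation sketch, assuming all jobs arrive at time 0 and ignoring context-switch cost; the burst times and the quantum are illustrative values, not measurements.

```python
from collections import deque

def fcfs_waits(bursts):
    """Waiting time of each job when jobs run in arrival order."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)
        clock += burst
    return waits

def sjf_waits(bursts):
    """Non-preemptive SJF: shortest job runs first. Waits are in input order."""
    order = sorted(range(len(bursts)), key=lambda i: bursts[i])
    waits, clock = [0] * len(bursts), 0
    for i in order:
        waits[i] = clock
        clock += bursts[i]
    return waits

def round_robin_waits(bursts, quantum):
    """Each job runs for at most `quantum` units, then goes to the back of the queue."""
    remaining = list(bursts)
    finish = [0] * len(bursts)
    queue = deque(range(len(bursts)))
    clock = 0
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)
        else:
            finish[i] = clock
    # With arrival at t=0, waiting time = completion time - burst time
    return [finish[i] - bursts[i] for i in range(len(bursts))]

def average(values):
    return sum(values) / len(values)

bursts = [24, 3, 3]  # One long job ahead of two short ones
print(average(fcfs_waits(bursts)))                              # 17.0 (convoy effect)
print(average(sjf_waits(bursts)))                               # 3.0
print(round(average(round_robin_waits(bursts, quantum=4)), 2))  # 5.67
```

The long first job makes FCFS wait times balloon, SJF minimizes the average, and Round Robin lands in between while keeping every job responsive.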

Networking

OSI Model (Simplified)

Layer 7: Application  (HTTP, FTP, DNS, SMTP)
        └── What the user interacts with
        
Layer 4: Transport    (TCP, UDP)
        └── End-to-end communication, ports
        
Layer 3: Network      (IP, ICMP, Routing)
        └── Logical addressing, routing between networks
        
Layer 2: Data Link    (Ethernet, MAC, Switches)
        └── Physical addressing, frame transmission
        
Layer 1: Physical     (Cables, Signals, Hubs)
        └── Raw bit transmission

TCP Three-Way Handshake

Client                    Server
  │                         │
  │───── SYN (seq=x) ──────►│  1. Client initiates
  │                         │
  │◄── SYN-ACK (seq=y,      │  2. Server acknowledges
  │     ack=x+1) ───────────│     and sends its own SYN
  │                         │
  │───── ACK (ack=y+1) ────►│  3. Client acknowledges
  │                         │
  │◄═══ Connection ════════►│  4. Ready to transfer data
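
The sequence and acknowledgment arithmetic in the diagram can be traced in a few lines. This is a simulation sketch only: the `Segment` class and `three_way_handshake` function are hypothetical names, no packets are sent, and the initial sequence numbers are arbitrary.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

@dataclass
class Segment:
    sender: str
    flags: str
    seq: Optional[int] = None
    ack: Optional[int] = None

def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged, given each side's initial sequence number."""
    syn = Segment("client", "SYN", seq=client_isn)
    # The server acknowledges x by asking for x+1, and sends its own seq=y
    syn_ack = Segment("server", "SYN-ACK", seq=server_isn,
                      ack=(syn.seq + 1) % SEQ_SPACE)
    # The SYN consumed one sequence number, so the client continues at x+1
    ack = Segment("client", "ACK", seq=(syn.seq + 1) % SEQ_SPACE,
                  ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

With x=1000 and y=5000, the SYN-ACK carries ack=1001 and the final ACK carries ack=5001, matching the `ack=x+1` and `ack=y+1` labels above.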

TCP vs UDP

TCP

Connection-oriented (handshake)
Reliable delivery (ACKs, retransmission)
Ordered packets (sequence numbers)
Flow control (sliding window)
Congestion control
Use: Web, Email, File transfer, SSH

UDP

Connectionless (no handshake)
Best-effort delivery (no guarantees)
No ordering guarantee
No flow/congestion control
Lower latency, less overhead
Use: Gaming, Streaming, DNS, VoIP

DNS Resolution Process

Browser Cache      → Check local browser cache
OS Cache           → Check operating system DNS cache
Resolver Cache     → ISP's recursive resolver cache
Root Server        → "Who handles .com?"
TLD Server         → "Who handles example.com?"
Authoritative NS   → "example.com = 93.184.216.34"

# Trace DNS resolution
nslookup example.com
dig example.com +trace
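
The cache fall-through in the list above can be modeled as an ordered lookup. This is a toy sketch with hypothetical names (`resolve`, `caches`, `authoritative`); it performs no real DNS queries and omits TTL expiry, which real caches enforce.

```python
def resolve(name, caches, authoritative):
    """Check each cache in order; on a miss everywhere, consult the
    authoritative data and fill the caches on the way back."""
    for label, cache in caches:
        if name in cache:
            return cache[name], label
    # Stands in for the root -> TLD -> authoritative nameserver queries
    address = authoritative[name]
    for _, cache in caches:
        cache[name] = address
    return address, "authoritative"

caches = [("browser", {}), ("os", {}), ("resolver", {})]
authoritative = {"example.com": "93.184.216.34"}  # Address taken from the list above

first = resolve("example.com", caches, authoritative)
second = resolve("example.com", caches, authoritative)
print(first)   # ('93.184.216.34', 'authoritative')
print(second)  # ('93.184.216.34', 'browser')
```

The first lookup walks the whole chain; the second is answered by the nearest cache, which is why repeated visits resolve faster.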

HTTP/1.1 vs HTTP/2 vs HTTP/3

Feature	HTTP/1.1	HTTP/2	HTTP/3
Connection	Multiple TCP	Single TCP, multiplexed	QUIC (UDP-based)
Head-of-line blocking	Yes	At TCP level	No
Header compression	No	HPACK	QPACK
Server push	No	Yes (little used; Chrome has removed support)	Yes (specified, little used)
Encryption	Optional	Practically required	Required (TLS 1.3)

HTTP Methods & Status Codes

# Common HTTP Methods
GET     # Retrieve resource (idempotent, cacheable)
POST    # Create resource (not idempotent)
PUT     # Update/Replace resource (idempotent)
PATCH   # Partial update (not necessarily idempotent)
DELETE  # Remove resource (idempotent)
HEAD    # GET without body (check if resource exists)
OPTIONS # Supported methods (CORS preflight)

# Status Code Categories
1xx  # Informational (100 Continue)
2xx  # Success (200 OK, 201 Created, 204 No Content)
3xx  # Redirection (301 Moved Permanently, 302 Found, 304 Not Modified)
4xx  # Client Error (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests)
5xx  # Server Error (500 Internal, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout)
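
The first digit of a status code determines its category, which makes classification a one-line calculation. A short sketch using the standard library's `http.HTTPStatus`; the `CATEGORIES` mapping and `describe` helper are illustrative names.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def describe(code):
    """Return 'code phrase (category)' for a standard status code."""
    status = HTTPStatus(code)  # Raises ValueError for codes the stdlib does not define
    return f"{status.value} {status.phrase} ({CATEGORIES[status.value // 100]})"

for code in (200, 304, 404, 429, 503):
    print(describe(code))
```

For example, `describe(404)` yields `404 Not Found (Client Error)`.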

WebSocket vs HTTP

HTTP: Request-Response (Half-duplex)
Client ───Request──► Server
Client ◄──Response── Server
Client ───Request──► Server
Client ◄──Response── Server

WebSocket: Full-duplex, persistent connection
Client ◄═════════════► Server
       │ Real-time    │
       │ bidirectional│
       │ messages     │

Database Fundamentals

ACID Properties

Property	Description	Example
Atomicity	All or nothing	Bank transfer: debit AND credit succeed or both fail
Consistency	Valid state transitions	Balance can’t be negative, referential integrity
Isolation	Concurrent transactions don’t interfere	Read committed, serializable
Durability	Committed data persists	Write-ahead logging, survives crashes
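
Atomicity and consistency from the table can be observed with the standard library's `sqlite3` module. This is a minimal sketch under assumed names: the `accounts` table, its `CHECK` constraint, and the `transfer` helper are illustrative, and an in-memory database is used so nothing is written to disk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # Consistency rule
)
with conn:  # Commits on success
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])

def transfer(conn, src, dst, amount):
    """Credit and debit in one transaction; returns True if it committed."""
    try:
        with conn:  # Commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK failed on the debit; the credit was rolled back too

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

ok = transfer(conn, 1, 2, 30)       # Succeeds
failed = transfer(conn, 1, 2, 500)  # Would overdraw account 1
print(ok, failed, balances(conn))
```

The failed transfer leaves both balances exactly as they were after the first one, even though its credit statement had already executed.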

Transaction Isolation Levels

Level	Dirty Read	Non-Repeatable Read	Phantom Read	Performance
Read Uncommitted	✅ Possible	✅ Possible	✅ Possible	Fastest
Read Committed	❌ Prevented	✅ Possible	✅ Possible	Fast
Repeatable Read	❌ Prevented	❌ Prevented	✅ Possible	Medium
Serializable	❌ Prevented	❌ Prevented	❌ Prevented	Slowest

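-- Set isolation level
-- (Statement order is engine-specific: PostgreSQL expects SET TRANSACTION
--  after BEGIN; MySQL and SQL Server expect it before the transaction starts)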
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;
-- Your queries here
COMMIT;

SQL vs NoSQL

SQL (Relational)

Structured schema (tables, rows, columns)
ACID compliant
Complex queries (JOINs, aggregations)
Vertical scaling (scale up)
Examples: PostgreSQL, MySQL, Oracle
Use: Transactions, Analytics, Complex relationships

NoSQL

Flexible schema (documents, key-value, graphs)
BASE (Eventually consistent)
Simple queries (by key/ID)
Horizontal scaling (scale out)
Examples: MongoDB, Redis, Cassandra, Neo4j
Use: Real-time, Big Data, Caching, Social graphs

NoSQL Types

Type	Data Model	Use Case	Example
Key-Value	Key → Value	Caching, Sessions	Redis, DynamoDB
Document	JSON documents	Content management, Catalogs	MongoDB, Couchbase
Column-Family	Wide columns	Time-series, Analytics	Cassandra, HBase
Graph	Nodes + Edges	Social networks, Recommendations	Neo4j, Amazon Neptune

Indexing Deep Dive

-- B-Tree Index (default for most DBs)
-- Good for: range queries, sorting, equality, prefix matching
CREATE INDEX idx_user_email ON users(email);

-- Hash Index
-- Good for: equality comparisons ONLY (not range queries)
CREATE INDEX idx_user_id ON users USING HASH(id);

-- Composite Index (order matters!)
-- Follows "leftmost prefix" rule
CREATE INDEX idx_orders ON orders(user_id, status, created_at);
-- ✅ Uses index: WHERE user_id = 1
-- ✅ Uses index: WHERE user_id = 1 AND status = 'pending'
-- ✅ Uses index: WHERE user_id = 1 AND status = 'pending' AND created_at > '2024-01-01'
-- ❌ Cannot seek: WHERE status = 'pending' (missing leftmost column)
-- ⚠️ Partial use: WHERE user_id = 1 AND created_at > '2024-01-01' (index narrows by user_id only; the gap at status prevents a range seek on created_at)

-- Covering Index (includes all needed columns)
CREATE INDEX idx_users_covering ON users(email) INCLUDE (name, created_at);
-- Query can be answered entirely from index, no table lookup needed

-- Partial Index (index subset of rows)
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
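
The leftmost-prefix rule can be checked by asking a database for its query plan. The sketch below uses SQLite through Python's `sqlite3` module purely because it ships with the standard library; the `total` column and the `plan` helper are assumptions added for illustration, and other engines (PostgreSQL, MySQL) word their plans differently and may make different choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "status TEXT, created_at TEXT, total REAL)"
)
conn.execute("CREATE INDEX idx_orders ON orders(user_id, status, created_at)")

def plan(where_clause):
    """Return SQLite's query-plan description for a SELECT with this WHERE clause."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM orders WHERE {where_clause}")
    return " | ".join(row[-1] for row in rows)  # Last column is the readable detail

for clause in (
    "user_id = 1",
    "user_id = 1 AND status = 'pending'",
    "status = 'pending'",
    "user_id = 1 AND created_at > '2024-01-01'",
):
    print(f"{clause:45} -> {plan(clause)}")
```

Plans that mention `idx_orders` used the index. The exact wording varies between SQLite versions, so treat the output as indicative rather than a fixed format.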

When NOT to Use Indexes

High write tables: Each index slows down INSERT/UPDATE/DELETE
Small tables: Full table scan may be faster
Low cardinality columns: boolean, status with few values
Frequently updated columns: Index maintenance overhead

Database Normalization

Normal Form	Rule	Example Violation
1NF	No repeating groups, atomic values	`phone: "123,456,789"`
2NF	No partial dependencies	Non-key depends on part of composite key
3NF	No transitive dependencies	`zip` → `city` → `state`
BCNF	Every determinant is a candidate key	More strict 3NF

-- Unnormalized (violates 1NF: several values packed into one column)
CREATE TABLE orders (
    id INT,
    products VARCHAR(255)  -- "laptop,mouse,keyboard"
);

-- Normalized
CREATE TABLE orders (id INT PRIMARY KEY);
CREATE TABLE products (id INT PRIMARY KEY, name VARCHAR(255));
CREATE TABLE order_items (
    order_id INT REFERENCES orders(id),
    product_id INT REFERENCES products(id),
    quantity INT,
    PRIMARY KEY (order_id, product_id)
);

Computer Architecture

CPU Cache Hierarchy

┌─────────────────────────────────────────────┐
│                   CPU                       │
│  ┌─────────────────────────────────────┐   │
│  │           Core                      │   │
│  │  ┌─────────────────────────────┐   │   │
│  │  │   L1 Cache (32-64KB)       │   │   │  ~1ns
│  │  │   Fastest, per-core        │   │   │
│  │  └─────────────────────────────┘   │   │
│  │  ┌─────────────────────────────┐   │   │
│  │  │   L2 Cache (256KB-1MB)     │   │   │  ~3-10ns
│  │  │   Per-core                 │   │   │
│  │  └─────────────────────────────┘   │   │
│  └─────────────────────────────────────┘   │
│  ┌─────────────────────────────────────┐   │
│  │   L3 Cache (8-64MB)                │   │  ~10-40ns
│  │   Shared across cores              │   │
│  └─────────────────────────────────────┘   │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   RAM (8-128GB+)                           │  ~100ns
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   SSD (~1TB)                               │  ~100μs
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│   HDD (~10TB)                              │  ~10ms
└─────────────────────────────────────────────┘

Latency Numbers Every Programmer Should Know

Operation	Time
L1 cache reference	0.5 ns
Branch mispredict	5 ns
L2 cache reference	7 ns
Mutex lock/unlock	25 ns
Main memory reference	100 ns
Compress 1KB with Zippy	3,000 ns (3 μs)
Send 1KB over 1 Gbps network	10,000 ns (10 μs)
Read 4KB randomly from SSD	150,000 ns (150 μs)
Round trip within datacenter	500,000 ns (0.5 ms)
Read 1 MB sequentially from SSD	1,000,000 ns (1 ms)
Disk seek	10,000,000 ns (10 ms)
Read 1 MB sequentially from disk	20,000,000 ns (20 ms)
Send packet CA → Netherlands → CA	150,000,000 ns (150 ms)

Distributed Systems Basics

CAP Theorem

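In a distributed system, you can guarantee at most 2 of 3: Consistency, Availability, Partition Tolerance. Because network partitions cannot be ruled out, the practical choice is between consistency and availability while a partition lasts.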

         Consistency
            /\
           /  \
          /    \
         /  CA  \
        /________\
       /\        /\
      /  \  CP  /  \
     / AP \    /    \
    /______\  /______\
Availability  Partition
              Tolerance

Choice	Trade-off	Examples
CP	Sacrifice availability for consistency	MongoDB, HBase, Redis Cluster
AP	Sacrifice consistency for availability	Cassandra, DynamoDB, CouchDB
CA	Not practical (network partitions happen)	Traditional RDBMS (single node)

Consistency Models

Model	Description	Use Case
Strong	All reads see latest write	Banking, Inventory
Eventual	Reads eventually see latest write	Social media feeds
Causal	Respects cause-effect ordering	Messaging apps
Read-your-writes	User sees their own writes immediately	User profile updates

Practice Questions

What happens when you type a URL in a browser?

Browser checks caches: Browser cache, OS cache, router cache
DNS resolution: Resolve domain to IP address
- Check local DNS cache
- Query recursive DNS resolver
- Query root → TLD → authoritative nameserver
TCP connection: 3-way handshake (SYN → SYN-ACK → ACK)
TLS handshake (if HTTPS):
- Client Hello (supported ciphers)
- Server Hello + Certificate
- Key exchange
- Encrypted connection established
HTTP request sent: GET / HTTP/1.1 with headers
Server processes request:
- Load balancer routes to server
- Server processes and queries database
- Generates response
HTTP response received: HTML content with status code
Browser renders:
- Parse HTML → DOM tree
- Parse CSS → CSSOM
- Execute JavaScript
- Render pixels to screen

How does a database index work?

Indexes are data structures (usually B-trees) that maintain sorted references to rows.

B-Tree Structure:

Balanced tree with sorted keys
Each node can have multiple children
Leaf nodes contain pointers to actual rows

How lookup works:

Instead of scanning all rows O(n)
Descend the tree from root to leaf, searching the sorted keys in each node O(log n)
Follow pointers to find matching rows

Trade-offs:

Faster reads (logarithmic lookup)
Slower writes (must update index)
Additional storage space
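
The lookup idea can be sketched with a sorted list and binary search. This is a simplification, not a B-tree: a real B-tree stores many keys per node to cut down disk reads, but the sorted-keys-plus-pointers principle is the same. The rows and helper names here are hypothetical.

```python
import bisect

# The "table": rows in insertion order (hypothetical data)
rows = [
    {"id": 1, "email": "dave@example.com"},
    {"id": 2, "email": "alice@example.com"},
    {"id": 3, "email": "carol@example.com"},
    {"id": 4, "email": "bob@example.com"},
]

# The "index": sorted (key, row position) pairs, kept separate from the table
index = sorted((row["email"], pos) for pos, row in enumerate(rows))
keys = [key for key, _ in index]

def lookup(email):
    """Equality lookup in O(log n) instead of scanning every row."""
    i = bisect.bisect_left(keys, email)
    if i < len(keys) and keys[i] == email:
        return rows[index[i][1]]  # Follow the pointer back to the row
    return None

def range_lookup(low, high):
    """Range query: sorted keys make the matching entries contiguous."""
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return [rows[pos] for _, pos in index[start:end]]

print(lookup("carol@example.com"))
print([r["id"] for r in range_lookup("a", "c")])
```

Every insert into `rows` would also require updating `index`, which is the write cost listed under the trade-offs.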

Explain deadlock and how to prevent it

Deadlock occurs when: Four conditions are ALL met (Coffman conditions):

Mutual exclusion - resource can’t be shared
Hold and wait - holding one resource, waiting for another
No preemption - can’t force release
Circular wait - A waits for B, B waits for A

Prevention strategies:

Lock ordering: Always acquire locks in same global order
Lock timeout: Give up after waiting too long
Deadlock detection: Detect and kill one transaction
Try-lock: Non-blocking attempt, back off if fails
Avoid nested locks: Minimize lock scope
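
The lock timeout and try-lock strategies can be combined into a single helper. This is a minimal sketch with hypothetical function names; the 0.1 s timeout is an arbitrary choice, and production code would usually retry with a randomized delay after backing off.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def acquire_both(first, second, timeout=0.1):
    """Try to take both locks; on timeout, release what is held and report failure."""
    if not first.acquire(timeout=timeout):
        return False
    if not second.acquire(timeout=timeout):
        first.release()  # Back off: this breaks the hold-and-wait condition
        return False
    return True

def release_both(first, second):
    second.release()
    first.release()

# Both locks are free, so this succeeds
got_both = acquire_both(lock_a, lock_b)
if got_both:
    release_both(lock_a, lock_b)

# Simulate another holder of lock_b: the attempt times out instead of hanging
lock_b.acquire()
blocked = acquire_both(lock_a, lock_b)
lock_b.release()

print(got_both, blocked)
```

After the failed attempt, `lock_a` is free again, so no thread is left holding one lock while waiting indefinitely for another.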

Explain the difference between processes and threads

Process:

Independent execution unit with own memory space
Isolated - crash doesn’t affect other processes
Higher creation/switching overhead (new address space; a context switch also changes the memory mappings)
Communication via IPC (pipes, sockets, shared memory)

Thread:

Lightweight execution unit within a process
Shared memory with parent and sibling threads
Lower creation/switching overhead (threads in a process share one address space)
Direct memory sharing (but needs synchronization)

When to use which:

Processes: Isolation needed, fault tolerance, security
Threads: Shared state, I/O-bound tasks, lower overhead needed

What is the difference between TCP and UDP?

TCP (Transmission Control Protocol):

Connection-oriented (handshake required)
Reliable (acknowledgments, retransmission)
Ordered (sequence numbers)
Flow control (sliding window)
Congestion control
Use: Web, email, file transfer, SSH

UDP (User Datagram Protocol):

Connectionless (no handshake)
Unreliable (no guarantees)
Unordered (packets may arrive out of order)
No flow/congestion control
Lower latency, less overhead
Use: Gaming, video streaming, DNS, VoIP

Trade-off: Reliability vs Speed

Explain database transaction isolation levels

Read Uncommitted: Can see uncommitted changes (dirty reads)
Read Committed: Only see committed changes, but same query may return different results
Repeatable Read: Same query returns same results within transaction, but new rows may appear (phantoms)
Serializable: Transactions execute as if sequential - no anomalies, but slowest

Problems prevented by level:

Dirty read: See uncommitted data
Non-repeatable read: Same query, different results
Phantom read: New rows appear matching query

Quick Reference Card

Numbers to Remember

Metric	Value
L1 cache	~1 ns
L2 cache	~10 ns
RAM	~100 ns
SSD random read	~150 μs
HDD seek	~10 ms
Same datacenter RTT	~0.5 ms
Cross-region RTT	~100-150 ms

Common Ports

Port	Service
22	SSH
80	HTTP
443	HTTPS
3306	MySQL
5432	PostgreSQL
6379	Redis
27017	MongoDB

Interview Tip: Be ready to explain these concepts with real-world examples. Interviewers love when you can relate theory to practical scenarios. Use the “explain like I’m 5” technique to show deep understanding.
