System Design Fundamentals

Interview Essential: These fundamentals are the building blocks of every system design. Interviewers expect you to naturally incorporate these concepts without being asked.

Quick Reference Card

┌─────────────────────────────────────────────────────────────────┐
│              FUNDAMENTALS CHEAT SHEET                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  SCALING                                                        │
│  • Vertical = bigger machine (easy but limited)                 │
│  • Horizontal = more machines (complex but unlimited)           │
│                                                                 │
│  AVAILABILITY (memorize these!)                                 │
│  • 99.9% = 8.8 hours downtime/year                             │
│  • 99.99% = 52 minutes downtime/year                           │
│  • 99.999% = 5 minutes downtime/year                           │
│                                                                 │
│  CAP THEOREM                                                    │
│  • CP = Bank, inventory (consistency > availability)           │
│  • AP = Social media, cache (availability > consistency)       │
│                                                                 │
│  LATENCY NUMBERS (Jeff Dean's famous list)                     │
│  • L1 cache: 0.5 ns                                            │
│  • RAM: 100 ns                                                 │
│  • SSD: 100 μs                                                 │
│  • HDD: 10 ms                                                  │
│  • Same datacenter: 0.5 ms                                     │
│  • Cross-continent: 150 ms                                     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Scalability

Scalability is the system’s ability to handle increased load.

Vertical vs Horizontal Scaling

Vertical Scaling

Pros: Simple, no code changes
Cons: Hardware limits, single point of failure, expensive

Horizontal Scaling

Pros: Near-unlimited scale, fault tolerant, cost-effective
Cons: Complex, requires stateless services, harder to keep data consistent

Latency vs Throughput

Metric	Definition	Example
Latency	Time to complete one request	200ms response time
Throughput	Requests handled per unit time	10,000 requests/second
Bandwidth	Maximum data transfer rate	1 Gbps network

Latency Percentiles

p50 (median):  50% of requests faster than this
p95:           95% of requests faster than this
p99:           99% of requests faster than this
p99.9:         99.9% of requests faster than this

Example:
p50 = 100ms   (typical request)
p95 = 200ms   (slow request)
p99 = 500ms   (very slow request)
p99.9 = 2s    (tail latency: 1 in 1,000 requests is slower)
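
As a rough illustration, the percentiles above could be computed from raw latency samples with the nearest-rank method. The `percentile` helper and the sample data are assumptions made for this sketch; production systems usually rely on histograms or streaming sketches instead of sorting every sample.

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: mostly fast, with a slow tail
latencies_ms = [100] * 90 + [200] * 5 + [500] * 4 + [2000]

for p in (50, 95, 99, 99.9):
    print(f"p{p} = {percentile(latencies_ms, p)} ms")
```

Averages hide the slow tail in this data set, which is why percentiles are usually quoted instead.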

Availability

Availability = Uptime / (Uptime + Downtime)

The “Nines” of Availability

Availability	Downtime/Year	Downtime/Month
99% (two 9s)	3.65 days	7.3 hours
99.9% (three 9s)	8.76 hours	43.8 minutes
99.99% (four 9s)	52.6 minutes	4.38 minutes
99.999% (five 9s)	5.26 minutes	26.3 seconds
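
The figures in this table follow directly from the availability formula. A minimal sketch, assuming a 365-day year and an average month of one twelfth of a year (the function and constant names are illustrative):

```python
PERIOD_SECONDS = {
    "year": 365 * 24 * 3600,
    "month": 365 * 24 * 3600 / 12,  # average month, not a calendar month
}

def allowed_downtime_seconds(availability_pct: float, period: str = "year") -> float:
    """Downtime budget implied by an availability target over a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1 - availability_pct / 100) * PERIOD_SECONDS[period]

for target in (99.0, 99.9, 99.99, 99.999):
    yearly_min = allowed_downtime_seconds(target, "year") / 60
    monthly_min = allowed_downtime_seconds(target, "month") / 60
    print(f"{target}%: {yearly_min:,.1f} min/year, {monthly_min:,.2f} min/month")
```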

Achieving High Availability

CAP Theorem

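In a distributed system, you can only guarantee 2 out of 3. Because network partitions cannot be ruled out, the practical choice is between consistency and availability while a partition lasts: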

Consistency (C)

All nodes see the same data at the same time

Availability (A)

Every request to a working node gets a non-error response, even if it may not reflect the latest write

Partition Tolerance (P)

System works despite network partitions

Real-World Trade-offs

System	Choice	Reason
Banking	CP	Consistency is critical
Social Media	AP	Availability preferred
Shopping Cart	AP	Can merge conflicts later
Inventory	CP	Need accurate counts
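
The difference between the two choices shows up only while a partition is in progress. The toy model below is a sketch under simplifying assumptions: a single replica, a `partitioned` flag standing in for real failure detection, and hypothetical names throughout.

```python
class PartitionError(Exception):
    """Raised by a CP replica that cannot confirm it has the latest data."""

class Replica:
    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.data: dict[str, int] = {}
        self.partitioned = False  # True when cut off from the other replicas

    def apply(self, key: str, value: int) -> None:
        """Receive a replicated write (only possible while connected)."""
        if not self.partitioned:
            self.data[key] = value

    def read(self, key: str) -> int:
        if self.partitioned and self.mode == "CP":
            # CP: refuse rather than risk returning stale data
            raise PartitionError("cannot confirm latest value; rejecting read")
        # AP (or a healthy CP node): answer from local state, possibly stale
        return self.data[key]

def demo(mode: str) -> str:
    replica = Replica(mode)
    replica.apply("stock:widget", 5)  # replicated before the partition
    replica.partitioned = True
    replica.apply("stock:widget", 4)  # this update never arrives
    try:
        return f"{mode}: read returned {replica.read('stock:widget')} (stale)"
    except PartitionError as exc:
        return f"{mode}: read failed ({exc})"

print(demo("CP"))
print(demo("AP"))
```

An inventory service would prefer the CP behaviour (an error instead of a wrong count), while a social feed would prefer the AP behaviour (a slightly old answer instead of an error).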

ACID vs BASE

ACID (Traditional Databases)

Property	Description
Atomicity	All operations succeed or all fail
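Consistency	Each transaction moves the data from one valid state to another (constraints always hold)
Isolation	Concurrent transactions don’t interfere; results match some serial order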
Durability	Committed data survives crashes

BASE (NoSQL Databases)

Property	Description
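Basically Available	System stays accessible, possibly with degraded or stale responses
Soft state	State may change over time, even without new writes, as replicas sync
Eventually consistent	Replicas converge to the same value once updates stop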
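
Atomicity and consistency can be seen in a few lines with Python's built-in `sqlite3` module. This is a minimal sketch: the `accounts` table, its CHECK constraint, and the `transfer` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint failed; the credit to dst was rolled back

def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

print(transfer(conn, "alice", "bob", 30), balances(conn))   # valid transfer
print(transfer(conn, "alice", "bob", 500), balances(conn))  # overdraft is rejected
```

The second transfer credits `bob` before the debit fails, yet the balances are unchanged afterwards because the whole transaction is rolled back.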

Consistency Patterns

Strong Consistency

Every read receives the most recent write. All nodes see the same data at the same time.

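# Python: Strong Consistency with Synchronous Replication
from typing import Any

class StrongConsistencyDB:
    def __init__(self, replicas: list):
        # Nodes are assumed to expose read(), write(), and sync_write()
        self.replicas = replicas
        self.primary = replicas[0]
    
    def write(self, key: str, value: Any) -> bool:
        """Write to primary and wait for ALL replicas to acknowledge"""
        previous = self.primary.read(key)  # Saved so a failed write can be undone
        self.primary.write(key, value)
        acknowledged = [self.primary]
        
        # Synchronously replicate to all secondaries
        for replica in self.replicas[1:]:
            if not replica.sync_write(key, value):  # Blocking call
                # Rollback on failure
                self.rollback(acknowledged, key, previous)
                return False
            acknowledged.append(replica)
        return True
    
    def rollback(self, nodes: list, key: str, previous: Any) -> None:
        """Restore the previous value on every node that applied the write"""
        for node in nodes:
            node.write(key, previous)
    
    def read(self, key: str) -> Any:
        """Read from primary (guaranteed latest)"""
        return self.primary.read(key)

# Usage in banking system (primary, replica1, replica2 are placeholder node objects)
db = StrongConsistencyDB(replicas=[primary, replica1, replica2])
db.write("account:123:balance", 1000)  # Blocks until all replicas confirm
balance = db.read("account:123:balance")  # Returns 1000 if the write succeeded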

// JavaScript: Strong Consistency with Synchronous Replication
class StrongConsistencyDB {
  constructor(replicas) {
    this.replicas = replicas;
    this.primary = replicas[0];
  }

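  async write(key, value) {
    // Remember the old value so a failed write can be undone
    const previous = await this.primary.read(key);

    // Write to primary first
    await this.primary.write(key, value);
    
    // Wait for ALL replicas to acknowledge (strong consistency)
    const replicationPromises = this.replicas.slice(1).map(
      replica => replica.syncWrite(key, value)
    );
    
    try {
      await Promise.all(replicationPromises);  // Wait for all
      return true;
    } catch (error) {
      await this.rollback(key, previous);
      throw new Error('Replication failed: ' + error.message);
    }
  }

  async rollback(key, previous) {
    // Restore the old value everywhere; ignore replicas that are still failing
    await this.primary.write(key, previous);
    await Promise.allSettled(
      this.replicas.slice(1).map(replica => replica.syncWrite(key, previous))
    );
  }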

  async read(key) {
    // Always read from primary for guaranteed consistency
    return await this.primary.read(key);
  }
}

// Usage (inside an async function or ES module; node objects are placeholders)
const db = new StrongConsistencyDB([primary, replica1, replica2]);
await db.write('account:123:balance', 1000);
const balance = await db.read('account:123:balance');  // Always 1000

Eventual Consistency

Reads might return stale data, but eventually all nodes will have the same data.

# Python: Eventual Consistency with Async Replication
import asyncio
from datetime import datetime, timezone
from typing import Any

class EventualConsistencyDB:
    def __init__(self, local_node, replicas: list):
        self.local_node = local_node  # Node that serves this client's reads and writes
        self.replicas = replicas      # Remote nodes that receive writes later
        self.replication_queue = asyncio.Queue()
    
    async def write(self, key: str, value: Any) -> bool:
        """Write to local node immediately, replicate asynchronously"""
        # Write to local node (fast!)
        timestamp = datetime.now(timezone.utc)
        self.local_node.write(key, value, timestamp)
        
        # Queue one replication task per replica (non-blocking)
        for replica in self.replicas:
            await self.replication_queue.put((replica, key, value, timestamp))
        
        return True  # Returns before any replica has the data
    
    async def replicate_worker(self):
        """Background worker that replicates to other nodes"""
        while True:
            replica, key, value, timestamp = await self.replication_queue.get()
            try:
                await replica.async_write(key, value, timestamp)
            except Exception:
                # Retry only the failed replica after a short pause
                await asyncio.sleep(0.1)
                await self.replication_queue.put((replica, key, value, timestamp))
    
    async def read(self, key: str) -> Any:
        """Read from local node (might be stale!)"""
        return self.local_node.read(key)

# Usage in social media (inside an async function; node1..node3 are placeholder nodes)
db = EventualConsistencyDB(local_node=node1, replicas=[node2, node3])
worker = asyncio.create_task(db.replicate_worker())  # Start background replication
await db.write("post:456", {"content": "Hello World!"})
# User might not see this post immediately on other nodes
# But eventually (usually within milliseconds), all nodes will have it

// JavaScript: Eventual Consistency with Async Replication
class EventualConsistencyDB {
  constructor(localNode, replicas) {
    this.localNode = localNode;  // Node that serves this client's reads and writes
    this.replicas = replicas;    // Remote nodes that receive writes later
    this.replicationQueue = [];
    this.startReplicationWorker();
  }

  async write(key, value) {
    const timestamp = Date.now();
    
    // Write to local node immediately
    await this.localNode.write(key, value, timestamp);
    
    // Queue one task per replica for async replication (fire and forget)
    for (const replica of this.replicas) {
      this.replicationQueue.push({ replica, key, value, timestamp });
    }
    
    return true;  // Returns before any replica has the data
  }

  startReplicationWorker() {
    let running = false;
    setInterval(async () => {
      if (running) return;  // Previous batch still in flight
      running = true;

      // Take the current batch; failures are re-queued for the next tick
      const batch = this.replicationQueue.splice(0);
      for (const item of batch) {
        try {
          await item.replica.asyncWrite(item.key, item.value, item.timestamp);
        } catch (error) {
          // Retry only the replica that failed
          this.replicationQueue.push(item);
        }
      }
      running = false;
    }, 100);  // Process every 100ms
  }

  async read(key) {
    // Read from local (might be stale)
    return await this.localNode.read(key);
  }
}

Read-Your-Writes Consistency

Users always see their own writes immediately, even if other users see stale data.

# Python: Read-Your-Writes with Session Tracking
import random
import threading
from datetime import datetime, timedelta, timezone
from typing import Any

class ReadYourWritesDB:
    # Must exceed the worst-case replication lag, or users can still see stale data
    PRIMARY_READ_WINDOW = timedelta(seconds=5)

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self.user_last_write = {}  # Track when each user last wrote
    
    def write(self, user_id: str, key: str, value: Any) -> bool:
        """Write to primary and track the write timestamp"""
        timestamp = datetime.now(timezone.utc)
        self.primary.write(key, value, timestamp)
        
        # Remember when this user last wrote
        self.user_last_write[user_id] = timestamp
        
        # Async replication to replicas
        self.async_replicate(key, value, timestamp)
        return True
    
    def async_replicate(self, key: str, value: Any, timestamp: datetime) -> None:
        """Push the write to every replica on a background thread"""
        def push():
            for replica in self.replicas:
                replica.write(key, value, timestamp)
        threading.Thread(target=push, daemon=True).start()
    
    def get_random_replica(self):
        return random.choice(self.replicas)
    
    def read(self, user_id: str, key: str) -> Any:
        """
        If user recently wrote, read from primary.
        Otherwise, read from replica (faster).
        """
        last_write = self.user_last_write.get(user_id)
        
        # If user wrote within the window, use primary
        if last_write and datetime.now(timezone.utc) - last_write < self.PRIMARY_READ_WINDOW:
            return self.primary.read(key)
        
        # Safe to read from replica (user hasn't written recently)
        return self.get_random_replica().read(key)

# Usage (primary, replica1, replica2 are placeholder node objects)
db = ReadYourWritesDB(primary, [replica1, replica2])
db.write("user_123", "profile:user_123", {"name": "Alice"})
profile = db.read("user_123", "profile:user_123")  # Reads from PRIMARY
profile = db.read("user_456", "profile:user_123")  # Reads from REPLICA

// JavaScript: Read-Your-Writes Consistency
class ReadYourWritesDB {
  constructor(primary, replicas) {
    this.primary = primary;
    this.replicas = replicas;
    this.userLastWrite = new Map();  // userId -> timestamp
  }

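  async write(userId, key, value) {
    const timestamp = Date.now();
    
    // Write to primary
    await this.primary.write(key, value, timestamp);
    
    // Track when user last wrote
    this.userLastWrite.set(userId, timestamp);
    
    // Async replication (fire and forget)
    this.asyncReplicate(key, value, timestamp);
    return true;
  }

  asyncReplicate(key, value, timestamp) {
    // Not awaited on purpose; assumes replica.write() returns a promise
    for (const replica of this.replicas) {
      replica.write(key, value, timestamp)
        .catch(error => console.error('Replication failed:', error));
    }
  }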

  async read(userId, key) {
    const lastWrite = this.userLastWrite.get(userId);
    const fiveSecondsAgo = Date.now() - 5000;
    
    // If user wrote recently, read from primary
    if (lastWrite && lastWrite > fiveSecondsAgo) {
      return await this.primary.read(key);
    }
    
    // Otherwise, read from any replica (faster)
    const replica = this.replicas[Math.floor(Math.random() * this.replicas.length)];
    return await replica.read(key);
  }
}

// User always sees their own updates immediately
const db = new ReadYourWritesDB(primary, [replica1, replica2]);
await db.write('user_123', 'profile:user_123', { name: 'Alice' });
const myProfile = await db.read('user_123', 'profile:user_123');  // From PRIMARY
const theirProfile = await db.read('user_456', 'profile:user_123');  // From REPLICA

Back-of-the-Envelope Estimation

Common Calculations

# Daily Active Users (DAU) to QPS
DAU = 100_000_000  # 100 million
requests_per_user_per_day = 10
seconds_per_day = 86400

QPS = (DAU * requests_per_user_per_day) / seconds_per_day
# = 1,000,000,000 / 86,400 ≈ 11,574 QPS

# Peak QPS (2-3x average)
peak_QPS = QPS * 2.5  # ≈ 29,000 QPS

Storage Estimation

# Example: Twitter-like service
users = 500_000_000
tweets_per_user_per_day = 2
tweet_size = 280  # bytes (280 characters at ~1 byte each; UTF-8 can need up to 4 per character)
metadata_size = 200  # bytes

daily_tweets = users * tweets_per_user_per_day
# = 1,000,000,000 tweets/day

daily_storage = daily_tweets * (tweet_size + metadata_size)
# = 1B * 480 bytes = 480 GB/day

yearly_storage = daily_storage * 365
# = 175 TB/year (just text, not including media)

Memory Estimation

# Cache sizing (80/20 rule)
# 20% of data serves 80% of requests

daily_requests = 1_000_000_000
request_size = 500  # bytes (average response)
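cache_hit_ratio = 0.8  # expected hit ratio under the 80/20 rule (not used in the sizing below)

# Cache the hottest 20% of a day's requests (upper bound: treats every request as unique)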
cache_size = 0.2 * daily_requests * request_size
# = 100 GB of cache

Interview Tip: Don’t worry about exact numbers. Round liberally and show your reasoning. 86,400 ≈ 100,000 is fine for estimation.

Interview Questions on Fundamentals

When would you choose CP over AP?

Answer: Choose CP (Consistency over Availability) when:

Financial systems: Bank transfers, payments - incorrect balance is worse than unavailability
Inventory management: Overselling is costly (e.g., airline seats)
Booking systems: Double-booking causes real-world problems
Leader election: Only one leader should exist at a time

Key phrase: “In this case, returning wrong data is worse than returning no data.”

How do you achieve 99.99% availability?

Answer: Redundancy at every layer:

Multiple DNS providers
CDN with many edge locations
Load balancers in active-passive or active-active mode
Multiple application servers (stateless)
Database replication (primary + replicas)
Multi-region deployment
Health checks and automatic failover
Circuit breakers to prevent cascade failures
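
The effect of redundancy can be estimated with simple probability. The sketch below assumes component failures are independent and that one healthy copy is enough to serve traffic; both are idealizations, since real outages are often correlated. The function names are illustrative.

```python
import math

def parallel_availability(component: float, copies: int) -> float:
    """Availability of N redundant copies when any one copy can serve traffic."""
    return 1 - (1 - component) ** copies

def serial_availability(layers: list[float]) -> float:
    """Availability of a request path that needs every layer to be up."""
    return math.prod(layers)

# One 99% server versus two or three redundant ones
for copies in (1, 2, 3):
    print(f"{copies} x 99%: {parallel_availability(0.99, copies):.6%}")

# A request that must pass through three layers, each at 99.99%
path = serial_availability([0.9999, 0.9999, 0.9999])
print(f"three 99.99% layers in series: {path:.4%}")
```

Redundancy within a layer raises availability quickly, while every extra layer in the request path lowers it, so each layer needs to sit above the overall target.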

Explain eventual consistency with an example

Answer: “When you post on social media, your friend might not see it right away because the data needs to propagate across replicas. This is acceptable because:

Availability is more important than instant consistency
The delay is usually sub-second and imperceptible
The data will eventually be consistent everywhere

Compare this to a bank transfer, where you MUST see an accurate balance immediately. That needs strong consistency.”

How do you estimate QPS quickly?

Answer: Use the “divide by 100,000” rule:

DAU × requests per user per day ÷ 100,000 ≈ QPS
Example: 100M DAU × 10 requests = 1B requests/day; 1B ÷ 100,000 = 10,000 QPS
Peak = 2-3x average

For storage:

1 request = ~500 bytes → 10,000 QPS = 5 MB/second = 432 GB/day
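
To see how much the rounding costs, the shortcut can be compared with the exact calculation. A small sketch using the same example numbers (the `average_qps` helper is illustrative):

```python
SECONDS_PER_DAY = 86_400

def average_qps(dau: int, requests_per_user_per_day: int,
                seconds_per_day: int = SECONDS_PER_DAY) -> float:
    """Average queries per second for a given daily load."""
    return dau * requests_per_user_per_day / seconds_per_day

exact = average_qps(100_000_000, 10)
rough = average_qps(100_000_000, 10, seconds_per_day=100_000)  # "divide by 100,000"
error_pct = (exact - rough) / exact * 100

print(f"exact: {exact:,.0f} QPS, rough: {rough:,.0f} QPS, "
      f"underestimate: {error_pct:.1f}%")

# Bandwidth for the rough figure at ~500 bytes per request
bytes_per_second = rough * 500
gb_per_day = bytes_per_second * SECONDS_PER_DAY / 1e9
print(f"{bytes_per_second / 1e6:.0f} MB/s, {gb_per_day:.0f} GB/day")
```

The shortcut underestimates by roughly 14%, which is well inside the 2-3x peak multiplier applied afterwards.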
