Why Estimation Matters
Back-of-envelope estimation is the skill that separates engineers who design systems from engineers who describe systems. When you estimate that your Twitter-like system needs 300,000 QPS for timeline reads, that single number immediately tells you: a single database will not work, you need aggressive caching, and fan-out-on-write (pre-computing timelines) makes more sense than fan-out-on-read. The estimate drives the architecture.
In system design interviews, you’re expected to:
- Size your system - How much storage? How many servers? (This determines whether you can get away with a single database or need sharding)
- Identify bottlenecks - Where will the system break? (The estimation reveals whether your bottleneck is compute, storage, bandwidth, or connections)
- Make trade-offs - Is this worth the complexity? (If your estimate shows 100 QPS, you do not need Kafka — a simple database queue suffices)
- Validate assumptions - Does this approach even work? (If your estimate requires 500TB of RAM for caching, your caching strategy is wrong)
Don’t aim for precision. Round numbers aggressively. The goal is order of magnitude, not exact values. 86,400 seconds ≈ 100,000 is perfectly fine.
Essential Numbers to Memorize
Time & Scale
| Duration | Seconds | Rounded |
|---|---|---|
| 1 second | 1 | 1 |
| 1 minute | 60 | ~100 |
| 1 hour | 3,600 | ~4,000 |
| 1 day | 86,400 | ~100,000 |
| 1 month | 2,592,000 | ~2.5 million |
| 1 year | 31,536,000 | ~30 million |
Data Units
| Unit | Bytes | Power of 2 |
|---|---|---|
| 1 KB | 1,000 | 2^10 ≈ 1,000 |
| 1 MB | 1,000,000 | 2^20 ≈ 1 million |
| 1 GB | 10^9 | 2^30 ≈ 1 billion |
| 1 TB | 10^12 | 2^40 ≈ 1 trillion |
| 1 PB | 10^15 | 2^50 |
Latency Numbers
┌─────────────────────────────────────────────────────────────────┐
│ Latency Comparison (2024) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ L1 cache reference 0.5 ns ■ │
│ L2 cache reference 7 ns ■ │
│ RAM reference 100 ns ██ │
│ SSD random read 150 µs ████████ │
│ HDD seek 10 ms ████████████ │
│ Network (same datacenter) 0.5 ms ████ │
│ Network (cross-continent) 150 ms █████████████ │
│ │
│ Rule of thumb: │
│ • Memory is ~1,000x faster than SSD (100 ns vs 150 µs) │
│ • SSD is ~100x faster than HDD │
│ • Same DC network is ~300x faster than cross-continent │
│ │
└─────────────────────────────────────────────────────────────────┘
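To sanity-check rules of thumb like these, you can recompute the ratios directly from the latency figures in the box. This is a minimal sketch; the names `LATENCY_NS` and `speedup` are illustrative and not part of the original material.

```python
# Latency figures from the box above, converted to nanoseconds.
# The dictionary and helper names are illustrative assumptions.
LATENCY_NS = {
    "l1_cache": 0.5,
    "l2_cache": 7,
    "ram": 100,
    "ssd_random_read": 150_000,              # 150 µs
    "network_same_dc": 500_000,              # 0.5 ms
    "hdd_seek": 10_000_000,                  # 10 ms
    "network_cross_continent": 150_000_000,  # 150 ms
}


def speedup(fast: str, slow: str) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


if __name__ == "__main__":
    print(f"RAM vs SSD: {speedup('ram', 'ssd_random_read'):,.0f}x")
    print(f"SSD vs HDD: {speedup('ssd_random_read', 'hdd_seek'):,.0f}x")
    print("Same DC vs cross-continent: "
          f"{speedup('network_same_dc', 'network_cross_continent'):,.0f}x")
```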
Availability Numbers
| Availability | Downtime/Year | Downtime/Month | Downtime/Week |
|---|---|---|---|
| 99% (two 9s) | 3.65 days | 7.3 hours | 1.68 hours |
| 99.9% (three 9s) | 8.76 hours | 43.8 min | 10.1 min |
| 99.99% (four 9s) | 52.6 min | 4.38 min | 1.01 min |
| 99.999% (five 9s) | 5.26 min | 26.3 sec | 6.05 sec |
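The downtime columns are the period length multiplied by the allowed failure fraction. A small sketch of that arithmetic, assuming a 365-day year and a month of one twelfth of a year (the name `downtime_seconds` is illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 (assumes a 365-day year)


def downtime_seconds(availability_pct: float,
                     period_seconds: float = SECONDS_PER_YEAR) -> float:
    """Allowed downtime, in seconds, for an availability target over a period."""
    return period_seconds * (1 - availability_pct / 100)


if __name__ == "__main__":
    for pct in (99, 99.9, 99.99, 99.999):
        per_year_hours = downtime_seconds(pct) / 3600
        per_month_minutes = downtime_seconds(pct, SECONDS_PER_YEAR / 12) / 60
        print(f"{pct}%: {per_year_hours:.2f} h/year, "
              f"{per_month_minutes:.2f} min/month")
```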
Common Calculation Patterns
Pattern 1: QPS from Daily Active Users
┌─────────────────────────────────────────────────────────────────┐
│ DAU → QPS Calculation │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Given: │
│ • 100 million DAU (Daily Active Users) │
│ • Average user makes 10 requests per day │
│ │
│ Total requests per day: │
│ = 100M × 10 = 1 billion requests/day │
│ │
│ Average QPS (Queries Per Second): │
│ = 1B / 86,400 seconds │
│ = 1B / 100K (rounded) │
│ = 10,000 QPS │
│ │
│ Peak QPS (typically 2-3x average): │
│ = 10,000 × 2.5 = 25,000 QPS │
│ │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Quick Formula: │
│ QPS ≈ DAU × requests_per_user / 100,000 │
│ Peak QPS ≈ QPS × 2.5 │
│ │
└─────────────────────────────────────────────────────────────────┘
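The quick formula translates directly into a helper. This is a sketch under the same assumptions as the box (100,000 seconds per day, 2.5x peak factor); `dau_to_qps` and its parameter names are illustrative.

```python
from typing import Tuple


def dau_to_qps(dau: int, requests_per_user: float,
               peak_factor: float = 2.5,
               seconds_per_day: int = 100_000) -> Tuple[float, float]:
    """Return (average_qps, peak_qps) for a daily-active-user count.

    seconds_per_day defaults to the rounded 100K used in the quick formula.
    """
    average_qps = dau * requests_per_user / seconds_per_day
    return average_qps, average_qps * peak_factor


if __name__ == "__main__":
    avg, peak = dau_to_qps(100_000_000, 10)
    print(f"average ≈ {avg:,.0f} QPS, peak ≈ {peak:,.0f} QPS")
    # The exact 86,400 seconds shifts the result by roughly 16%,
    # which does not change an order-of-magnitude decision.
    exact_avg, _ = dau_to_qps(100_000_000, 10, seconds_per_day=86_400)
    print(f"exact average ≈ {exact_avg:,.0f} QPS")
```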
Pattern 2: Storage Estimation
┌─────────────────────────────────────────────────────────────────┐
│ Storage Calculation │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Example: Twitter-like service │
│ │
│ Given: │
│ • 500 million users │
│ • 20% are daily active (100M DAU) │
│ • Average 2 tweets per active user per day │
│ • Tweet = 140 chars (280 bytes) + 200 bytes metadata │
│ │
│ Daily tweet storage: │
│ = 100M users × 2 tweets × 500 bytes │
│ = 200M × 500 bytes │
│ = 100 GB/day │
│ │
│ Annual storage (text only): │
│ = 100 GB × 365 │
│ = 36.5 TB/year │
│ │
│ With media (assume 10% of tweets have 2MB image): │
│ = 100M × 2 × 0.1 × 2MB = 40 TB/day │
│ = 40 TB × 365 = 14.6 PB/year │
│ │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Key insight: Media dominates text storage by 100x+ │
│ │
└─────────────────────────────────────────────────────────────────┘
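The same storage arithmetic can be scripted so you can vary the assumptions quickly. A minimal sketch using decimal units (1 GB = 10^9 bytes); `daily_bytes` and the `fraction` parameter are illustrative names, and the inputs are the ones from the box.

```python
GB = 10**9
TB = 10**12
PB = 10**15


def daily_bytes(dau: int, items_per_user: float, bytes_per_item: int,
                fraction: float = 1.0) -> float:
    """Bytes written per day; `fraction` is the share of items carrying this payload."""
    return dau * items_per_user * fraction * bytes_per_item


# Tweet text plus metadata, rounded to 500 bytes as in the box.
text_daily = daily_bytes(100_000_000, 2, 500)
# Assumed 2 MB image attached to 10% of tweets.
media_daily = daily_bytes(100_000_000, 2, 2_000_000, fraction=0.1)

if __name__ == "__main__":
    print(f"text:  {text_daily / GB:.0f} GB/day, {text_daily * 365 / TB:.1f} TB/year")
    print(f"media: {media_daily / TB:.0f} TB/day, {media_daily * 365 / PB:.1f} PB/year")
    print(f"media is ~{media_daily / text_daily:.0f}x the text volume")
```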
Pattern 3: Bandwidth Estimation
┌─────────────────────────────────────────────────────────────────┐
│ Bandwidth Calculation │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Example: Video streaming service (Netflix-like) │
│ │
│ Given: │
│ • 200 million subscribers │
│ • 50% watch something daily (100M daily viewers) │
│ • Average 2 hours of video per viewer │
│ • Video bitrate: 5 Mbps (1080p average) │
│ │
│ Peak concurrent viewers (assume 10% of daily): │
│ = 100M × 0.1 = 10 million concurrent │
│ │
│ Peak bandwidth: │
│ = 10M viewers × 5 Mbps │
│ = 50 million Mbps │
│ = 50 Tbps (Terabits per second) │
│ │
│ Daily data transfer: │
│ = 100M viewers × 2 hours × 3600 sec × 5 Mbps │
│ = 100M × 7200 × 5 Mb │
│ = 3.6 × 10^12 Mb = 3.6 × 10^18 bits (3.6 Exabits) = 450 PB/day │
│ │
└─────────────────────────────────────────────────────────────────┘
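Unit conversions are where bandwidth estimates usually go wrong (megabits vs bytes, millions vs trillions), so it helps to write the conversion steps out. A sketch with illustrative function names, using decimal petabytes (10^15 bytes):

```python
def peak_bandwidth_tbps(concurrent_viewers: int, bitrate_mbps: float) -> float:
    """Peak egress in terabits per second (1 Tbps = 1,000,000 Mbps)."""
    return concurrent_viewers * bitrate_mbps / 1_000_000


def daily_transfer_pb(daily_viewers: int, hours_per_viewer: float,
                      bitrate_mbps: float) -> float:
    """Daily volume in decimal petabytes (10^15 bytes)."""
    seconds_watched = daily_viewers * hours_per_viewer * 3600
    bits = seconds_watched * bitrate_mbps * 1_000_000  # megabits -> bits
    return bits / 8 / 10**15                           # bits -> bytes -> PB


if __name__ == "__main__":
    print(f"peak:  {peak_bandwidth_tbps(10_000_000, 5):.0f} Tbps")
    print(f"daily: {daily_transfer_pb(100_000_000, 2, 5):.0f} PB")
```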
Pattern 4: Server Capacity
┌─────────────────────────────────────────────────────────────────┐
│ Server Capacity Estimation │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Rule of thumb for web servers: │
│ • 1 server can handle ~1000 concurrent connections │
│ • 1 server can handle ~500-1000 QPS for simple API │
│ • 1 server can handle ~100-200 QPS for complex operations │
│ │
│ Example: 50,000 QPS API service │
│ │
│ Servers needed: │
│ = 50,000 QPS / 500 QPS per server │
│ = 100 servers │
│ │
│ With 3x capacity buffer (for spikes + failures): │
│ = 100 × 3 = 300 servers │
│ │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Memory sizing: │
│ • 100K concurrent users, 10KB session data each │
│ • Memory = 100K × 10KB = 1GB │
│ • Per server (assume 10 servers): 100MB each │
│ │
└─────────────────────────────────────────────────────────────────┘
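Server counts should always be rounded up, and the buffer is easiest to reason about as an explicit multiplier. A sketch using the rule-of-thumb defaults from the box; the function names are illustrative, and the even spread of sessions across servers is an assumption.

```python
import math


def servers_needed(qps: float, qps_per_server: float = 500,
                   buffer: float = 3.0) -> int:
    """Servers required for a QPS target, rounded up, including a capacity buffer."""
    return math.ceil(qps / qps_per_server * buffer)


def session_memory_per_server_mb(concurrent_users: int, session_kb: float,
                                 server_count: int) -> float:
    """Session memory per server in MB, assuming sessions spread evenly."""
    return concurrent_users * session_kb / 1000 / server_count


if __name__ == "__main__":
    print(servers_needed(50_000, buffer=1))   # baseline without buffer
    print(servers_needed(50_000))             # with the 3x buffer
    print(session_memory_per_server_mb(100_000, 10, 10))
```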
Pattern 5: Cache Sizing (80/20 Rule)
┌─────────────────────────────────────────────────────────────────┐
│ Cache Sizing with 80/20 Rule │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Principle: 20% of data serves 80% of requests │
│ │
│ Example: E-commerce product catalog │
│ │
│ Given: │
│ • 10 million products │
│ • Average product data: 5 KB │
│ • Total catalog: 10M × 5KB = 50 GB │
│ │
│ Cache 20% of products: │
│ = 50 GB × 0.2 = 10 GB cache │
│ │
│ Expected cache hit rate: ~80% │
│ │
│ ───────────────────────────────────────────────────────────── │
│ │
│ Alternative: Time-based estimation │
│ │
│ • 1 million requests/day to product pages │
│ • 70% are repeat views (dedup = 300K unique) │
│ • Cache last 24 hours of views │
│ • Cache size = 300K × 5KB = 1.5 GB │
│ │
└─────────────────────────────────────────────────────────────────┘
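Both sizing approaches are short enough to compare side by side. A sketch with illustrative function names; the 20% hot fraction and the 70% repeat-view rate are the assumptions stated in the box, not measured values.

```python
def cache_bytes_80_20(item_count: int, bytes_per_item: int,
                      hot_fraction: float = 0.2) -> float:
    """Cache size if the hottest `hot_fraction` of items is kept in memory."""
    return item_count * bytes_per_item * hot_fraction


def cache_bytes_from_traffic(daily_requests: int, repeat_fraction: float,
                             bytes_per_item: int) -> float:
    """Cache size if every distinct item viewed in a day is kept in memory."""
    unique_items = daily_requests * (1 - repeat_fraction)
    return unique_items * bytes_per_item


if __name__ == "__main__":
    by_rule = cache_bytes_80_20(10_000_000, 5_000)
    by_traffic = cache_bytes_from_traffic(1_000_000, 0.7, 5_000)
    print(f"80/20 rule:    {by_rule / 1e9:.1f} GB")
    print(f"traffic-based: {by_traffic / 1e9:.1f} GB")
```

When the two estimates differ this much, the traffic-based figure is usually the better starting point, since it reflects what is actually requested rather than catalog size.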
Complete Example: URL Shortener
Let’s walk through a complete estimation for a URL shortener like bit.ly.
Requirements & Assumptions
┌─────────────────────────────────────────────────────────────────┐
│ URL Shortener Estimation │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Functional: │
│ • Shorten long URLs → short URLs │
│ • Redirect short URLs → original URLs │
│ • Analytics (optional) │
│ │
│ Non-functional: │
│ • 100 million new URLs per month │
│ • 10:1 read-to-write ratio │
│ • 5 year data retention │
│ │
└─────────────────────────────────────────────────────────────────┘
Traffic Estimation
# Write traffic
new_urls_per_month = 100_000_000
new_urls_per_second = new_urls_per_month / (30 * 24 * 3600)
# ≈ 100M / 2.5M = 40 URLs/second
# Read traffic (10:1 ratio)
redirects_per_second = 40 * 10  # = 400 QPS
# Peak traffic (3x average)
peak_write_qps = 40 * 3  # = 120 QPS
peak_read_qps = 400 * 3  # = 1,200 QPS
Storage Estimation
# URL data
original_url_size = 500  # bytes, average
short_url_size = 7  # bytes (7 base62 characters)
metadata_size = 100  # bytes (created_at, user_id, etc.)
total_per_url = 600  # bytes; 500 + 7 + 100 = 607, rounded
# Monthly storage
monthly_storage = 100_000_000 * total_per_url  # = 60 GB/month
# 5 year storage
total_storage = monthly_storage * 12 * 5  # = 3.6 TB
# With 2x replication
replicated_storage = total_storage * 2  # = 7.2 TB
Short URL Length
# How many URLs can we encode?
# Using base62 (a-z, A-Z, 0-9)
# 62^6 = 56 billion combinations
# 62^7 = 3.5 trillion combinations
# We need: 100M/month × 12 × 5 = 6 billion URLs
# 6 characters would already fit (56 billion > 6 billion) but leave under 10x headroom
# 7 characters gives ample headroom (3.5 trillion >> 6 billion)
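The minimum code length can be found with integer arithmetic, which avoids floating-point rounding in logarithms. A sketch; `base62_length` is an illustrative helper, and the 10x headroom factor mirrors the one used by the calculator classes later on this page.

```python
def base62_length(total_items: int, alphabet_size: int = 62) -> int:
    """Smallest n such that alphabet_size ** n >= total_items."""
    length, capacity = 1, alphabet_size
    while capacity < total_items:
        capacity *= alphabet_size
        length += 1
    return length


if __name__ == "__main__":
    total_urls = 100_000_000 * 12 * 5        # 6 billion over 5 years
    print(base62_length(total_urls))         # bare minimum, no headroom
    print(base62_length(total_urls * 10))    # with 10x headroom
```

Headroom matters because randomly generated codes start colliding long before the key space is full.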
Bandwidth Estimation
# Write bandwidth
write_bandwidth = 40 * 600  # bytes/s = 24 KB/s
# Read bandwidth
# Redirect is a small response (Location header)
read_bandwidth = 400 * 200  # bytes/s = 80 KB/s
# Minimal bandwidth, not a concern
Memory (Cache) Estimation
# Cache hot URLs for fast redirects
# 80/20 rule: 20% of URLs get 80% of traffic
daily_reads = 400 * 86400  # ≈ 34.5 million reads
# Assume 30% of daily reads hit distinct URLs
unique_daily = 10_000_000  # 34.5M * 0.3 ≈ 10 million URLs
# Cache size
cache_size = unique_daily * 600  # bytes = 6 GB
# Redis can easily handle this
Summary
┌─────────────────────────────────────────────────────────────────┐
│ URL Shortener - Final Numbers │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Traffic: │
│ • Write: 40 QPS (peak: 120 QPS) │
│ • Read: 400 QPS (peak: 1200 QPS) │
│ │
│ Storage: │
│ • 3.6 TB over 5 years │
│ • 7.2 TB with replication │
│ │
│ Cache: │
│ • 6 GB Redis cache │
│ │
│ Key Design Decisions: │
│ • 7 character short URLs (base62) │
│ • Read-heavy → cache aggressively │
│ • Single database instance is sufficient │
│ • 2-3 application servers for redundancy │
│ │
└─────────────────────────────────────────────────────────────────┘
Complete Example: Twitter Timeline
Requirements
┌─────────────────────────────────────────────────────────────────┐
│ Twitter Timeline Estimation │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Given: │
│ • 500 million total users │
│ • 200 million DAU │
│ • Average user follows 200 people │
│ • 10% of users post daily (20M tweets/day) │
│ • Average user checks timeline 5 times/day │
│ │
└─────────────────────────────────────────────────────────────────┘
Timeline Generation QPS
# Timeline reads
timeline_reads_per_day = 200_000_000 * 5  # = 1 billion/day
timeline_qps = timeline_reads_per_day / 86400  # ≈ 11,600 QPS
# Tweet writes
tweet_writes_per_day = 20_000_000
tweet_qps = tweet_writes_per_day / 86400  # ≈ 230 QPS
# Read:write ratio ≈ 50:1 (heavily read-oriented)
Fan-out Calculation
This is one of the most important estimation exercises in system design because it demonstrates how a single architectural choice (push vs pull) has dramatic implications that only become visible through the numbers. This is the exact problem Twitter (now X) faced, and their solution (hybrid fan-out) has become a canonical case study.
┌─────────────────────────────────────────────────────────────────┐
│ Fan-out on Write vs Read │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Fan-out on Write (push model): │
│ ──────────────────────────── │
│ When user posts → push to all followers' pre-computed │
│ timelines in cache │
│ • 230 QPS x 200 followers = 46,000 cache writes/sec │
│ • Timeline reads are fast: just fetch from cache │
│ │
│ Problem: Celebrity with 50M followers │
│ • 1 tweet = 50 million cache writes! │
│ • Takes minutes to propagate │
│ • Wastes resources if most followers never open the app │
│ │
│ Solution: Hybrid approach │
│ • Small accounts (< 10K followers): Fan-out on write │
│ • Celebrities (> 10K followers): Fan-out on read │
│ • At read time, merge pre-computed timeline with fresh │
│ queries for celebrity tweets (just a handful of users) │
│ │
│ Fan-out on Read (pull model): │
│ ─────────────────────────── │
│ When user opens app → query all followees for recent tweets │
│ • 11,600 QPS x 200 followees = 2.3M queries/sec │
│ • Too expensive at read time! │
│ │
│ KEY INSIGHT: The estimation proves the hybrid approach │
│ is necessary. Neither pure push nor pure pull works at │
│ Twitter scale. This is what interviewers want to see -- │
│ you did the math, found the problem, and adapted. │
│ │
└─────────────────────────────────────────────────────────────────┘
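The comparison in the box comes down to two multiplications and a threshold check. A sketch with illustrative names; treating an account with exactly 10K followers as "push" is an assumption, since the box leaves the boundary unspecified.

```python
from typing import Tuple


def fanout_load(post_qps: float, timeline_qps: float,
                avg_follow_count: int) -> Tuple[float, float]:
    """Return (push_cache_writes_per_sec, pull_queries_per_sec)."""
    push_writes = post_qps * avg_follow_count      # each post copied to followers
    pull_queries = timeline_qps * avg_follow_count  # each read queries followees
    return push_writes, pull_queries


def delivery_strategy(follower_count: int,
                      celebrity_threshold: int = 10_000) -> str:
    """Hybrid rule: push for ordinary accounts, pull for celebrity accounts."""
    return "pull" if follower_count > celebrity_threshold else "push"


if __name__ == "__main__":
    push, pull = fanout_load(230, 11_600, 200)
    print(f"push: {push:,.0f} cache writes/sec, pull: {pull:,.0f} queries/sec")
    print(delivery_strategy(200), delivery_strategy(50_000_000))
```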
Storage for Timelines
# Pre-computed timeline storage (fan-out on write)
# Store last 800 tweet IDs per user
timeline_size = 800 * 8  # bytes (8-byte tweet IDs) = 6.4 KB per user
total_timeline_storage = 500_000_000 * timeline_size  # = 3.2 TB
# Redis cluster with 3.2 TB RAM
# Or multiple Redis instances (10 × 320 GB)
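To turn the raw total into an instance count, divide by the usable memory per instance and round up. A sketch with illustrative names; it counts only the raw ID bytes and ignores per-key overhead in the cache, so treat the result as a lower bound.

```python
def timeline_cache_bytes(users: int, ids_per_timeline: int = 800,
                         bytes_per_id: int = 8) -> int:
    """Raw bytes needed to hold every user's pre-computed timeline."""
    return users * ids_per_timeline * bytes_per_id


def instances_needed(total_bytes: int, bytes_per_instance: int) -> int:
    """Number of cache instances, rounded up using integer arithmetic."""
    return (total_bytes + bytes_per_instance - 1) // bytes_per_instance


if __name__ == "__main__":
    total = timeline_cache_bytes(500_000_000)
    print(f"{total / 10**12:.1f} TB raw")
    print(instances_needed(total, 320 * 10**9), "instances of 320 GB")
```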
Estimation Cheat Sheet
┌─────────────────────────────────────────────────────────────────┐
│ Quick Reference Formulas │
├─────────────────────────────────────────────────────────────────┤
│ │
│ DAU to QPS: │
│ QPS = DAU × requests_per_user / 100,000 │
│ │
│ Peak QPS: │
│ Peak = Average × 2.5 (or ×3 for social) │
│ │
│ Storage: │
│ Daily = DAU × actions × data_size │
│ Yearly = Daily × 365 │
│ │
│ Bandwidth: │
│ BW = QPS × response_size │
│ │
│ Servers: │
│ Count = QPS / QPS_per_server × 3 (buffer) │
│ │
│ Cache: │
│ Size = total_data_size × 0.2 (80/20 rule) │
│ │
│ URL length (base62): │
│ 62^n > total_items_expected │
│ │
└─────────────────────────────────────────────────────────────────┘
Interview Tips
- Show your work: Write down assumptions clearly. State “assuming 100K seconds in a day” before calculating.
- Round aggressively: Use powers of 10. 86,400 → 100,000 is fine.
- Sanity check: Does the answer make sense? 1 million GB is suspicious.
- Ask about scale: “Are we designing for 1M or 100M users?” This changes everything.
- Know your powers: 2^10 ≈ 1000, 2^20 ≈ 1M, 2^30 ≈ 1B, 2^40 ≈ 1T
Capacity Planning Calculator
Use these utility classes for quick estimations in interviews or actual capacity planning. A Python version comes first, followed by the JavaScript equivalent.
from dataclasses import dataclass
from enum import Enum
import math
class DataUnit(Enum):
    # Binary units (powers of 1024)
    BYTES = 1
    KB = 1024
    MB = 1024 ** 2
    GB = 1024 ** 3
    TB = 1024 ** 4
    PB = 1024 ** 5
class TimeUnit(Enum):
    # Durations in seconds (30-day month, 365-day year)
    SECOND = 1
    MINUTE = 60
    HOUR = 3600
    DAY = 86400
    WEEK = 604800
    MONTH = 2592000
    YEAR = 31536000
@dataclass
class SystemEstimate:
    """Complete system capacity estimation"""
    # Input parameters
    total_users: int
    dau_percentage: float = 0.2  # 20% DAU by default
    requests_per_user_per_day: int = 10
    write_to_read_ratio: float = 0.1  # fraction of requests that are writes (10%)
    data_per_record_bytes: int = 1000
    retention_years: int = 5
    # Computed values (filled by calculate())
    dau: int = 0
    daily_requests: int = 0
    average_qps: float = 0
    peak_qps: float = 0
    write_qps: float = 0
    read_qps: float = 0
    daily_storage_bytes: int = 0
    yearly_storage_bytes: int = 0
    total_storage_bytes: int = 0
    cache_size_bytes: int = 0
    estimated_servers: int = 0
    def calculate(self) -> 'SystemEstimate':
        """Calculate all derived metrics"""
        # User metrics
        self.dau = int(self.total_users * self.dau_percentage)
        # Traffic metrics
        self.daily_requests = self.dau * self.requests_per_user_per_day
        self.average_qps = self.daily_requests / TimeUnit.DAY.value
        self.peak_qps = self.average_qps * 3  # 3x for peak
        self.write_qps = self.average_qps * self.write_to_read_ratio
        self.read_qps = self.average_qps * (1 - self.write_to_read_ratio)
        # Storage metrics
        write_requests = self.daily_requests * self.write_to_read_ratio
        self.daily_storage_bytes = int(write_requests * self.data_per_record_bytes)
        self.yearly_storage_bytes = self.daily_storage_bytes * 365
        self.total_storage_bytes = self.yearly_storage_bytes * self.retention_years
        # Cache (20% of hot data - 80/20 rule)
        self.cache_size_bytes = int(self.total_storage_bytes * 0.2)
        # Server estimation (500 QPS per server, 3x buffer), rounded up
        self.estimated_servers = max(3, math.ceil((self.peak_qps / 500) * 3))
        return self
    def format_bytes(self, num_bytes: float) -> str:
        """Convert bytes to human-readable format (binary units)"""
        # Stop dividing at TB so that anything larger is reported in PB
        for unit in ['B', 'KB', 'MB', 'GB', 'TB']:
            if abs(num_bytes) < 1024:
                return f"{num_bytes:.1f} {unit}"
            num_bytes /= 1024
        return f"{num_bytes:.1f} PB"
    def summary(self) -> str:
        """Generate readable summary"""
        return f"""
╔══════════════════════════════════════════════════════════╗
║ SYSTEM CAPACITY ESTIMATE ║
╠══════════════════════════════════════════════════════════╣
║ TRAFFIC ║
║ Total Users: {self.total_users:,}
║ Daily Active Users: {self.dau:,}
║ Daily Requests: {self.daily_requests:,}
║ Average QPS: {self.average_qps:.1f}
║ Peak QPS: {self.peak_qps:.1f}
║ Write QPS: {self.write_qps:.1f}
║ Read QPS: {self.read_qps:.1f}
╠══════════════════════════════════════════════════════════╣
║ STORAGE ║
║ Daily Storage: {self.format_bytes(self.daily_storage_bytes)}
║ Yearly Storage: {self.format_bytes(self.yearly_storage_bytes)}
║ Total ({self.retention_years} years): {self.format_bytes(self.total_storage_bytes)}
║ Cache Size (20%): {self.format_bytes(self.cache_size_bytes)}
╠══════════════════════════════════════════════════════════╣
║ INFRASTRUCTURE ║
║ Estimated Servers: {self.estimated_servers}
╚══════════════════════════════════════════════════════════╝
"""
# ============== URL Shortener Calculator ==============
@dataclass
class URLShortenerEstimate:
    """Specialized calculator for URL shortener systems"""
    new_urls_per_month: int = 100_000_000
    read_write_ratio: int = 10
    retention_years: int = 5
    average_url_length: int = 500
    def calculate(self):
        # Traffic
        self.write_qps = self.new_urls_per_month / TimeUnit.MONTH.value
        self.read_qps = self.write_qps * self.read_write_ratio
        self.peak_write_qps = self.write_qps * 3
        self.peak_read_qps = self.read_qps * 3
        # Storage (bytes)
        self.record_size = self.average_url_length + 7 + 100  # URL + short + meta
        self.monthly_storage = self.new_urls_per_month * self.record_size
        self.yearly_storage = self.monthly_storage * 12
        self.total_storage = self.yearly_storage * self.retention_years
        # Short URL length calculation (10x headroom over the expected total)
        total_urls = self.new_urls_per_month * 12 * self.retention_years
        self.short_url_length = math.ceil(math.log(total_urls * 10, 62))
        # Cache (hot URLs)
        self.cache_size = int(self.total_storage * 0.05)  # 5% is hot
        return self
# ============== Video Streaming Calculator ==============
@dataclass
class VideoStreamingEstimate:
    """Calculator for video streaming services"""
    total_subscribers: int = 200_000_000
    daily_active_percentage: float = 0.5
    concurrent_percentage: float = 0.1
    hours_per_viewer: float = 2
    bitrate_mbps: float = 5  # 1080p average
    def calculate(self):
        self.daily_viewers = int(self.total_subscribers * self.daily_active_percentage)
        self.concurrent_viewers = int(self.daily_viewers * self.concurrent_percentage)
        # Bandwidth
        self.peak_bandwidth_mbps = self.concurrent_viewers * self.bitrate_mbps
        self.peak_bandwidth_tbps = self.peak_bandwidth_mbps / 1_000_000
        # Daily data transfer
        seconds_watched = self.daily_viewers * self.hours_per_viewer * 3600
        bits_transferred = seconds_watched * self.bitrate_mbps * 1_000_000
        # Decimal petabytes (10^15 bytes), matching the worked example's 450 PB/day
        self.daily_data_pb = bits_transferred / 8 / 10 ** 15
        # CDN edge servers needed (assuming 10Gbps per server)
        self.edge_servers = int(self.peak_bandwidth_mbps / 10000 * 2)  # 2x buffer
        return self
# ============== Social Media Calculator ==============
@dataclass
class SocialMediaEstimate:
    """Calculator for social media platforms (Twitter-like)"""
    total_users: int = 500_000_000
    dau: int = 200_000_000
    avg_following: int = 200
    posts_per_active_user_per_day: float = 0.1
    timeline_checks_per_day: int = 5
    def calculate(self):
        # Post metrics
        daily_posts = self.dau * self.posts_per_active_user_per_day
        self.post_qps = daily_posts / TimeUnit.DAY.value
        # Timeline read metrics
        timeline_reads = self.dau * self.timeline_checks_per_day
        self.timeline_qps = timeline_reads / TimeUnit.DAY.value
        # Fan-out analysis (average follower count equals average following count)
        self.fanout_writes_per_post = self.avg_following
        self.total_fanout_qps = self.post_qps * self.fanout_writes_per_post
        # Timeline storage (800 tweet IDs per user)
        timeline_size_bytes = 800 * 8  # 8 bytes per ID
        # Decimal terabytes (10^12 bytes), matching the worked example's 3.2 TB
        self.timeline_cache_tb = (self.total_users * timeline_size_bytes) / 10 ** 12
        return self
# ============== Usage Examples ==============
if __name__ == "__main__":
    # E-commerce platform
    ecommerce = SystemEstimate(
        total_users=50_000_000,
        dau_percentage=0.1,
        requests_per_user_per_day=20,
        write_to_read_ratio=0.05,
        data_per_record_bytes=2000,
        retention_years=3
    ).calculate()
    print(ecommerce.summary())
    # URL Shortener
    url_shortener = URLShortenerEstimate(
        new_urls_per_month=100_000_000,
        read_write_ratio=10
    ).calculate()
    print("URL Shortener:")
    print(f" Write QPS: {url_shortener.write_qps:.1f}")
    print(f" Read QPS: {url_shortener.read_qps:.1f}")
    print(f" Short URL Length: {url_shortener.short_url_length}")
    # Video Streaming
    streaming = VideoStreamingEstimate(
        total_subscribers=200_000_000
    ).calculate()
    print("\nVideo Streaming:")
    print(f" Peak Bandwidth: {streaming.peak_bandwidth_tbps:.1f} Tbps")
    print(f" Daily Data: {streaming.daily_data_pb:.1f} PB")
    print(f" Edge Servers Needed: {streaming.edge_servers}")
// ============== Data Units ==============
const DataUnit = {
  BYTES: 1,
  KB: 1024,
  MB: 1024 ** 2,
  GB: 1024 ** 3,
  TB: 1024 ** 4,
  PB: 1024 ** 5
};
const TimeUnit = {
  SECOND: 1,
  MINUTE: 60,
  HOUR: 3600,
  DAY: 86400,
  WEEK: 604800,
  MONTH: 2592000,
  YEAR: 31536000
};
// ============== System Capacity Calculator ==============
class SystemEstimate {
  constructor({
    totalUsers,
    dauPercentage = 0.2,
    requestsPerUserPerDay = 10,
    writeToReadRatio = 0.1, // fraction of requests that are writes
    dataPerRecordBytes = 1000,
    retentionYears = 5
  }) {
    this.totalUsers = totalUsers;
    this.dauPercentage = dauPercentage;
    this.requestsPerUserPerDay = requestsPerUserPerDay;
    this.writeToReadRatio = writeToReadRatio;
    this.dataPerRecordBytes = dataPerRecordBytes;
    this.retentionYears = retentionYears;
  }
  calculate() {
    // User metrics
    this.dau = Math.floor(this.totalUsers * this.dauPercentage);
    // Traffic metrics
    this.dailyRequests = this.dau * this.requestsPerUserPerDay;
    this.averageQps = this.dailyRequests / TimeUnit.DAY;
    this.peakQps = this.averageQps * 3;
    this.writeQps = this.averageQps * this.writeToReadRatio;
    this.readQps = this.averageQps * (1 - this.writeToReadRatio);
    // Storage metrics
    const writeRequests = this.dailyRequests * this.writeToReadRatio;
    this.dailyStorageBytes = Math.floor(writeRequests * this.dataPerRecordBytes);
    this.yearlyStorageBytes = this.dailyStorageBytes * 365;
    this.totalStorageBytes = this.yearlyStorageBytes * this.retentionYears;
    // Cache (20% hot data)
    this.cacheSizeBytes = Math.floor(this.totalStorageBytes * 0.2);
    // Server estimation (500 QPS per server, 3x buffer), rounded up
    this.estimatedServers = Math.max(3, Math.ceil((this.peakQps / 500) * 3));
    return this;
  }
  formatBytes(bytes) {
    const units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB'];
    let unitIndex = 0;
    while (Math.abs(bytes) >= 1024 && unitIndex < units.length - 1) {
      bytes /= 1024;
      unitIndex++;
    }
    return `${bytes.toFixed(1)} ${units[unitIndex]}`;
  }
  summary() {
    return `
╔══════════════════════════════════════════════════════════╗
║ SYSTEM CAPACITY ESTIMATE ║
╠══════════════════════════════════════════════════════════╣
║ TRAFFIC ║
║ Total Users: ${this.totalUsers.toLocaleString()}
║ Daily Active Users: ${this.dau.toLocaleString()}
║ Daily Requests: ${this.dailyRequests.toLocaleString()}
║ Average QPS: ${this.averageQps.toFixed(1)}
║ Peak QPS: ${this.peakQps.toFixed(1)}
║ Write QPS: ${this.writeQps.toFixed(1)}
║ Read QPS: ${this.readQps.toFixed(1)}
╠══════════════════════════════════════════════════════════╣
║ STORAGE ║
║ Daily Storage: ${this.formatBytes(this.dailyStorageBytes)}
║ Yearly Storage: ${this.formatBytes(this.yearlyStorageBytes)}
║ Total (${this.retentionYears} years): ${this.formatBytes(this.totalStorageBytes)}
║ Cache Size (20%): ${this.formatBytes(this.cacheSizeBytes)}
╠══════════════════════════════════════════════════════════╣
║ INFRASTRUCTURE ║
║ Estimated Servers: ${this.estimatedServers}
╚══════════════════════════════════════════════════════════╝
`;
  }
}
// ============== URL Shortener Calculator ==============
class URLShortenerEstimate {
  constructor({
    newUrlsPerMonth = 100_000_000,
    readWriteRatio = 10,
    retentionYears = 5,
    averageUrlLength = 500
  } = {}) {
    this.newUrlsPerMonth = newUrlsPerMonth;
    this.readWriteRatio = readWriteRatio;
    this.retentionYears = retentionYears;
    this.averageUrlLength = averageUrlLength;
  }
  calculate() {
    // Traffic
    this.writeQps = this.newUrlsPerMonth / TimeUnit.MONTH;
    this.readQps = this.writeQps * this.readWriteRatio;
    this.peakWriteQps = this.writeQps * 3;
    this.peakReadQps = this.readQps * 3;
    // Storage (bytes): URL + short code + metadata
    this.recordSize = this.averageUrlLength + 7 + 100;
    this.monthlyStorage = this.newUrlsPerMonth * this.recordSize;
    this.yearlyStorage = this.monthlyStorage * 12;
    this.totalStorage = this.yearlyStorage * this.retentionYears;
    // Short URL length (base62, 10x headroom over the expected total)
    const totalUrls = this.newUrlsPerMonth * 12 * this.retentionYears;
    this.shortUrlLength = Math.ceil(Math.log(totalUrls * 10) / Math.log(62));
    // Cache (5% of URLs are hot)
    this.cacheSize = Math.floor(this.totalStorage * 0.05);
    return this;
  }
}
// ============== Video Streaming Calculator ==============
class VideoStreamingEstimate {
  constructor({
    totalSubscribers = 200_000_000,
    dailyActivePercentage = 0.5,
    concurrentPercentage = 0.1,
    hoursPerViewer = 2,
    bitrateMbps = 5
  } = {}) {
    this.totalSubscribers = totalSubscribers;
    this.dailyActivePercentage = dailyActivePercentage;
    this.concurrentPercentage = concurrentPercentage;
    this.hoursPerViewer = hoursPerViewer;
    this.bitrateMbps = bitrateMbps;
  }
  calculate() {
    this.dailyViewers = Math.floor(
      this.totalSubscribers * this.dailyActivePercentage
    );
    this.concurrentViewers = Math.floor(
      this.dailyViewers * this.concurrentPercentage
    );
    // Bandwidth
    this.peakBandwidthMbps = this.concurrentViewers * this.bitrateMbps;
    this.peakBandwidthTbps = this.peakBandwidthMbps / 1_000_000;
    // Daily data transfer
    const secondsWatched = this.dailyViewers * this.hoursPerViewer * 3600;
    const bitsTransferred = secondsWatched * this.bitrateMbps * 1_000_000;
    // Decimal petabytes (10^15 bytes), matching the worked example's 450 PB/day
    this.dailyDataPb = bitsTransferred / 8 / 1e15;
    // Edge servers (10Gbps per server, 2x buffer)
    this.edgeServers = Math.floor(this.peakBandwidthMbps / 10000 * 2);
    return this;
  }
}
// ============== Social Media Calculator ==============
class SocialMediaEstimate {
  constructor({
    totalUsers = 500_000_000,
    dau = 200_000_000,
    avgFollowing = 200,
    postsPerActiveUserPerDay = 0.1,
    timelineChecksPerDay = 5
  } = {}) {
    this.totalUsers = totalUsers;
    this.dau = dau;
    this.avgFollowing = avgFollowing;
    this.postsPerActiveUserPerDay = postsPerActiveUserPerDay;
    this.timelineChecksPerDay = timelineChecksPerDay;
  }
  calculate() {
    // Post metrics
    const dailyPosts = this.dau * this.postsPerActiveUserPerDay;
    this.postQps = dailyPosts / TimeUnit.DAY;
    // Timeline reads
    const timelineReads = this.dau * this.timelineChecksPerDay;
    this.timelineQps = timelineReads / TimeUnit.DAY;
    // Fan-out (average follower count equals average following count)
    this.fanoutWritesPerPost = this.avgFollowing;
    this.totalFanoutQps = this.postQps * this.fanoutWritesPerPost;
    // Timeline storage
    const timelineSizeBytes = 800 * 8; // 800 IDs, 8 bytes each
    // Decimal terabytes (10^12 bytes), matching the worked example's 3.2 TB
    this.timelineCacheTb = (this.totalUsers * timelineSizeBytes) / 1e12;
    return this;
  }
}
// ============== Quick Estimation Functions ==============
const QuickEstimate = {
  dauToQps(dau, requestsPerUser) {
    return (dau * requestsPerUser) / TimeUnit.DAY;
  },
  peakQps(averageQps, multiplier = 3) {
    return averageQps * multiplier;
  },
  storagePerYear(dailyRecords, recordSizeBytes) {
    return dailyRecords * recordSizeBytes * 365;
  },
  serversNeeded(peakQps, qpsPerServer = 500, buffer = 3) {
    return Math.max(3, Math.ceil((peakQps / qpsPerServer) * buffer));
  },
  cacheSize(totalStorageBytes, hotDataPercentage = 0.2) {
    return Math.floor(totalStorageBytes * hotDataPercentage);
  },
  base62Length(totalItems) {
    return Math.ceil(Math.log(totalItems * 10) / Math.log(62));
  },
  bandwidthMbps(qps, responseSizeBytes) {
    return (qps * responseSizeBytes * 8) / 1_000_000;
  }
};
// ============== Usage Examples ==============
// E-commerce platform
const ecommerce = new SystemEstimate({
  totalUsers: 50_000_000,
  dauPercentage: 0.1,
  requestsPerUserPerDay: 20,
  writeToReadRatio: 0.05,
  dataPerRecordBytes: 2000,
  retentionYears: 3
}).calculate();
console.log(ecommerce.summary());
// URL Shortener
const urlShortener = new URLShortenerEstimate({
  newUrlsPerMonth: 100_000_000,
  readWriteRatio: 10
}).calculate();
console.log('URL Shortener:');
console.log(` Write QPS: ${urlShortener.writeQps.toFixed(1)}`);
console.log(` Read QPS: ${urlShortener.readQps.toFixed(1)}`);
console.log(` Short URL Length: ${urlShortener.shortUrlLength}`);
// Video Streaming
const streaming = new VideoStreamingEstimate({
  totalSubscribers: 200_000_000
}).calculate();
console.log('\nVideo Streaming:');
console.log(` Peak Bandwidth: ${streaming.peakBandwidthTbps.toFixed(1)} Tbps`);
console.log(` Daily Data: ${streaming.dailyDataPb.toFixed(1)} PB`);
console.log(` Edge Servers: ${streaming.edgeServers}`);
module.exports = {
  SystemEstimate,
  URLShortenerEstimate,
  VideoStreamingEstimate,
  SocialMediaEstimate,
  QuickEstimate,
  DataUnit,
  TimeUnit
};