
How to Use This Guide

This guide provides realistic mock interview scenarios with:
  • Typical interviewer questions and follow-ups
  • Expected discussion points at each stage
  • Common mistakes to avoid
  • Evaluation criteria
Interview Format: Most system design interviews are 45-60 minutes. Aim to spend:
  • 5 min: Requirements clarification
  • 5 min: High-level design
  • 25-30 min: Deep dive
  • 5-10 min: Wrap-up and questions

Mock Interview 1: Design TinyURL

Setup

Interviewer: "Let's design a URL shortening service like TinyURL 
or bit.ly. Users should be able to create short URLs and be 
redirected to the original URL."

Phase 1: Requirements (5 min)

Your questions should include:
✓ "What's the expected scale? How many URLs per day?"
  → "Let's say 100M new URLs per day, 10B redirects"

✓ "What's the expected URL length?"
  → "As short as possible, let's say 7 characters"

✓ "Should URLs expire?"
  → "Yes, default 1 year, customizable"

✓ "Do we need analytics?"
  → "Yes, basic click counts and geographic data"

✓ "Custom short URLs?"
  → "Nice to have, but focus on auto-generated first"

Phase 2: Capacity Estimation (3 min)

Show your math:

Write: 100M URLs/day
     = 100M / 86,400 ≈ 1,200 URLs/sec

Read: 10B redirects/day
    = 10B / 86,400 ≈ 115,000 redirects/sec
    → 100:1 read-heavy

Storage (5 years):
    100M × 365 × 5 = 182.5B URLs
    Each URL: ~500 bytes average
    Total: 182.5B × 500B = ~90 TB

Bandwidth:
    Write: 1,200 × 500B = 600 KB/s
    Read: 115,000 × 500B = 58 MB/s
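
A quick way to sanity-check this arithmetic is to script it. The snippet below (illustrative only) reproduces the numbers above:

SECONDS_PER_DAY = 86_400
new_urls_per_day = 100e6
redirects_per_day = 10e9
avg_url_size_bytes = 500

writes_per_sec = new_urls_per_day / SECONDS_PER_DAY        # ≈ 1,157 → round to ~1,200
reads_per_sec = redirects_per_day / SECONDS_PER_DAY        # ≈ 115,741 → ~115,000
urls_in_5_years = new_urls_per_day * 365 * 5               # 182.5B URLs
storage_tb = urls_in_5_years * avg_url_size_bytes / 1e12   # ≈ 91 TB
write_bw_kb_s = writes_per_sec * avg_url_size_bytes / 1e3  # ≈ 580 KB/s (~0.6 MB/s)
read_bw_mb_s = reads_per_sec * avg_url_size_bytes / 1e6    # ≈ 58 MB/s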

Phase 3: High-Level Design (5 min)

┌─────────────────────────────────────────────────────────┐
│                                                         │
│   ┌────────┐      ┌────────────┐      ┌────────────┐  │
│   │ Client │─────►│    API     │─────►│   Cache    │  │
│   └────────┘      │  Gateway   │      │  (Redis)   │  │
│                   └─────┬──────┘      └────────────┘  │
│                         │                    │         │
│           ┌─────────────┼─────────────┐     │         │
│           │             │             │     │         │
│      ┌────▼────┐  ┌────▼────┐  ┌────▼────┐ │         │
│      │  URL    │  │ Redirect│  │Analytics│  │         │
│      │ Creator │  │ Service │  │ Service │  │         │
│      └────┬────┘  └────┬────┘  └────┬────┘  │         │
│           │            │            │        │         │
│           └────────────┼────────────┘        │         │
│                        │                     │         │
│                  ┌─────▼─────┐               │         │
│                  │  Database │◄──────────────┘         │
│                  │  (Sharded)│                         │
│                  └───────────┘                         │
│                                                         │
└─────────────────────────────────────────────────────────┘

Phase 4: Deep Dive Questions

Q: “How do you generate the short URL?”
Approach 1: Base62 encoding of auto-increment ID
- IDs: 1, 2, 3, ...
- Encode: 62^7 = 3.5 trillion combinations
- Pros: Simple, no collisions
- Cons: Predictable, single point (ID generator)

Approach 2: Random generation + collision check
- Generate random 7 chars
- Check database
- Retry on collision
- Cons: Extra DB lookup on every write; collision probability grows as the keyspace fills

Approach 3: Pre-generated key service
- Background service generates unused keys
- Store in key pool
- URL service fetches from pool
- Pros: No collision check at write time
- Cons: Extra service, key management

My recommendation: Approach 3 for this scale
- Handles 1,200 writes/sec easily
- No collision overhead
- Keys distributed across instances
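
For reference, the Base62 step in Approach 1 is only a few lines. This is a minimal illustrative encoder, not a production key generator:

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(num, length=7):
    """Encode an auto-increment ID as a fixed-length Base62 string."""
    chars = []
    while num > 0:
        num, rem = divmod(num, 62)
        chars.append(ALPHABET[rem])
    # Pad to the fixed length (7 here) so every short URL looks the same.
    return "".join(reversed(chars)).rjust(length, ALPHABET[0])

# base62_encode(61) -> '000000Z'   (62^7 ≈ 3.5 trillion possible keys)
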
Q: “How do you handle the 115K reads/second?”
1. Caching Strategy:
   - Redis/Memcached cluster
   - LRU eviction
   - Cache size: ~20% of daily URLs (hot set) ≈ 20M entries
   - Expected hit rate: 90%+

2. Cache key: short_url
   Cache value: {long_url, created_at, expires_at}
   TTL: 24 hours (refresh on access)

3. Read path:
   cache_hit → redirect
   cache_miss → DB lookup → cache write → redirect
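
The read path above maps to a cache-aside lookup. In this sketch, redis_client and url_db are hypothetical stand-ins for the real Redis and database clients:

import time

CACHE_TTL_SECONDS = 24 * 3600  # 24 hours, refreshed on access

def resolve(short_url):
    """Return the long URL for a redirect, cache-aside style."""
    long_url = redis_client.get(short_url)
    if long_url is not None:
        redis_client.expire(short_url, CACHE_TTL_SECONDS)  # refresh TTL on access
        return long_url                                    # cache hit → redirect

    # Cache miss: fall back to the sharded database, then warm the cache.
    row = url_db.lookup(short_url)
    if row is None or row.expires_at < time.time():
        return None                                        # unknown or expired → 404/410
    redis_client.set(short_url, row.long_url, ex=CACHE_TTL_SECONDS)
    return row.long_url
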
Q: “How do you shard the database?”
Shard by short URL hash:
- Simple version: shard = hash(short_url) % num_shards
- Better: consistent hashing with virtual nodes, so adding or removing a shard only remaps ~1/N of keys (see the sketch below)
- Each shard handles ~1/N of traffic

Why not by user_id?
- Redirect requests don't have user context
- Need to lookup by short URL

Replication:
- 3 replicas per shard
- Async replication (eventual consistency OK)
- Promote a replica on primary failure
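
A consistent hash ring with virtual nodes fits in a short class. This is a sketch for intuition, not what a production partitioner looks like:

import bisect
import hashlib

class HashRing:
    """Consistent hashing with virtual nodes: adding or removing a shard
    only remaps ~1/N of the keys instead of reshuffling everything."""

    def __init__(self, shards, vnodes=100):
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def shard_for(self, short_url):
        idx = bisect.bisect(self._keys, self._hash(short_url)) % len(self._keys)
        return self._ring[idx][1]

# ring = HashRing(["shard-0", "shard-1", "shard-2"])
# ring.shard_for("abc1234")  -> e.g. "shard-1"
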
Q: “How do you handle expiration?”
Options:
1. TTL in database (native support)
2. Background cleanup job
3. Lazy deletion (check on read)

Hybrid approach:
- Check expiration on read (immediate)
- Background job for cleanup (storage)
- Tombstone records for analytics
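
The background half of the hybrid approach could look like the sketch below; url_db, analytics_store, and redis_client are hypothetical clients:

def cleanup_expired(batch_size=1000):
    """Background job: remove expired URLs in batches, keeping tombstones for analytics."""
    while True:
        rows = url_db.find_expired(limit=batch_size)
        if not rows:
            break
        for row in rows:
            analytics_store.write_tombstone(row.short_url, row.expires_at)  # keep click history
            redis_client.delete(row.short_url)                              # evict from cache
            url_db.delete(row.short_url)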

Evaluation Criteria

Aspect       | Junior               | Senior                | Staff
Requirements | Asks basic questions | Identifies edge cases | Challenges assumptions
Scale        | Basic math           | Accurate estimates    | Cost optimization
Design       | Working solution     | Handles failures      | Extensible architecture
Trade-offs   | Mentions one         | Compares options      | Quantifies impact

Mock Interview 2: Design Twitter Timeline

Setup

Interviewer: "Let's design the home timeline feature for Twitter.
Users should see tweets from people they follow, in chronological
order or ranked by relevance."

Requirements Clarification

✓ "How many users? Active users?"
  → "300M monthly active, 150M daily active"

✓ "Average follows per user?"
  → "Average 200, some celebrities with millions"

✓ "Tweets per user per day?"
  → "Average 2 tweets/day"

✓ "Timeline page size?"
  → "20 tweets per page"

✓ "Real-time updates needed?"
  → "Yes, new tweets should appear within seconds"

✓ "Chronological or ranked?"
  → "Support both, ranked is default"

Capacity Estimation

Users: 150M DAU
Follows: 200 average
Tweets: 2/day/user

Write load:
    150M × 2 = 300M tweets/day
    = 3,500 tweets/second

Timeline reads:
    150M users × 10 timeline refreshes/day
    = 1.5B timeline reads/day
    = 17,000 timeline reads/second

Fan-out consideration:
    If user has 1M followers:
    1 tweet → 1M timeline updates
    100 tweets from celebs/day → 100M updates
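
The same back-of-the-envelope check, scripted (illustrative only):

SECONDS_PER_DAY = 86_400
dau = 150e6

tweets_per_sec = dau * 2 / SECONDS_PER_DAY             # ≈ 3,472 → ~3,500 tweets/sec
timeline_reads_per_sec = dau * 10 / SECONDS_PER_DAY    # ≈ 17,361 → ~17,000 reads/sec
celebrity_fanout = 100 * 1_000_000                     # 100 celeb tweets/day × 1M followers = 100M updates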

High-Level Design

┌─────────────────────────────────────────────────────────────┐
│                Twitter Timeline Architecture                 │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Write Path (Tweet Creation):                               │
│  ┌──────┐    ┌─────────┐    ┌───────────┐                  │
│  │Client│───►│Tweet Svc│───►│Tweet Store│                  │
│  └──────┘    └────┬────┘    └───────────┘                  │
│                   │                                          │
│                   ▼                                          │
│            ┌──────────────┐                                  │
│            │ Fan-out Svc  │                                  │
│            └──────┬───────┘                                  │
│                   │                                          │
│     ┌─────────────┼─────────────┐                           │
│     │             │             │                            │
│     ▼             ▼             ▼                            │
│  ┌──────┐     ┌──────┐     ┌──────┐                        │
│  │User A│     │User B│     │User C│  Timeline Caches       │
│  │Cache │     │Cache │     │Cache │                        │
│  └──────┘     └──────┘     └──────┘                        │
│                                                              │
│  Read Path (Timeline Fetch):                                │
│  ┌──────┐    ┌───────────┐    ┌──────────┐                 │
│  │Client│───►│Timeline   │───►│ Cache or │                 │
│  └──────┘    │Service    │    │ Fan-in   │                 │
│              └───────────┘    └──────────┘                  │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Key Design Decisions

Q: “Fan-out on write vs fan-out on read?”
Fan-out on Write:
- When user tweets, push to all followers' timelines
- Pre-compute timelines
- Fast reads (just fetch from cache)
- Problem: Celebrities with 50M followers

Fan-out on Read:
- When user opens timeline, fetch from followees
- Compute on demand
- Slow for users following many people
- Avoids the write amplification problem for celebrity accounts

Hybrid Approach (Twitter's solution):
- Regular users: Fan-out on write
- Celebrities (>10K followers): Fan-out on read
- At read time: Merge cached timeline + celebrity tweets
- Best of both worlds
Q: “How do you rank the timeline?”
Ranking Signals:
├── Recency (newer = higher)
├── Engagement (likes, retweets, replies)
├── User affinity (interaction history)
├── Content type (media, links, text)
├── Author factors (verified, follower count)
└── Negative signals (muted words, reported)

Implementation:
1. Feature extraction at write time
2. Store features with tweet
3. ML model scores at read time
4. Sort by score
5. A/B test ranking changes
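
A toy stand-in for the read-time scorer: a real ranker is a trained ML model, and every weight and helper here (contains_muted_words, interaction_count) is made up for illustration:

def score_tweet(tweet, viewer, now):
    """Toy linear combination of the ranking signals listed above."""
    age_hours = (now - tweet.created_at) / 3600
    recency = 1.0 / (1.0 + age_hours)                    # newer → higher
    engagement = 0.5 * tweet.likes + 1.0 * tweet.retweets + 1.5 * tweet.replies
    affinity = viewer.interaction_count.get(tweet.author_id, 0)
    penalty = 10.0 if contains_muted_words(tweet, viewer) else 0.0
    return 2.0 * recency + 0.001 * engagement + 0.1 * affinity - penalty
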
Q: “How do you handle celebrities?”
Celebrity Detection:
- Follower count > 10,000 (configurable)
- Flag in user table

Write Path for Celebrities:
- Don't fan out at write time
- Tweet goes to celebrity tweet store only

Read Path Merge:
1. Fetch user's cached timeline (fan-out on write)
2. Fetch celebrity tweets (users they follow)
3. Merge and rank
4. Return top N

Cache Strategy:
- Cache merged timeline briefly (1 minute)
- Invalidate on new celebrity tweet
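
The read-path merge above can be sketched with a k-way merge. timeline_cache, follow_graph, and celebrity_store are placeholder clients, and every source is assumed to be stored newest-first:

import heapq
import itertools

def home_timeline(user_id, page_size=20):
    """Merge the precomputed (fan-out-on-write) timeline with celebrity tweets at read time."""
    cached = timeline_cache.get(user_id)                  # newest-first list of tweets
    celeb_streams = [celebrity_store.recent(c)            # each stream newest-first
                     for c in follow_graph.celebrities_followed_by(user_id)]

    # A k-way merge keeps the result newest-first without sorting the whole union;
    # the ranked view would re-score this page afterwards.
    merged = heapq.merge(cached, *celeb_streams,
                         key=lambda t: t.created_at, reverse=True)
    return list(itertools.islice(merged, page_size))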

Mock Interview 3: Design Uber

Setup

Interviewer: "Design the core ride-matching system for Uber.
Focus on matching riders with nearby drivers efficiently."

Requirements

✓ Scale: 1M active drivers, 10M rides/day
✓ Matching latency: < 30 seconds
✓ Geographic scope: Global, 600+ cities
✓ Driver location updates: Every 4 seconds
✓ ETA calculation needed

Key Challenges

1. Real-time location tracking at scale
2. Efficient nearby driver search
3. Supply-demand matching
4. ETA estimation
5. Handling peak hours

High-Level Design

┌─────────────────────────────────────────────────────────────┐
│                  Uber Matching System                        │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Driver Location Updates:                                   │
│  ┌──────────┐    ┌─────────┐    ┌───────────────┐          │
│  │ Driver   │───►│ Location│───►│   Geospatial  │          │
│  │   App    │    │ Service │    │     Index     │          │
│  └──────────┘    └─────────┘    │   (QuadTree)  │          │
│                                 └───────┬───────┘          │
│                                         │                   │
│  Ride Request:                          │                   │
│  ┌──────────┐    ┌─────────┐    ┌───────▼───────┐          │
│  │  Rider   │───►│ Ride    │───►│   Matching    │          │
│  │   App    │    │ Service │    │    Engine     │          │
│  └──────────┘    └─────────┘    └───────┬───────┘          │
│                                         │                   │
│                                  ┌──────▼──────┐           │
│                                  │   Dispatch  │           │
│                                  │    to       │           │
│                                  │   Driver    │           │
│                                  └─────────────┘           │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Deep Dive: Geospatial Indexing

Option 1: Geohash
- Encode lat/lng to string: "9q8yy"
- Prefix search for nearby: "9q8y*"
- Simple, works with regular indexes
- Edge case: points near a cell boundary can be close yet share no prefix

Option 2: QuadTree
- Recursively divide space into quadrants
- Dynamic depth based on density
- Efficient range queries
- Better for non-uniform distribution

Option 3: H3 (Uber's solution)
- Hexagonal hierarchical grid
- Multiple resolutions
- Consistent neighbor relationships
- Handles polar regions well

Implementation:
- H3 resolution 9: ~0.1 km² hexagons (≈175 m edge length)
- Store driver_id → h3_cell mapping
- For pickup at H3 cell X:
  - Find drivers in cell X
  - Expand to ring-1 neighbors if needed
  - Sort by ETA, filter by availability

Matching Algorithm

import h3  # assumes the h3-py v3 bindings

# driver_index, routing_service, calculate_match_score, and dispatch_ride are
# assumed to be injected service clients / helpers.

def match_rider_to_driver(rider_location, ride_type):
    """
    Find the best available driver for a rider request.
    """
    # 1. Get nearby drivers: the rider's H3 cell plus 3 rings of neighbors.
    #    (k_ring already includes the origin cell.)
    rider_cell = h3.geo_to_h3(
        rider_location.lat,
        rider_location.lng,
        resolution=9
    )
    search_cells = h3.k_ring(rider_cell, k=3)

    candidates = []
    for cell in search_cells:
        candidates.extend(driver_index.get_drivers(cell))

    # 2. Filter by availability and vehicle type
    available = [d for d in candidates
                 if d.status == 'available'
                 and d.vehicle_type == ride_type]
    if not available:
        return None  # no nearby driver; caller retries or widens the search

    # 3. Calculate ETA for each candidate via the routing service
    for driver in available:
        driver.eta = routing_service.get_eta(
            driver.location,
            rider_location
        )

    # 4. Score and rank candidates (ETA, rating, acceptance rate)
    scored = []
    for driver in available:
        score = calculate_match_score(
            driver,
            rider_location,
            eta=driver.eta,
            driver_rating=driver.rating,
            acceptance_rate=driver.acceptance_rate
        )
        scored.append((driver, score))

    scored.sort(key=lambda x: x[1], reverse=True)

    # 5. Dispatch to the top candidate
    best_driver = scored[0][0]
    dispatch_ride(best_driver, rider_location)

    return best_driver

Mock Interview 4: Design Netflix

Setup

Interviewer: "Design Netflix's video streaming service.
Focus on the playback experience and content delivery."

Key Components to Discuss

1. Content Ingestion & Processing
   - Transcoding pipeline
   - Multiple quality levels
   - DRM protection

2. Content Delivery
   - CDN architecture
   - Edge caching
   - Adaptive bitrate streaming

3. Recommendation System
   - Personalization
   - Content ranking
   - Thumbnails A/B testing

4. Playback Service
   - Session management
   - Resume playback
   - Multi-device sync

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                 Netflix Streaming Architecture               │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Content Preparation:                                       │
│  ┌────────┐    ┌──────────┐    ┌───────────────┐           │
│  │ Upload │───►│Transcoder│───►│ CDN Origin    │           │
│  │ (4K)   │    │ Pipeline │    │ (S3/GCS)      │           │
│  └────────┘    └──────────┘    └───────┬───────┘           │
│                                         │                    │
│  Playback:                              ▼                    │
│  ┌────────┐    ┌──────────┐    ┌───────────────┐           │
│  │ Client │◄──►│ Playback │◄──►│  CDN Edge     │           │
│  │  App   │    │ Service  │    │  (Netflix     │           │
│  └────────┘    └──────────┘    │   Open        │           │
│       │                        │   Connect)    │           │
│       │                        └───────────────┘           │
│       │                                                     │
│       ▼                                                     │
│  ┌────────────────────────────────────────────┐            │
│  │            Adaptive Streaming              │            │
│  │  Based on bandwidth, buffer, device        │            │
│  └────────────────────────────────────────────┘            │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Adaptive Bitrate Discussion Points

Quality Levels:
- 240p (0.3 Mbps) - Mobile on 3G
- 480p (1 Mbps) - Mobile on 4G  
- 720p (3 Mbps) - Tablet/Laptop
- 1080p (5 Mbps) - TV
- 4K HDR (15 Mbps) - Premium 4K TV

Switching Logic:
- Buffer-based: Switch based on buffer fullness
- Throughput-based: Switch based on measured bandwidth
- Hybrid: Consider both + prediction
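
A minimal hybrid switcher, assuming the bitrate ladder above; the thresholds and 80% safety factor are illustrative, not Netflix's actual values:

LADDER_MBPS = [0.3, 1.0, 3.0, 5.0, 15.0]   # 240p ... 4K, from the quality levels above

def pick_bitrate(throughput_mbps, buffer_seconds,
                 safety=0.8, low_buffer=10, high_buffer=30):
    """Hybrid ABR: measured throughput sets a ceiling, buffer level nudges the choice."""
    ceiling = throughput_mbps * safety
    affordable = [b for b in LADDER_MBPS if b <= ceiling] or [LADDER_MBPS[0]]
    idx = LADDER_MBPS.index(affordable[-1])

    if buffer_seconds < low_buffer and idx > 0:
        idx -= 1        # buffer nearly drained: step down to avoid rebuffering
    elif buffer_seconds > high_buffer and idx < len(LADDER_MBPS) - 1:
        idx += 1        # buffer comfortably full: probe one level higher
    return LADDER_MBPS[idx]

# pick_bitrate(throughput_mbps=6.0, buffer_seconds=35) -> 5.0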

Metrics to Track:
- Rebuffering ratio
- Time to first frame
- Average bitrate delivered
- Quality switches per session

Common Mistakes to Avoid

Red Flags in System Design Interviews:
  1. Jumping to solution without understanding requirements
  2. Single point of failure in design
  3. Ignoring scale implications
  4. Over-engineering for simple problems
  5. Not discussing trade-offs between options
  6. Forgetting about failures and edge cases
  7. Vague hand-waving instead of concrete solutions
  8. Not asking clarifying questions

Interview Checklist

Before the interview:
□ Review system design patterns
□ Practice estimation (powers of 2, QPS calculations)
□ Know CAP theorem and consistency models
□ Understand caching, databases, message queues

During the interview:
□ Clarify requirements (functional + non-functional)
□ Estimate scale and capacity
□ Draw high-level design
□ Identify and address bottlenecks
□ Discuss trade-offs
□ Consider failure scenarios
□ Mention monitoring and alerting

Communication:
□ Think out loud
□ Explain your reasoning
□ Acknowledge uncertainty
□ Ask for feedback
□ Be open to hints

Practice Problems

Week 1-2: Core Systems
  • URL Shortener
  • Rate Limiter
  • Key-Value Store
Week 3-4: Social Systems
  • Twitter Feed
  • Facebook News Feed
  • Instagram
Week 5-6: Infrastructure
  • Web Crawler
  • Notification System
  • Task Scheduler
Week 7-8: Complex Systems
  • Uber/Lyft
  • YouTube/Netflix
  • Google Search