How to Use This Guide
This guide provides realistic mock interview scenarios with:
- Typical interviewer questions and follow-ups
- Expected discussion points at each stage
- Common mistakes to avoid
- Evaluation criteria
Interview Format: Most system design interviews are 45-60 minutes. Aim to spend:
- 5 min: Requirements clarification
- 5 min: High-level design
- 25-30 min: Deep dive
- 5-10 min: Wrap-up and questions
Mock Interview 1: Design TinyURL
Setup
Interviewer: "Let's design a URL shortening service like TinyURL
or bit.ly. Users should be able to create short URLs and be
redirected to the original URL."
Phase 1: Requirements (5 min)
Your questions should include:
✓ "What's the expected scale? How many URLs per day?"
→ "Let's say 100M new URLs per day, 10B redirects"
✓ "What's the expected URL length?"
→ "As short as possible, let's say 7 characters"
✓ "Should URLs expire?"
→ "Yes, default 1 year, customizable"
✓ "Do we need analytics?"
→ "Yes, basic click counts and geographic data"
✓ "Custom short URLs?"
→ "Nice to have, but focus on auto-generated first"
Phase 2: Capacity Estimation (3 min)
Show your math:
Write: 100M URLs/day
= 100M / 86,400 ≈ 1,200 URLs/sec
Read: 10B redirects/day
= 10B / 86,400 ≈ 115,000 redirects/sec
→ 100:1 read-heavy
Storage (5 years):
100M × 365 × 5 = 182.5B URLs
Each URL: ~500 bytes average
Total: 182.5B × 500B = ~90 TB
Bandwidth:
Write: 1,200 × 500B = 600 KB/s
Read: 115,000 × 500B = 58 MB/s
Phase 3: High-Level Design (5 min)
  ┌────────┐      ┌─────────────┐      ┌────────────┐
  │ Client │─────►│ API Gateway │─────►│   Cache    │
  └────────┘      └──────┬──────┘      │  (Redis)   │
                         │             └──────┬─────┘
           ┌─────────────┼─────────────┐      │
           │             │             │      │
      ┌────▼────┐   ┌────▼────┐   ┌────▼────┐ │
      │   URL   │   │ Redirect│   │Analytics│ │
      │ Creator │   │ Service │   │ Service │ │
      └────┬────┘   └────┬────┘   └────┬────┘ │
           │             │             │      │
           └─────────────┼─────────────┘      │
                         │                    │
                  ┌──────▼──────┐             │
                  │  Database   │◄────────────┘
                  │  (Sharded)  │
                  └─────────────┘
Phase 4: Deep Dive Questions
Q: “How do you generate the short URL?”
Approach 1: Base62 encoding of auto-increment ID
- IDs: 1, 2, 3, ...
- Encode: 62^7 = 3.5 trillion combinations
- Pros: Simple, no collisions
- Cons: Predictable, single point (ID generator)
Approach 2: Random generation + collision check
- Generate random 7 chars
- Check database
- Retry on collision
- Cons: Extra DB call, collision probability increases
Approach 3: Pre-generated key service
- Background service generates unused keys
- Store in key pool
- URL service fetches from pool
- Pros: No collision check at write time
- Cons: Extra service, key management
My recommendation: Approach 3 for this scale
- Handle 1,200 writes/sec easily
- No collision overhead
- Keys distributed across instances
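To make Approach 1 concrete, here is a minimal Base62 encoder sketch in Python; the alphabet order and the 7-character zero padding are arbitrary choices for illustration.

import string

# Base62 alphabet: 0-9, a-z, A-Z (order is an arbitrary choice)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62

def encode_base62(n: int, length: int = 7) -> str:
    """Encode an auto-increment ID as a fixed-length Base62 short code."""
    if n == 0:
        return ALPHABET[0] * length
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars)).rjust(length, ALPHABET[0])

# Example: ID 125 encodes to '0000021'
print(encode_base62(125))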
1. Caching Strategy:
- Redis/Memcached cluster
- LRU eviction
- Cache size: ~20% of the ~100M daily active URLs = 20M entries
- Expected hit rate: 90%+
2. Cache key: short_url
Cache value: {long_url, created_at, expires_at}
TTL: 24 hours (refresh on access)
3. Read path:
cache_hit → redirect
cache_miss → DB lookup → cache write → redirect
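A minimal cache-aside sketch of that read path, assuming a redis-py client and a hypothetical db.fetch_one helper; the URL table layout is an assumption for illustration.

import time

CACHE_TTL_SECONDS = 24 * 3600  # refresh on access, per the strategy above

def resolve_short_url(short_url, redis_client, db):
    """Cache hit -> redirect; miss -> DB lookup -> cache write -> redirect."""
    cached = redis_client.get(short_url)
    if cached is not None:
        redis_client.expire(short_url, CACHE_TTL_SECONDS)  # sliding TTL on access
        return cached.decode()  # the long URL

    row = db.fetch_one(  # hypothetical DB helper
        "SELECT long_url, expires_at FROM urls WHERE short_url = %s",
        (short_url,),
    )
    if row is None or row["expires_at"] < time.time():  # expires_at as unix time (assumption)
        return None  # unknown or expired -> 404

    redis_client.set(short_url, row["long_url"], ex=CACHE_TTL_SECONDS)
    return row["long_url"]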
Shard by short URL hash:
- Consistent hashing with virtual nodes
- shard = hash(short_url) placed on the ring (a plain % num_shards reshuffles keys whenever shards are added)
- Each shard handles ~1/N of traffic
Why not by user_id?
- Redirect requests don't have user context
- Need to lookup by short URL
Replication:
- 3 replicas per shard
- Async replication (eventual consistency OK)
- Promote a replica on primary failure
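As a sketch of the routing step, a simple hash-mod variant with an assumed shard count; a production setup would place shards on the consistent-hash ring described above to avoid resharding.

import hashlib

NUM_SHARDS = 64  # assumed shard count for illustration

def shard_for(short_url: str, num_shards: int = NUM_SHARDS) -> int:
    """Route by hashing the short URL (not the user), so redirects need no user context."""
    digest = hashlib.md5(short_url.encode()).hexdigest()
    return int(digest, 16) % num_shards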
Options:
1. TTL in database (native support)
2. Background cleanup job
3. Lazy deletion (check on read)
Hybrid approach:
- Check expiration on read (immediate)
- Background job for cleanup (storage)
- Tombstone records for analytics
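A sketch of the lazy-deletion half of that hybrid; the record layout and field names are assumptions.

from datetime import datetime, timezone

def resolve_if_not_expired(record):
    """Check expiration on read: expired links return None immediately,
    while a background job reclaims the storage later."""
    if record is None:
        return None
    expires_at = record["expires_at"]  # assumed to be a UTC datetime
    if expires_at is not None and expires_at <= datetime.now(timezone.utc):
        return None  # serve 404/410; the cleanup job deletes the row later
    return record["long_url"]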
Evaluation Criteria
| Aspect | Junior | Senior | Staff |
|---|---|---|---|
| Requirements | Asks basic questions | Identifies edge cases | Challenges assumptions |
| Scale | Basic math | Accurate estimates | Cost optimization |
| Design | Working solution | Handles failures | Extensible architecture |
| Trade-offs | Mentions one | Compares options | Quantifies impact |
Mock Interview 2: Design Twitter Timeline
Setup
Interviewer: "Let's design the home timeline feature for Twitter.
Users should see tweets from people they follow, in chronological
order or ranked by relevance."
Requirements Clarification
✓ "How many users? Active users?"
→ "300M monthly active, 150M daily active"
✓ "Average follows per user?"
→ "Average 200, some celebrities with millions"
✓ "Tweets per user per day?"
→ "Average 2 tweets/day"
✓ "Timeline page size?"
→ "20 tweets per page"
✓ "Real-time updates needed?"
→ "Yes, new tweets should appear within seconds"
✓ "Chronological or ranked?"
→ "Support both, ranked is default"
Capacity Estimation
Users: 150M DAU
Follows: 200 average
Tweets: 2/day/user
Write load:
150M × 2 = 300M tweets/day
= 3,500 tweets/second
Timeline reads:
150M users × 10 timeline refreshes/day
= 1.5B timeline reads/day
= 17,000 timeline reads/second
Fan-out consideration:
If user has 1M followers:
1 tweet → 1M timeline updates
100 tweets from celebs/day → 100M updates
High-Level Design
Twitter Timeline Architecture

Write Path (Tweet Creation):
  ┌──────┐    ┌─────────┐    ┌───────────┐
  │Client│───►│Tweet Svc│───►│Tweet Store│
  └──────┘    └────┬────┘    └───────────┘
                   │
            ┌──────▼───────┐
            │ Fan-out Svc  │
            └──────┬───────┘
                   │
      ┌────────────┼────────────┐
      ▼            ▼            ▼
   ┌──────┐     ┌──────┐     ┌──────┐
   │User A│     │User B│     │User C│   Timeline Caches
   │Cache │     │Cache │     │Cache │
   └──────┘     └──────┘     └──────┘

Read Path (Timeline Fetch):
  ┌──────┐    ┌───────────┐    ┌──────────┐
  │Client│───►│ Timeline  │───►│ Cache or │
  └──────┘    │  Service  │    │  Fan-in  │
              └───────────┘    └──────────┘
Key Design Decisions
Q: “Fan-out on write vs fan-out on read?”
Fan-out on Write:
- When user tweets, push to all followers' timelines
- Pre-compute timelines
- Fast reads (just fetch from cache)
- Problem: Celebrities with 50M followers
Fan-out on Read:
- When user opens timeline, fetch from followees
- Compute on demand
- Slow for users following many people
- Works well for celebrities
Hybrid Approach (Twitter's solution):
- Regular users: Fan-out on write
- Celebrities (>10K followers): Fan-out on read
- At read time: Merge cached timeline + celebrity tweets
- Best of both worlds
Ranking Signals:
├── Recency (newer = higher)
├── Engagement (likes, retweets, replies)
├── User affinity (interaction history)
├── Content type (media, links, text)
├── Author factors (verified, follower count)
└── Negative signals (muted words, reported)
Implementation:
1. Feature extraction at write time
2. Store features with tweet
3. ML model scores at read time
4. Sort by score
5. A/B test ranking changes
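A toy scoring sketch for steps 3-4; a hand-tuned linear combination stands in for the ML model, and every weight and field name is made up for illustration.

import time

# Illustrative weights; a real system would learn these offline.
WEIGHTS = {"recency": 0.4, "engagement": 0.3, "affinity": 0.3}

def score_tweet(tweet, viewer, now=None):
    """Combine ranking signals into a single relevance score."""
    now = now or time.time()
    if tweet["author_id"] in viewer["muted"]:          # negative signal
        return 0.0
    age_hours = (now - tweet["created_at"]) / 3600
    recency = 1.0 / (1.0 + age_hours)                  # newer = higher
    engagement = min(1.0, (tweet["likes"] + 2 * tweet["retweets"]) / 1000)
    affinity = viewer["affinity"].get(tweet["author_id"], 0.0)  # 0..1 interaction history
    return (WEIGHTS["recency"] * recency
            + WEIGHTS["engagement"] * engagement
            + WEIGHTS["affinity"] * affinity)

def rank_timeline(tweets, viewer):
    """Step 4: sort candidate tweets by score, highest first."""
    return sorted(tweets, key=lambda t: score_tweet(t, viewer), reverse=True)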
Celebrity Detection:
- Follower count > 10,000 (configurable)
- Flag in user table
Write Path for Celebrities:
- Don't fan out at write time
- Tweet goes to celebrity tweet store only
Read Path Merge:
1. Fetch user's cached timeline (fan-out on write)
2. Fetch celebrity tweets (users they follow)
3. Merge and rank
4. Return top N
Cache Strategy:
- Cache merged timeline briefly (1 minute)
- Invalidate on new celebrity tweet
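A sketch of that read-path merge; timeline_cache, follow_graph, celebrity_store, and user_profiles are hypothetical stores, and rank_timeline is the toy ranker sketched earlier.

import heapq

def fetch_home_timeline(user_id, page_size=20):
    """Hybrid fan-out: merge the precomputed timeline with celebrity tweets at read time."""
    # 1. Precomputed timeline (fan-out on write) for regular followees
    cached = timeline_cache.get(user_id)  # list of tweets, assumed newest-first

    # 2. Recent tweets from followed celebrities (fan-out on read)
    celeb_tweets = []
    for celeb_id in follow_graph.celebrities_followed_by(user_id):
        celeb_tweets.extend(celebrity_store.recent_tweets(celeb_id, limit=page_size))
    celeb_tweets.sort(key=lambda t: t["created_at"], reverse=True)

    # 3. Merge the two newest-first streams, over-fetching before ranking
    merged = heapq.merge(cached, celeb_tweets,
                         key=lambda t: t["created_at"], reverse=True)
    candidates = list(merged)[:page_size * 5]

    # 4. Rank and return the top N
    viewer = user_profiles.get(user_id)
    return rank_timeline(candidates, viewer)[:page_size]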
Mock Interview 3: Design Uber
Setup
Interviewer: "Design the core ride-matching system for Uber.
Focus on matching riders with nearby drivers efficiently."
Requirements
✓ Scale: 1M active drivers, 10M rides/day
✓ Matching latency: < 30 seconds
✓ Geographic scope: Global, 600+ cities
✓ Driver location updates: Every 4 seconds
✓ ETA calculation needed
Key Challenges
1. Real-time location tracking at scale
2. Efficient nearby driver search
3. Supply-demand matching
4. ETA estimation
5. Handling peak hours
High-Level Design
Uber Matching System

Driver Location Updates:
  ┌──────────┐    ┌─────────┐    ┌───────────────┐
  │  Driver  │───►│ Location│───►│  Geospatial   │
  │   App    │    │ Service │    │     Index     │
  └──────────┘    └─────────┘    │  (QuadTree)   │
                                 └───────┬───────┘
                                         │
Ride Request:                            │
  ┌──────────┐    ┌─────────┐    ┌───────▼───────┐
  │  Rider   │───►│  Ride   │───►│   Matching    │
  │   App    │    │ Service │    │    Engine     │
  └──────────┘    └─────────┘    └───────┬───────┘
                                         │
                                  ┌──────▼──────┐
                                  │  Dispatch   │
                                  │  to Driver  │
                                  └─────────────┘
Deep Dive: Geospatial Indexing
Option 1: Geohash
- Encode lat/lng to string: "9q8yy"
- Prefix search for nearby: "9q8y*"
- Simple, works with regular indexes
- Edge case: nearby but different prefix
Option 2: QuadTree
- Recursively divide space into quadrants
- Dynamic depth based on density
- Efficient range queries
- Better for non-uniform distribution
Option 3: H3 (Uber's solution)
- Hexagonal hierarchical grid
- Multiple resolutions
- Consistent neighbor relationships
- Handles polar regions well
Implementation:
- H3 resolution 9: hexagons of roughly 0.1 km² (~170 m edge length)
- Store driver_id → h3_cell mapping
- For pickup at H3 cell X:
- Find drivers in cell X
- Expand to ring-1 neighbors if needed
- Sort by ETA, filter by availability
Matching Algorithm
def match_rider_to_driver(rider_location, ride_type):
    """Find the best driver for a rider request."""
    # 1. Get nearby drivers: the rider's H3 cell plus 3 rings of neighbors
    rider_cell = h3.geo_to_h3(
        rider_location.lat,
        rider_location.lng,
        resolution=9
    )
    search_cells = h3.k_ring(rider_cell, 3)  # includes rider_cell itself

    candidates = []
    for cell in search_cells:
        candidates.extend(driver_index.get_drivers(cell))

    # 2. Filter by availability and vehicle type
    available = [d for d in candidates
                 if d.status == 'available'
                 and d.vehicle_type == ride_type]
    if not available:
        return None  # no nearby driver; caller can widen the search

    # 3. Calculate ETA for each candidate
    for driver in available:
        driver.eta = routing_service.get_eta(
            driver.location,
            rider_location
        )

    # 4. Score and rank
    scored = []
    for driver in available:
        score = calculate_match_score(
            driver,
            rider_location,
            eta=driver.eta,
            driver_rating=driver.rating,
            acceptance_rate=driver.acceptance_rate
        )
        scored.append((driver, score))
    scored.sort(key=lambda x: x[1], reverse=True)

    # 5. Dispatch to the top candidate
    best_driver = scored[0][0]
    dispatch_ride(best_driver, rider_location)
    return best_driver
Mock Interview 4: Design Netflix
Setup
Interviewer: "Design Netflix's video streaming service.
Focus on the playback experience and content delivery."
Key Components to Discuss
1. Content Ingestion & Processing
- Transcoding pipeline
- Multiple quality levels
- DRM protection
2. Content Delivery
- CDN architecture
- Edge caching
- Adaptive bitrate streaming
3. Recommendation System
- Personalization
- Content ranking
- Thumbnails A/B testing
4. Playback Service
- Session management
- Resume playback
- Multi-device sync
Architecture Overview
Netflix Streaming Architecture

Content Preparation:
  ┌────────┐    ┌──────────┐    ┌───────────────┐
  │ Upload │───►│Transcoder│───►│  CDN Origin   │
  │  (4K)  │    │ Pipeline │    │   (S3/GCS)    │
  └────────┘    └──────────┘    └───────┬───────┘
                                        │
Playback:                               ▼
  ┌────────┐    ┌──────────┐    ┌───────────────┐
  │ Client │◄──►│ Playback │◄──►│   CDN Edge    │
  │  App   │    │ Service  │    │   (Netflix    │
  └────────┘    └────┬─────┘    │ Open Connect) │
                     │          └───────────────┘
                     ▼
  ┌────────────────────────────────────────────┐
  │             Adaptive Streaming             │
  │     Based on bandwidth, buffer, device     │
  └────────────────────────────────────────────┘
Adaptive Bitrate Discussion Points
Quality Levels:
- 240p (0.3 Mbps) - Mobile on 3G
- 480p (1 Mbps) - Mobile on 4G
- 720p (3 Mbps) - Tablet/Laptop
- 1080p (5 Mbps) - TV
- 4K HDR (15 Mbps) - Premium 4K TV
Switching Logic:
- Buffer-based: Switch based on buffer fullness
- Throughput-based: Switch based on measured bandwidth
- Hybrid: Consider both + prediction
Metrics to Track:
- Rebuffering ratio
- Time to first frame
- Average bitrate delivered
- Quality switches per session
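A simplified hybrid switching sketch; the bitrate ladder follows the quality levels above, while the 0.8 safety factor and buffer thresholds are assumed values.

# Bitrate ladder from the quality levels above (Mbps, label)
LADDER = [(0.3, "240p"), (1.0, "480p"), (3.0, "720p"), (5.0, "1080p"), (15.0, "4K HDR")]

def choose_bitrate(measured_mbps, buffer_seconds, current_mbps):
    """Hybrid ABR: throughput sets the ceiling, buffer level decides how aggressive to be."""
    safe_mbps = 0.8 * measured_mbps        # safety margin on the throughput estimate

    if buffer_seconds < 5:
        return LADDER[0][0]                # nearly rebuffering: drop to the lowest rung

    # Highest rung that fits under the safe throughput estimate
    candidate = LADDER[0][0]
    for mbps, _label in LADDER:
        if mbps <= safe_mbps:
            candidate = mbps

    if buffer_seconds < 15 and candidate > current_mbps:
        return current_mbps                # thin buffer: hold, only switch down for now
    return candidate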
Common Mistakes to Avoid
Red Flags in System Design Interviews:
- Jumping to solution without understanding requirements
- Single point of failure in design
- Ignoring scale implications
- Over-engineering for simple problems
- Not discussing trade-offs between options
- Forgetting about failures and edge cases
- Vague hand-waving instead of concrete solutions
- Not asking clarifying questions
Interview Checklist
Before the interview:
□ Review system design patterns
□ Practice estimation (powers of 2, QPS calculations)
□ Know CAP theorem and consistency models
□ Understand caching, databases, message queues
During the interview:
□ Clarify requirements (functional + non-functional)
□ Estimate scale and capacity
□ Draw high-level design
□ Identify and address bottlenecks
□ Discuss trade-offs
□ Consider failure scenarios
□ Mention monitoring and alerting
Communication:
□ Think out loud
□ Explain your reasoning
□ Acknowledge uncertainty
□ Ask for feedback
□ Be open to hints
Practice Problems
Week 1-2: Core Systems
- URL Shortener
- Rate Limiter
- Key-Value Store
- Twitter Feed
- Facebook News Feed
- Web Crawler
- Notification System
- Task Scheduler
- Uber/Lyft
- YouTube/Netflix
- Google Search