Problem Statement
Design a video streaming service like Netflix or YouTube that:
Allows users to upload and stream videos
Handles millions of concurrent viewers
Provides smooth playback globally
Supports different video qualities (adaptive streaming)
This is a hard interview problem. Focus on video processing pipeline, CDN architecture, and adaptive streaming. Don’t try to cover everything—pick 2-3 areas to go deep.
Step 1: Requirements
Functional Requirements
Core Features
Upload videos
Stream videos
Search videos
Recommendations
Playback Features
Adaptive bitrate streaming
Resume playback
Multiple device support
Subtitles/captions
Non-Functional Requirements
Low Latency: Video starts in < 2 seconds
High Availability: 99.99% uptime
Global Scale: Serve users worldwide
Storage: Handle petabytes of video content
Capacity Estimation
┌─────────────────────────────────────────────────────────────────┐
│ Netflix/YouTube Scale │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Users (Netflix-like): │
│ • 200 million subscribers │
│ • 100 million DAU (50%) │
│ • 10 million peak concurrent viewers │
│ │
│ Content: │
│ • 10,000 movies/shows (Netflix) │
│ • 500 million videos (YouTube) │
│ • Average video: 1 hour = 3 GB (1080p) │
│ │
│ Bandwidth (Peak): │
│ • 10M viewers × 5 Mbps (1080p) = 50 Tbps │
│ • That's 50 terabits per second! │
│ │
│ Storage: │
│ • 10,000 videos × 3 GB × 5 qualities = 150 TB (Netflix) │
│ • 500M videos × 100 MB average = 50 PB (YouTube) │
│ │
│ Uploads (YouTube): │
│ • 500 hours of video uploaded per minute │
│ • 500 × 60 = 30,000 hours/hour │
│ │
└─────────────────────────────────────────────────────────────────┘
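These numbers are just the stated assumptions multiplied out; a quick back-of-envelope check (values mirror the box above):

peak_viewers = 10_000_000
bitrate_1080p_mbps = 5
peak_tbps = peak_viewers * bitrate_1080p_mbps / 1_000_000   # Mbps -> Tbps
print(f"Peak egress: {peak_tbps:.0f} Tbps")                 # ~50 Tbps

netflix_catalog_gb = 10_000 * 3 * 5                         # videos x GB x qualities
print(f"Netflix-style catalog: {netflix_catalog_gb / 1_000:.0f} TB")   # ~150 TB

youtube_catalog_pb = 500_000_000 * 100 / 1_000_000_000      # videos x MB -> PB
print(f"YouTube-style catalog: {youtube_catalog_pb:.0f} PB")           # ~50 PB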
Step 2: High-Level Design
At a high level the system consists of clients (web, mobile, TV), an API gateway, an upload service backed by object storage (S3), a transcoding pipeline, a metadata database, search and recommendation services, and a CDN for delivery. The following steps go deep on the parts that matter most: upload and processing, adaptive streaming, and the CDN.
Step 3: Video Upload & Processing
Upload Flow
┌─────────────────────────────────────────────────────────────────┐
│ Video Upload Pipeline │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. Client 2. API 3. Upload │
│ │ │ │ │
│ │ Request upload URL │ │ │
│ ├───────────────────────►│ │ │
│ │ │ │ │
│ │◄──Pre-signed S3 URL────│ │ │
│ │ │ │ │
│ │ │ │ │
│ 2. Direct upload to S3 (chunked, resumable) │ │
│ │────────────────────────────────────────────► │ │
│ │ │ │ │
│ │ │ S3 Event │ │
│ │ │◄─────────────────────│ │
│ │ │ │ │
│ │ │ │ │
│ 3. Trigger transcoding pipeline │
│ │ │ │
│ │ ┌────▼────┐ │
│ │ │ Queue │ │
│ │ └────┬────┘ │
│ │ │ │
│ │ ┌────▼────┐ │
│ │ │Transcode│ │
│ │ │ Workers │ │
│ │ └────┬────┘ │
│ │ │ │
│ │ Multiple output formats: │
│ │ • 360p (500 Kbps) │
│ │ • 480p (1 Mbps) │
│ │ • 720p (2.5 Mbps) │
│ │ • 1080p (5 Mbps) │
│ │ • 4K (15 Mbps) │
│ │ │
│ 4. Store transcoded videos + metadata │
│ │ │
│ 5. Push to CDN edge locations │
│ │ │
│ 6. Notify user: "Video ready!" │
│ │
└─────────────────────────────────────────────────────────────────┘
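Step 1 of this pipeline (handing the client a pre-signed URL so the upload goes straight to object storage and never through the API servers) could look roughly like this with boto3; the bucket name and key layout are illustrative assumptions:

import uuid
import boto3

s3 = boto3.client("s3")

def create_upload_url(user_id: str, filename: str) -> dict:
    """Issue a short-lived pre-signed URL for a direct-to-S3 upload."""
    video_id = str(uuid.uuid4())
    key = f"uploads/{user_id}/{video_id}/{filename}"
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": "raw-video-uploads", "Key": key},
        ExpiresIn=3600,   # URL valid for one hour
    )
    return {"video_id": video_id, "upload_url": url}

For the chunked, resumable uploads shown in the diagram, the same idea extends to S3 multipart uploads: one pre-signed URL per part, so a failed part can be retried without restarting the whole file.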
Video Transcoding
from concurrent.futures import ThreadPoolExecutor

class TranscodingService:
    """
    Video transcoding pipeline.

    Key concepts:
    - Chunk-based processing for parallelism
    - Multiple output qualities
    - Generate thumbnails and preview
    """

    QUALITIES = [
        {"name": "360p",  "width": 640,  "height": 360,  "bitrate": "500k"},
        {"name": "480p",  "width": 854,  "height": 480,  "bitrate": "1000k"},
        {"name": "720p",  "width": 1280, "height": 720,  "bitrate": "2500k"},
        {"name": "1080p", "width": 1920, "height": 1080, "bitrate": "5000k"},
        {"name": "4K",    "width": 3840, "height": 2160, "bitrate": "15000k"},
    ]

    def process_video(self, video_id, source_path):
        # 1. Split video into ~10-second chunks for parallel processing
        chunks = self.split_into_chunks(source_path, chunk_duration=10)

        for quality in self.QUALITIES:
            # 2. Transcode all chunks in parallel for this quality
            with ThreadPoolExecutor() as pool:
                transcoded_chunks = list(
                    pool.map(lambda chunk: self.transcode_chunk(chunk, quality), chunks)
                )

            # 3. Merge chunks back together
            output_path = self.merge_chunks(transcoded_chunks, quality)

            # 4. Generate HLS/DASH segments for this quality
            self.generate_streaming_segments(output_path, video_id, quality)

        # 5. Generate thumbnails
        self.generate_thumbnails(source_path, video_id)

        # 6. Generate preview/trailer
        self.generate_preview(source_path, video_id)

        # 7. Update metadata and notify
        self.mark_video_ready(video_id)
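In practice, transcode_chunk and generate_streaming_segments usually shell out to FFmpeg. A minimal sketch of producing one HLS rendition (the function name and file layout are illustrative):

import subprocess

def transcode_to_hls(source_path: str, out_dir: str, quality: dict) -> None:
    """Encode one quality level and slice it into ~10-second HLS segments."""
    subprocess.run([
        "ffmpeg", "-i", source_path,
        "-vf", f"scale={quality['width']}:{quality['height']}",
        "-c:v", "libx264", "-b:v", quality["bitrate"],
        "-c:a", "aac", "-b:a", "128k",
        "-hls_time", "10",                      # segment duration (seconds)
        "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/segment_%03d.ts",
        f"{out_dir}/playlist.m3u8",
    ], check=True)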
Step 4: Video Streaming (Adaptive Bitrate)
How Adaptive Streaming Works
┌─────────────────────────────────────────────────────────────────┐
│ Adaptive Bitrate Streaming │
├─────────────────────────────────────────────────────────────────┤
│ │
│ HLS (HTTP Live Streaming) / DASH │
│ │
│ Video is split into small segments (2-10 seconds each) │
│ Each segment available in multiple qualities │
│ │
│ manifest.m3u8 (Master Playlist) │
│ ├── 360p/playlist.m3u8 │
│ │ ├── segment_001.ts │
│ │ ├── segment_002.ts │
│ │ └── ... │
│ ├── 720p/playlist.m3u8 │
│ │ ├── segment_001.ts │
│ │ └── ... │
│ └── 1080p/playlist.m3u8 │
│ ├── segment_001.ts │
│ └── ... │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Player Logic │ │
│ │ │ │
│ │ 1. Download manifest │ │
│ │ 2. Start with lowest quality │ │
│ │ 3. Measure download speed │ │
│ │ 4. Adjust quality based on bandwidth: │ │
│ │ │ │
│ │ Time 0s 5s 10s 15s 20s │ │
│ │ │ │ │ │ │ │ │
│ │ 360p ████ │ │
│ │ 720p ████████████ │ │
│ │ 1080p ████████████ │ │
│ │ │ │
│ │ Bandwidth: 1Mbps → 3Mbps → 6Mbps │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
HLS Manifest Example
# Master playlist (manifest.m3u8)
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=500000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=854x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
# Quality playlist (1080p/playlist.m3u8)
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10.0,
segment_000.ts
#EXTINF:10.0,
segment_001.ts
#EXTINF:10.0,
segment_002.ts
...
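The player logic sketched above boils down to picking the highest rendition whose advertised BANDWIDTH fits under the measured throughput, with some headroom. A simplified selection function (the 0.8 safety factor is an illustrative assumption; real players also consider buffer level):

# Renditions as advertised in the master playlist (BANDWIDTH is bits/second)
RENDITIONS = [
    {"name": "360p",  "bandwidth":   500_000},
    {"name": "480p",  "bandwidth": 1_000_000},
    {"name": "720p",  "bandwidth": 2_500_000},
    {"name": "1080p", "bandwidth": 5_000_000},
]

def pick_rendition(measured_bps: float, safety_factor: float = 0.8) -> dict:
    """Choose the best quality that fits within ~80% of measured throughput,
    falling back to the lowest rendition on very slow connections."""
    budget = measured_bps * safety_factor
    fits = [r for r in RENDITIONS if r["bandwidth"] <= budget]
    return max(fits, key=lambda r: r["bandwidth"]) if fits else RENDITIONS[0]

print(pick_rendition(1_000_000)["name"])    # 360p
print(pick_rendition(6_500_000)["name"])    # 1080p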
Step 5: CDN Architecture
Multi-Tier CDN
┌─────────────────────────────────────────────────────────────────┐
│ CDN Architecture │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────┐ │
│ │ Origin │ │
│ │ (S3/Cloud) │ │
│ └───────┬───────┘ │
│ │ │
│ ┌─────────────────────┼─────────────────────┐ │
│ │ │ │ │
│ ┌────▼────┐ ┌────▼────┐ ┌────▼────┐ │
│ │ Shield │ │ Shield │ │ Shield │ │
│ │ (US) │ │ (EU) │ │ (Asia) │ │
│ └────┬────┘ └────┬────┘ └────┬────┘ │
│ │ │ │ │
│ ┌────┴────────────────┐ │ ┌──────────────┴────┐ │
│ │ │ │ │ │ │
│ ┌─▼──┐ ┌───┐ ┌───┐ ┌──▼─┐ ┌▼──┐ ┌───┐ ┌───┐ ┌──▼─┐ │
│ │Edge│ │Edge│ │Edge│ │Edge│ │Edge│ │Edge│ │Edge│ │Edge│ │
│ │NYC │ │LA │ │CHI│ │LON│ │PAR│ │BER│ │TOK│ │SYD│ │
│ └─┬──┘ └─┬─┘ └─┬─┘ └─┬──┘ └─┬─┘ └─┬─┘ └─┬─┘ └─┬──┘ │
│ │ │ │ │ │ │ │ │ │
│ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ │
│ Users Users Users Users Users Users Users Users │
│ │
│ Cache Hierarchy: │
│ 1. Edge (closest to user) - most popular content │
│ 2. Shield (regional) - moderately popular │
│ 3. Origin - all content │
│ │
│ Netflix has 1000s of Open Connect Appliances in ISPs! │
│ │
└─────────────────────────────────────────────────────────────────┘
CDN Caching Strategy
| Content Type   | Cache Duration  | Location                 |
|----------------|-----------------|--------------------------|
| Popular movies | Days-Weeks      | Edge + Shield            |
| New releases   | Hours           | Shield, on-demand edge   |
| Long tail      | On-demand       | Origin, cached on access |
| Thumbnails     | Days            | All edges                |
| Manifest files | Seconds-Minutes | All edges                |
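At the origin or edge this policy usually becomes per-content-type Cache-Control rules. A hypothetical mapping whose TTL values simply mirror the table above:

# Hypothetical TTLs in seconds, keyed by content type
CACHE_TTL = {
    "popular_movie": 7 * 24 * 3600,    # days-weeks at edge + shield
    "new_release":   6 * 3600,         # hours at shield, edge on demand
    "long_tail":     24 * 3600,        # cached only after first request
    "thumbnail":     3 * 24 * 3600,    # days at all edges
    "manifest":      60,               # seconds-minutes so playlists stay fresh
}

def cache_control_header(content_type: str) -> str:
    """Build the Cache-Control header the CDN honors for this object."""
    return f"public, max-age={CACHE_TTL[content_type]}"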
Step 6: Data Models
Database Schema
-- Videos table
CREATE TABLE videos (
    id UUID PRIMARY KEY,
    user_id UUID NOT NULL,
    title VARCHAR(255) NOT NULL,
    description TEXT,
    duration INT,               -- seconds
    status VARCHAR(20),         -- processing, ready, failed
    view_count BIGINT DEFAULT 0,
    like_count BIGINT DEFAULT 0,
    created_at TIMESTAMP,
    published_at TIMESTAMP
);

-- Video files (multiple qualities)
CREATE TABLE video_files (
    id UUID PRIMARY KEY,
    video_id UUID NOT NULL,
    quality VARCHAR(10),        -- 360p, 720p, 1080p, 4K
    format VARCHAR(10),         -- hls, dash
    storage_path VARCHAR(500),
    file_size BIGINT,
    bitrate INT
);

-- Watch history (for resume & recommendations)
CREATE TABLE watch_history (
    user_id UUID,
    video_id UUID,
    progress INT,               -- seconds watched
    completed BOOLEAN,
    watched_at TIMESTAMP,
    PRIMARY KEY (user_id, video_id)
);

-- View counts (distributed counter)
-- Use Redis or a dedicated counter service
-- Periodically sync to the main database
Step 7: Recommendation System
┌─────────────────────────────────────────────────────────────────┐
│ Recommendation Architecture │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Two approaches: │
│ │
│ 1. Content-Based Filtering │
│ ───────────────────────── │
│ "You watched action movies → recommend more action" │
│ Based on: genre, actors, director, tags │
│ │
│ 2. Collaborative Filtering │
│ ──────────────────────── │
│ "Users like you also watched X" │
│ Based on: similar users' watch history │
│ │
│ Netflix uses both (Hybrid): │
│ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ Watch │ │ User │ │ Content │ │
│ │ History │───►│ Profile │───►│ Matching │ │
│ │ │ │ Vector │ │ │ │
│ └───────────────┘ └───────────────┘ └───────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────────────────────────────────────────────────┐│
│ │ ML Model (Matrix Factorization) ││
│ │ ││
│ │ User-Item Matrix → Latent Factors → Predictions ││
│ │ ││
│ └───────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────┐│
│ │ Personalized Recommendations ││
│ │ ││
│ │ "Top Picks for You" "Because you watched X" ││
│ │ "Trending Now" "Continue Watching" ││
│ │ ││
│ └───────────────────────────────────────────────────────────┘│
│ │
└─────────────────────────────────────────────────────────────────┘
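The matrix factorization step can be shown with a toy example: factor the sparse user-item matrix into a few latent factors, then use the low-rank reconstruction to score unwatched videos. A minimal sketch with plain SVD (the ratings and factor count are made up; production systems use implicit feedback and far larger models):

import numpy as np

# Toy user-item matrix (rows = users, cols = videos); 0 means unwatched
R = np.array([
    [5, 4, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

# Keep k latent factors, then reconstruct to predict the missing cells
k = 2
U, s, Vt = np.linalg.svd(R, full_matrices=False)
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Recommend the highest-predicted unwatched video for user 1
user = 1
unwatched = np.where(R[user] == 0)[0]
best = unwatched[np.argmax(R_hat[user, unwatched])]
print(f"Recommend video {best} (predicted score {R_hat[user, best]:.2f})")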
Reliability: Chaos Engineering
Netflix popularized chaos engineering: tools such as Chaos Monkey deliberately terminate instances and simulate regional failures in production, forcing every service to degrade gracefully, which matters when a single outage can interrupt millions of concurrent streams.
Key Design Decisions
| Decision    | Choice             | Reasoning                                    |
|-------------|--------------------|----------------------------------------------|
| Streaming   | HLS/DASH           | Industry standard, adaptive bitrate          |
| Storage     | S3 + CDN           | Cost-effective, globally distributed         |
| Transcoding | Parallel chunks    | Faster processing, scalable                  |
| CDN         | Multi-tier         | Popular content at edge, long tail at origin |
| Encoding    | Multiple qualities | Support all bandwidths                       |
| DB          | PostgreSQL + Redis | Metadata + counters/cache                    |
Common Interview Questions
How do you handle live streaming?
Different from VOD:
No pre-transcoding—transcode in real-time
Use RTMP for ingest, HLS/DASH for delivery
Shorter segments (2-4 seconds vs 10 seconds)
Edge servers closer to streamer
Latency budget is end-to-end: a glass-to-glass delay of 5-30 seconds is generally acceptable
How do you handle copyright/DRM?
Content ID system for upload scanning
DRM: Widevine (Chrome), FairPlay (Apple), PlayReady (Microsoft)
Encrypted video segments
License server for key delivery
Watermarking for tracking leaks
How does the view counter work at scale?
Don’t update DB on every view (too expensive)
Use Redis to batch view counts
Increment in Redis: INCR video:123:views
Background job syncs totals to the DB every minute (see the sketch after this list)
Display cached count (slight lag is OK)
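A rough sketch of that batched counter, following the INCR key pattern above (the sync job and SQL are illustrative):

import redis

r = redis.Redis()

def record_view(video_id: str) -> None:
    # Hot path: O(1) in-memory increment, no database write per view
    r.incr(f"video:{video_id}:views")

def flush_view_counts(db) -> None:
    """Background job (e.g. every minute): move Redis deltas into the DB."""
    for key in r.scan_iter(match="video:*:views"):
        delta = int(r.getset(key, 0) or 0)      # read and reset atomically
        if delta:
            video_id = key.decode().split(":")[1]
            db.execute(
                "UPDATE videos SET view_count = view_count + %s WHERE id = %s",
                (delta, video_id),
            )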
How do you reduce transcoding costs?
Don’t transcode to 4K if the source is only 720p (see the sketch after this list)
Two-pass encoding for better quality at same bitrate
Skip rarely watched qualities (8K)
Use hardware encoders (NVENC, QuickSync)
Spot instances for transcoding workers
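For the first point, the worker can probe the source resolution up front and drop any rendition taller than the source; a sketch using ffprobe (function names are illustrative):

import json
import subprocess

def source_height(path: str) -> int:
    """Read the source video's height with ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=height", "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)["streams"][0]["height"]

def applicable_qualities(path: str, qualities: list[dict]) -> list[dict]:
    """Skip renditions above the source, e.g. no 4K output for a 720p upload."""
    height = source_height(path)
    return [q for q in qualities if q["height"] <= height]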