Problem Statement
Design a music streaming service like Spotify that handles:
- Music catalog with millions of songs
- Real-time audio streaming
- Personalized playlists and recommendations
- Search and discovery
- Offline playback
- Social features (following, sharing)
Requirements
Functional Requirements
Core Features:
├── Music Playback
│   ├── Stream songs in real-time
│   ├── Queue management
│   ├── Shuffle and repeat
│   ├── Offline downloads
│   └── Cross-device sync
│
├── Content Discovery
│   ├── Search (songs, artists, albums, playlists)
│   ├── Browse by genre/mood
│   ├── Personalized recommendations
│   └── Radio stations
│
├── Library Management
│   ├── Save songs/albums/playlists
│   ├── Create playlists
│   ├── Recently played
│   └── Listening history
│
└── Social Features
    ├── Follow artists/users
    ├── Collaborative playlists
    └── Share music
Non-Functional Requirements
Scale:
├── 400M monthly active users
├── 180M premium subscribers
├── 80M songs in catalog
├── 4B playlists
└── 1M concurrent streams
Performance:
├── Playback start < 200ms
├── Search results < 100ms
├── No buffering during playback
└── Seamless song transitions
Availability:
├── 99.99% uptime for streaming
├── Graceful degradation
└── Offline playback support
Capacity Estimation
Storage:
├── Songs: 80M × 5MB (avg) = 400TB audio files
├── Multiple quality levels: 400TB × 4 = 1.6PB
├── Metadata: 80M × 10KB = 800GB
├── User data: 400M × 50KB = 20TB
└── Playlists: 4B × 1KB = 4TB
Bandwidth:
├── Concurrent streams: 1M
├── Bitrate: 160kbps (average)
├── Bandwidth: 1M × 160kbps = 160Gbps
│
├── Daily streams: 400M users × 2 hours/day × 15 songs/hour (≈4 min per song)
├── = 12B song plays per day (upper bound: assumes every monthly user listens daily)
└── ≈ 139,000 song plays per second
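The storage and bandwidth figures above can be re-derived with a few lines of arithmetic. This is only a back-of-envelope sketch: the constant names are illustrative, and decimal units (1 KB = 1,000 bytes) are an assumption chosen to match the round numbers in the estimate.

```python
# Back-of-envelope check of the storage and bandwidth estimates.
# Assumption: decimal units (1 KB = 10**3 bytes), as the round figures suggest.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

SONGS = 80_000_000
USERS = 400_000_000
PLAYLISTS = 4_000_000_000
CONCURRENT_STREAMS = 1_000_000
AVG_BITRATE_BPS = 160_000  # 160 kbps
QUALITY_LEVELS = 4


def storage_bytes() -> dict:
    """Return each storage line item in bytes."""
    audio_single = SONGS * 5 * MB  # 5 MB average per encoded song
    return {
        "audio_single_quality": audio_single,
        "audio_all_qualities": audio_single * QUALITY_LEVELS,
        "metadata": SONGS * 10 * KB,
        "user_data": USERS * 50 * KB,
        "playlists": PLAYLISTS * 1 * KB,
    }


def peak_egress_bps() -> int:
    """Aggregate bandwidth if every concurrent stream runs at the average bitrate."""
    return CONCURRENT_STREAMS * AVG_BITRATE_BPS


if __name__ == "__main__":
    for name, size in storage_bytes().items():
        print(f"{name:>22}: {size / TB:,.1f} TB")
    print(f"{'peak egress':>22}: {peak_egress_bps() / 10**9:,.0f} Gbps")
```

Keeping the inputs as named constants makes it easy to redo the estimate when an interviewer changes an assumption, for example the average file size or the number of quality levels.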
High-Level Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Spotify Architecture │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────────────────────────────┐ │
│ │ │ │ API Gateway │ │
│ │ Client │───►│ + Authentication + Rate Limit │ │
│ │ (Mobile/Web) │ └─────────────────┬────────────────────┘ │
│ │ │ │ │
│ └──────┬───────┘ │ │
│ │ │ │
│ │ Audio Stream │ API Calls │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────────────────────────────┐ │
│ │ CDN │ │ Microservices │ │
│ │ (Audio │ │ ┌─────────┐ ┌─────────┐ ┌────────┐ │ │
│ │ Delivery) │ │ │Playback │ │ Search │ │Playlist│ │ │
│ └──────────────┘ │ │ Service │ │ Service │ │Service │ │ │
│ │ └────┬────┘ └────┬────┘ └───┬────┘ │ │
│ │ │ │ │ │ │
│ │ ┌────▼───────────▼──────────▼────┐ │ │
│ │ │ Message Queue │ │ │
│ │ │ (Kafka) │ │ │
│ │ └────────────────┬───────────────┘ │ │
│ └───────────────────┼───────────────────┘ │
│ │ │
│ ┌───────────────────────────────────────┼───────────────────┐ │
│ │ Data Layer │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌───────▼────┐ │ │
│ │ │ Song │ │ User │ │Recommendation│ │ │
│ │ │ Catalog │ │ Data │ │ Engine │ │ │
│ │ │(Cassandra│ │(Postgres)│ │ (ML) │ │ │
│ │ └──────────┘ └──────────┘ └──────────────┘ │ │
│ └───────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Core Components
1. Audio Ingestion Pipeline
┌─────────────────────────────────────────────────────────────┐
│ Audio Ingestion Pipeline │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────┐ │
│ │ Original │ FLAC/WAV (lossless) │
│ │ Master │ │
│ └─────┬──────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Transcoding Pipeline │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Input → Normalize → Encode → Segment → Store │ │
│ │ │ │
│ │ Output formats: │ │
│ │ ├── OGG Vorbis 96kbps (mobile, low quality) │ │
│ │ ├── OGG Vorbis 160kbps (mobile, normal) │ │
│ │ ├── OGG Vorbis 320kbps (desktop, high) │ │
│ │ └── AAC 256kbps (web player) │ │
│ │ │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Storage │ │
│ │ ├── Audio files → Object Storage (S3) │ │
│ │ ├── Metadata → Cassandra │ │
│ │ ├── Waveforms → Redis (for visualization) │ │
│ │ └── Lyrics → Elasticsearch │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
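The "Segment → Store" step of the pipeline can be illustrated with a small planning function. This is a sketch under assumptions: the 10-second segment length and the `audio/{song_id}/{quality}/{index}.ogg` key layout follow the streaming service code in this guide, while `SegmentPlan`, `plan_segments`, and `storage_keys` are hypothetical names. The AAC variant is left out because it would need a different file extension.

```python
from dataclasses import dataclass
from typing import List

SEGMENT_DURATION_MS = 10_000      # assumed 10-second segments
QUALITIES = ("96", "160", "320")  # OGG Vorbis bitrates (kbps) listed above


@dataclass(frozen=True)
class SegmentPlan:
    index: int
    start_ms: int
    duration_ms: int


def plan_segments(duration_ms: int,
                  segment_ms: int = SEGMENT_DURATION_MS) -> List[SegmentPlan]:
    """Split a track into fixed-length segments; the last one may be shorter."""
    plans: List[SegmentPlan] = []
    start = 0
    while start < duration_ms:
        length = min(segment_ms, duration_ms - start)
        plans.append(SegmentPlan(len(plans), start, length))
        start += segment_ms
    return plans


def storage_keys(song_id: str, duration_ms: int) -> List[str]:
    """Object-storage keys for every (quality, segment) pair of one track."""
    return [
        f"audio/{song_id}/{quality}/{seg.index}.ogg"
        for quality in QUALITIES
        for seg in plan_segments(duration_ms)
    ]
```

For a 25-second track, `plan_segments(25_000)` yields three segments, the last one 5 seconds long, and `storage_keys` produces nine keys (three per quality level).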
2. Audio Streaming Service
Python:
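import uuid
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class SessionNotFoundError(Exception):
    """Raised when a session ID is unknown or has expired."""


def generate_session_id() -> str:
    """Return a random session identifier (minimal stand-in helper)."""
    return uuid.uuid4().hex


class AudioQuality(Enum):
    # Values are bitrates in kbps and double as the path component in CDN URLs.
    LOW = "96"
    NORMAL = "160"
    HIGH = "320"
    # Same value as HIGH, so Enum treats this as an alias of HIGH.
    # The ingestion pipeline above produces no lossless tier.
    VERY_HIGH = "320"


@dataclass
class AudioSegment:
    song_id: str
    segment_index: int
    start_time_ms: int
    duration_ms: int
    url: str


@dataclass
class PlaybackSession:
    session_id: str
    user_id: str
    device_id: str
    song_id: str
    quality: AudioQuality
    position_ms: int
    is_playing: bool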
class AudioStreamingService:
    """
    Handles audio streaming with adaptive bitrate.
    """
    SEGMENT_DURATION_MS = 10_000  # 10 seconds per segment
    PREFETCH_SEGMENTS = 3

    def __init__(
        self,
        audio_storage,
        cdn_service,
        session_store,
        analytics
    ):
        self.audio_storage = audio_storage
        self.cdn = cdn_service
        self.sessions = session_store
        self.analytics = analytics

    async def start_playback(
        self,
        user_id: str,
        device_id: str,
        song_id: str,
        quality: AudioQuality = AudioQuality.NORMAL,
        start_position_ms: int = 0
    ) -> dict:
        """
        Initialize a new playback session.
        Returns the session, the song metadata, and the first segment URLs.
        """
        # Create session
        session = PlaybackSession(
            session_id=generate_session_id(),
            user_id=user_id,
            device_id=device_id,
            song_id=song_id,
            quality=quality,
            position_ms=start_position_ms,
            is_playing=True
        )
        # Store session
        await self.sessions.save(session)
        # Get song metadata
        song = await self.audio_storage.get_song_metadata(song_id)
        # Generate CDN URLs for initial segments
        start_segment = start_position_ms // self.SEGMENT_DURATION_MS
        segments = await self._get_segment_urls(
            song_id,
            quality,
            start_segment,
            self.PREFETCH_SEGMENTS
        )
        # Track playback start
        await self.analytics.track_play_start(
            user_id=user_id,
            song_id=song_id,
            device_id=device_id,
            quality=quality.value
        )
        return {
            "session": session,
            "song": song,
            "segments": segments,
            # Ceiling division: an exact multiple of the segment length
            # must not produce an extra, empty segment
            "total_segments": -(-song["duration_ms"] // self.SEGMENT_DURATION_MS)
        }

    async def get_next_segments(
        self,
        session_id: str,
        current_segment: int
    ) -> List[AudioSegment]:
        """
        Get next segments for continuous playback.
        Called by the client as it approaches the end of buffered content.
        """
        session = await self.sessions.get(session_id)
        if not session:
            raise SessionNotFoundError(session_id)
        return await self._get_segment_urls(
            session.song_id,
            session.quality,
            current_segment + 1,
            self.PREFETCH_SEGMENTS
        )

    async def _get_segment_urls(
        self,
        song_id: str,
        quality: AudioQuality,
        start_segment: int,
        count: int
    ) -> List[AudioSegment]:
        """
        Generate signed CDN URLs for audio segments.
        """
        segments = []
        for i in range(count):
            segment_index = start_segment + i
            # Generate signed URL (expires in 1 hour)
            path = f"audio/{song_id}/{quality.value}/{segment_index}.ogg"
            signed_url = await self.cdn.get_signed_url(
                path,
                expires_in=3600
            )
            segments.append(AudioSegment(
                song_id=song_id,
                segment_index=segment_index,
                start_time_ms=segment_index * self.SEGMENT_DURATION_MS,
                duration_ms=self.SEGMENT_DURATION_MS,
                url=signed_url
            ))
        return segments

    async def update_position(
        self,
        session_id: str,
        position_ms: int
    ):
        """
        Update playback position for resume and analytics.
        """
        session = await self.sessions.get(session_id)
        if session:
            session.position_ms = position_ms
            await self.sessions.save(session)

    async def handle_quality_switch(
        self,
        session_id: str,
        new_quality: AudioQuality,
        current_segment: int
    ) -> List[AudioSegment]:
        """
        Handle adaptive bitrate switch.
        """
        session = await self.sessions.get(session_id)
        if not session:
            raise SessionNotFoundError(session_id)
        session.quality = new_quality
        await self.sessions.save(session)
        # Return new quality segments
        return await self._get_segment_urls(
            session.song_id,
            new_quality,
            current_segment,
            self.PREFETCH_SEGMENTS
        )

    async def transfer_playback(
        self,
        session_id: str,
        device_id: str
    ) -> PlaybackSession:
        """
        Move an existing session to another device.
        Minimal sketch: the method is called below but was not defined.
        """
        session = await self.sessions.get(session_id)
        if not session:
            raise SessionNotFoundError(session_id)
        session.device_id = device_id
        await self.sessions.save(session)
        return session

    async def sync_playback_state(
        self,
        user_id: str,
        device_id: str
    ) -> Optional[PlaybackSession]:
        """
        Get current playback state for cross-device sync.
        """
        # Find active session for user
        active_session = await self.sessions.get_active_for_user(user_id)
        if active_session and active_session.device_id != device_id:
            # Transfer playback to this device
            return await self.transfer_playback(
                active_session.session_id,
                device_id
            )
        return active_session
JavaScript:
const { randomUUID } = require('crypto');

// Thrown when a session ID is unknown or has expired.
class SessionNotFoundError extends Error {}

// Minimal stand-in helper for generating session IDs.
const generateSessionId = () => randomUUID();

const AudioQuality = {
  LOW: '96',
  NORMAL: '160',
  HIGH: '320',
  VERY_HIGH: '320' // same 320 kbps files as HIGH
};

class AudioStreamingService {
  static SEGMENT_DURATION_MS = 10_000;
  static PREFETCH_SEGMENTS = 3;

  constructor(audioStorage, cdnService, sessionStore, analytics) {
    this.audioStorage = audioStorage;
    this.cdn = cdnService;
    this.sessions = sessionStore;
    this.analytics = analytics;
  }

  async startPlayback({
    userId,
    deviceId,
    songId,
    quality = AudioQuality.NORMAL,
    startPositionMs = 0
  }) {
    const session = {
      sessionId: generateSessionId(),
      userId,
      deviceId,
      songId,
      quality,
      positionMs: startPositionMs,
      isPlaying: true
    };
    await this.sessions.save(session);
    const song = await this.audioStorage.getSongMetadata(songId);
    const startSegment = Math.floor(
      startPositionMs / AudioStreamingService.SEGMENT_DURATION_MS
    );
    const segments = await this.getSegmentUrls(
      songId,
      quality,
      startSegment,
      AudioStreamingService.PREFETCH_SEGMENTS
    );
    await this.analytics.trackPlayStart({
      userId,
      songId,
      deviceId,
      quality
    });
    return {
      session,
      song,
      segments,
      totalSegments: Math.ceil(
        song.durationMs / AudioStreamingService.SEGMENT_DURATION_MS
      )
    };
  }

  async getNextSegments(sessionId, currentSegment) {
    const session = await this.sessions.get(sessionId);
    if (!session) {
      throw new SessionNotFoundError(sessionId);
    }
    return this.getSegmentUrls(
      session.songId,
      session.quality,
      currentSegment + 1,
      AudioStreamingService.PREFETCH_SEGMENTS
    );
  }

  async getSegmentUrls(songId, quality, startSegment, count) {
    const segments = [];
    for (let i = 0; i < count; i++) {
      const segmentIndex = startSegment + i;
      // Signed URL expires in 1 hour
      const path = `audio/${songId}/${quality}/${segmentIndex}.ogg`;
      const signedUrl = await this.cdn.getSignedUrl(path, {
        expiresIn: 3600
      });
      segments.push({
        songId,
        segmentIndex,
        startTimeMs: segmentIndex * AudioStreamingService.SEGMENT_DURATION_MS,
        durationMs: AudioStreamingService.SEGMENT_DURATION_MS,
        url: signedUrl
      });
    }
    return segments;
  }

  async handleQualitySwitch(sessionId, newQuality, currentSegment) {
    const session = await this.sessions.get(sessionId);
    if (!session) {
      throw new SessionNotFoundError(sessionId);
    }
    session.quality = newQuality;
    await this.sessions.save(session);
    return this.getSegmentUrls(
      session.songId,
      newQuality,
      currentSegment,
      AudioStreamingService.PREFETCH_SEGMENTS
    );
  }
}

module.exports = { AudioStreamingService, AudioQuality };
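The service above exposes a quality-switch endpoint but leaves the decision of when to switch to the client. One common approach is a throughput rule with a buffer guard, sketched below. The function name, the 0.8 safety factor, and the 5-second low-buffer threshold are illustrative assumptions, not values taken from this design.

```python
from typing import Sequence

BITRATES_KBPS = (96, 160, 320)  # the OGG Vorbis ladder from the ingestion pipeline


def choose_bitrate(
    throughput_kbps: float,
    buffered_ms: int,
    bitrates: Sequence[int] = BITRATES_KBPS,
    safety_factor: float = 0.8,   # assumed headroom for throughput jitter
    low_buffer_ms: int = 5_000,   # assumed panic threshold
) -> int:
    """Pick the highest bitrate the measured throughput can sustain.

    If the playback buffer is nearly empty, drop to the lowest bitrate
    so the buffer can refill as quickly as possible.
    """
    ladder = sorted(bitrates)
    if buffered_ms < low_buffer_ms:
        return ladder[0]
    budget = throughput_kbps * safety_factor
    affordable = [b for b in ladder if b <= budget]
    return affordable[-1] if affordable else ladder[0]
```

A client would call this after each segment download, using the measured download speed, and invoke the quality-switch endpoint only when the result differs from the current quality.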
3. Recommendation Engine
┌─────────────────────────────────────────────────────────────┐
│ Recommendation Architecture │
├─────────────────────────────────────────────────────────────┤
│ │
│ Data Sources: │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ • Listening history (plays, skips, duration) │ │
│ │ • Saved library (likes, playlists) │ │
│ │ • Audio features (tempo, energy, danceability) │ │
│ │ • Social signals (follows, collaborative playlists) │ │
│ │ • Context (time, device, location) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ Recommendation Types: │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ 1. Collaborative Filtering │ │
│ │ "Users like you also listened to..." │ │
│ │ │ │
│ │ 2. Content-Based │ │
│ │ "Because you listened to [song]..." │ │
│ │ │ │
│ │ 3. Audio Analysis │ │
│ │ Similar tempo, key, energy level │ │
│ │ │ │
│ │ 4. Context-Aware │ │
│ │ Morning workout → upbeat songs │ │
│ │ Late night → chill music │ │
│ │ │ │
│ │ 5. Exploration vs Exploitation │ │
│ │ 80% familiar, 20% discovery │ │
│ │ │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ Daily Mix Generation: │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ 1. Cluster user's listening into ~6 taste groups │ │
│ │ 2. For each cluster: │ │
│ │ a. Select seed tracks (recently played) │ │
│ │ b. Find similar songs │ │
│ │ c. Add discovery songs (20%) │ │
│ │ d. Order for flow (tempo, energy arc) │ │
│ │ 3. Regenerate daily │ │
│ │ │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
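The "80% familiar, 20% discovery" split can be sketched as a simple blending step. The helper below is hypothetical: it assumes `familiar` is already ranked best-first and that discovery candidates are sampled uniformly, which is only one of several reasonable choices.

```python
import random
from typing import List, Optional, Sequence


def blend_familiar_and_discovery(
    familiar: Sequence[str],
    discovery: Sequence[str],
    size: int,
    discovery_ratio: float = 0.2,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Build a list of `size` track IDs with roughly `discovery_ratio` new tracks."""
    rng = rng or random.Random()
    # Cap each share by what is actually available
    n_discovery = min(len(discovery), round(size * discovery_ratio))
    n_familiar = min(len(familiar), size - n_discovery)
    picks = list(familiar[:n_familiar]) + rng.sample(list(discovery), n_discovery)
    # Shuffle so discovery tracks are not all bunched at the end
    rng.shuffle(picks)
    return picks
```

Passing a seeded `random.Random` makes the output reproducible, which helps when testing or when a playlist must stay stable until its next scheduled regeneration.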
Python:
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
import numpy as np
from sklearn.cluster import KMeans


@dataclass
class Song:
    id: str
    name: str
    artist_id: str
    audio_features: Dict[str, float]  # tempo, energy, danceability, etc.


@dataclass
class UserTaste:
    cluster_id: int
    seed_tracks: List[str]
    feature_centroid: Dict[str, float]


class RecommendationEngine:
    """
    Multi-strategy recommendation engine.
    """

    def __init__(
        self,
        song_vectors,       # Pre-computed song embeddings
        user_vectors,       # Pre-computed user embeddings
        audio_features,     # Song audio features
        listening_history,  # User listening data
        social_graph        # Follow relationships
    ):
        self.song_vectors = song_vectors
        self.user_vectors = user_vectors
        self.audio_features = audio_features
        self.history = listening_history
        self.social = social_graph

    async def get_personalized_recommendations(
        self,
        user_id: str,
        context: Optional[Dict] = None,
        limit: int = 50
    ) -> List[Song]:
        """
        Get personalized recommendations combining multiple signals.
        """
        # Get user's recent listening (a list of song IDs)
        recent = await self.history.get_recent_plays(user_id, days=30)
        # Get candidates from different sources
        candidates = []
        # 1. Collaborative filtering (40% weight)
        cf_candidates = await self._collaborative_filtering(user_id, limit=100)
        candidates.extend([(s, 0.4) for s in cf_candidates])
        # 2. Content-based from recent plays (30% weight)
        for song_id in recent[:10]:
            similar = await self._content_based_similar(song_id, limit=20)
            candidates.extend([(s, 0.3) for s in similar])
        # 3. Audio feature matching (20% weight)
        taste_profile = await self._get_audio_taste_profile(user_id)
        audio_similar = await self._audio_feature_matching(
            taste_profile,
            limit=50
        )
        candidates.extend([(s, 0.2) for s in audio_similar])
        # 4. Discovery/exploration (10% weight)
        discovery = await self._get_discovery_tracks(user_id, limit=20)
        candidates.extend([(s, 0.1) for s in discovery])
        # Aggregate scores: a song surfaced by several sources sums their weights
        song_scores = {}
        for song, weight in candidates:
            if song.id not in song_scores:
                song_scores[song.id] = {"song": song, "score": 0}
            song_scores[song.id]["score"] += weight
        # Apply context modifiers
        if context:
            song_scores = self._apply_context_boost(song_scores, context)
        # Filter already played
        played_ids = set(recent)
        final = [
            item for item in song_scores.values()
            if item["song"].id not in played_ids
        ]
        # Sort by score and return top N
        final.sort(key=lambda x: x["score"], reverse=True)
        return [item["song"] for item in final[:limit]]

    async def _collaborative_filtering(
        self,
        user_id: str,
        limit: int
    ) -> List[Song]:
        """
        Find songs that similar users played.
        Uses pre-computed user embeddings.
        """
        # Get user's embedding
        user_vector = await self.user_vectors.get(user_id)
        # Find similar users
        similar_users = await self.user_vectors.find_similar(
            user_vector,
            k=50
        )
        # Score songs this user hasn't heard by the summed similarity of the
        # neighbours who played them (this also removes duplicate candidates)
        user_songs = set(await self.history.get_all_plays(user_id))
        scores = {}
        for similar_user_id, similarity in similar_users:
            their_songs = await self.history.get_recent_plays(
                similar_user_id,
                days=30
            )
            for song_id in set(their_songs):
                if song_id not in user_songs:
                    scores[song_id] = scores.get(song_id, 0.0) + similarity
        # Keep the top-scoring songs, then fetch metadata only for those
        top_ids = sorted(scores, key=scores.get, reverse=True)[:limit]
        return [await self.audio_features.get_song(s_id) for s_id in top_ids]

    async def _content_based_similar(
        self,
        song_id: str,
        limit: int
    ) -> List[Song]:
        """
        Find songs similar to a given song.
        Uses pre-computed song embeddings.
        """
        song_vector = await self.song_vectors.get(song_id)
        similar = await self.song_vectors.find_similar(song_vector, k=limit)
        return [
            await self.audio_features.get_song(s_id)
            for s_id, _ in similar
        ]

    async def generate_daily_mix(
        self,
        user_id: str,
        mix_index: int  # 1-6
    ) -> List[Song]:
        """
        Generate personalized daily mix playlist.
        Each mix focuses on a different taste cluster.
        """
        # Get user's taste clusters
        clusters = await self._cluster_user_taste(user_id, n_clusters=6)
        # Reject out-of-range indexes (0 or negative would wrap around)
        if not 1 <= mix_index <= len(clusters):
            return []
        cluster = clusters[mix_index - 1]
        # Get seed tracks from this cluster
        seeds = cluster.seed_tracks[:5]
        playlist = []
        # 60% similar to seeds (5 seeds x 6 songs = 30 of 50)
        for seed_id in seeds:
            similar = await self._content_based_similar(seed_id, limit=6)
            playlist.extend(similar)
        # 20% matching audio features
        audio_matches = await self._audio_feature_matching(
            cluster.feature_centroid,
            limit=10
        )
        playlist.extend(audio_matches)
        # 20% discovery (new artists in similar space)
        discovery = await self._get_discovery_in_cluster(
            user_id,
            cluster,
            limit=10
        )
        playlist.extend(discovery)
        # Drop duplicates found by more than one source, keeping order
        playlist = list({song.id: song for song in playlist}.values())
        # Order for good listening flow
        playlist = self._order_for_flow(playlist)
        return playlist[:50]

    def _order_for_flow(self, songs: List[Song]) -> List[Song]:
        """
        Order songs for smooth listening experience.
        Greedy nearest-neighbour walk over tempo and energy; O(n^2),
        which is fine for playlist-sized inputs.
        """
        if len(songs) <= 1:
            return songs
        ordered = [songs[0]]
        remaining = songs[1:]
        while remaining:
            last = ordered[-1]
            # Find song with smallest "distance" in audio space
            best_next = min(
                remaining,
                key=lambda s: self._audio_distance(last, s)
            )
            ordered.append(best_next)
            remaining.remove(best_next)
        return ordered

    def _audio_distance(self, song1: Song, song2: Song) -> float:
        """
        Euclidean distance between songs in audio feature space.
        Assumes features are normalized to a common scale; raw tempo
        in BPM would otherwise dominate the 0-1 features.
        """
        features = ["tempo", "energy", "danceability", "valence"]
        distance = 0
        for f in features:
            diff = song1.audio_features.get(f, 0) - song2.audio_features.get(f, 0)
            distance += diff ** 2
        return distance ** 0.5
JavaScript:
class RecommendationEngine {
  constructor({
    songVectors,
    userVectors,
    audioFeatures,
    listeningHistory,
    socialGraph
  }) {
    this.songVectors = songVectors;
    this.userVectors = userVectors;
    this.audioFeatures = audioFeatures;
    this.history = listeningHistory;
    this.social = socialGraph;
  }

  async getPersonalizedRecommendations(userId, context = null, limit = 50) {
    const recent = await this.history.getRecentPlays(userId, { days: 30 });
    const candidates = [];
    // 1. Collaborative filtering (40%)
    const cfCandidates = await this.collaborativeFiltering(userId, 100);
    candidates.push(...cfCandidates.map(s => ({ song: s, weight: 0.4 })));
    // 2. Content-based (30%)
    for (const songId of recent.slice(0, 10)) {
      const similar = await this.contentBasedSimilar(songId, 20);
      candidates.push(...similar.map(s => ({ song: s, weight: 0.3 })));
    }
    // 3. Audio feature matching (20%)
    const tasteProfile = await this.getAudioTasteProfile(userId);
    const audioSimilar = await this.audioFeatureMatching(tasteProfile, 50);
    candidates.push(...audioSimilar.map(s => ({ song: s, weight: 0.2 })));
    // 4. Discovery (10%)
    const discovery = await this.getDiscoveryTracks(userId, 20);
    candidates.push(...discovery.map(s => ({ song: s, weight: 0.1 })));
    // Aggregate scores
    const songScores = new Map();
    for (const { song, weight } of candidates) {
      if (!songScores.has(song.id)) {
        songScores.set(song.id, { song, score: 0 });
      }
      songScores.get(song.id).score += weight;
    }
    // Apply context
    if (context) {
      this.applyContextBoost(songScores, context);
    }
    // Filter and sort
    const playedIds = new Set(recent);
    const final = [...songScores.values()]
      .filter(item => !playedIds.has(item.song.id))
      .sort((a, b) => b.score - a.score)
      .slice(0, limit);
    return final.map(item => item.song);
  }

  async generateDailyMix(userId, mixIndex) {
    const clusters = await this.clusterUserTaste(userId, 6);
    // Reject out-of-range indexes (mixIndex is 1-based)
    if (mixIndex < 1 || mixIndex > clusters.length) return [];
    const cluster = clusters[mixIndex - 1];
    const seeds = cluster.seedTracks.slice(0, 5);
    const playlist = [];
    // 60% similar to seeds (5 seeds x 6 songs = 30 of 50)
    for (const seedId of seeds) {
      const similar = await this.contentBasedSimilar(seedId, 6);
      playlist.push(...similar);
    }
    // 20% audio feature matches
    const audioMatches = await this.audioFeatureMatching(
      cluster.featureCentroid,
      10
    );
    playlist.push(...audioMatches);
    // 20% discovery
    const discovery = await this.getDiscoveryInCluster(
      userId,
      cluster,
      10
    );
    playlist.push(...discovery);
    // Drop duplicates found by more than one source
    const unique = [...new Map(playlist.map(s => [s.id, s])).values()];
    // Order for flow
    return this.orderForFlow(unique).slice(0, 50);
  }

  orderForFlow(songs) {
    if (songs.length <= 1) return songs;
    const ordered = [songs[0]];
    const remaining = [...songs.slice(1)];
    while (remaining.length > 0) {
      const last = ordered[ordered.length - 1];
      let bestIndex = 0;
      let bestDistance = Infinity;
      for (let i = 0; i < remaining.length; i++) {
        const distance = this.audioDistance(last, remaining[i]);
        if (distance < bestDistance) {
          bestDistance = distance;
          bestIndex = i;
        }
      }
      ordered.push(remaining[bestIndex]);
      remaining.splice(bestIndex, 1);
    }
    return ordered;
  }

  // Assumes features are normalized to a common scale; raw tempo in BPM
  // would otherwise dominate the 0-1 features.
  audioDistance(song1, song2) {
    const features = ['tempo', 'energy', 'danceability', 'valence'];
    let distance = 0;
    for (const f of features) {
      const diff = (song1.audioFeatures[f] || 0) -
        (song2.audioFeatures[f] || 0);
      distance += diff * diff;
    }
    return Math.sqrt(distance);
  }
}

module.exports = { RecommendationEngine };
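The engine above relies on a `find_similar` call against pre-computed embeddings without showing what it does. A brute-force version using cosine similarity might look like the sketch below. The in-memory dictionary and the function signature are assumptions for illustration; at catalog scale an approximate nearest-neighbour index would normally replace the linear scan.

```python
import heapq
import math
from typing import Collection, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    k: int,
    exclude: Collection[str] = (),
) -> List[Tuple[str, float]]:
    """Return the k most similar (item_id, similarity) pairs, best first."""
    scored = (
        (item_id, cosine_similarity(query, vec))
        for item_id, vec in vectors.items()
        if item_id not in exclude
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])
```

The `exclude` parameter is useful for leaving out the query song itself, which would otherwise always rank first with a similarity of 1.0.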
4. Offline Playback
┌─────────────────────────────────────────────────────────────┐
│ Offline Playback │
├─────────────────────────────────────────────────────────────┤
│ │
│ Download Flow: │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ 1. User requests download │ │
│ │ └── Check: Premium subscription? │ │
│ │ └── Check: Download limit not exceeded? │ │
│ │ │ │
│ │ 2. Queue download in background │ │
│ │ └── Download all quality levels? No, user's pref │ │
│ │ └── Download on WiFi only? User setting │ │
│ │ │ │
│ │ 3. Encrypt audio files │ │
│ │ └── Device-specific encryption key │ │
│ │ └── Key rotated monthly │ │
│ │ │ │
│ │ 4. Store in secure container │ │
│ │ └── Cannot be exported │ │
│ │ └── Playable only in app │ │
│ │ │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ Offline Verification: │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ • Must go online every 30 days │ │
│ │ • Verify subscription still active │ │
│ │ • Refresh encryption keys │ │
│ │ • If subscription lapses → downloads become │ │
│ │ unplayable │ │
│ │ │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
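The offline verification rules can be expressed as a small client-side check. This is a sketch under assumptions: `OfflineLicense` and `can_play_offline` are hypothetical names, and a real client would also need tamper-resistant storage and a trustworthy clock, which are out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_OFFLINE_PERIOD = timedelta(days=30)  # "must go online every 30 days"


@dataclass
class OfflineLicense:
    device_id: str               # downloads are bound to one device
    last_online_check: datetime  # last successful subscription verification
    subscription_active: bool    # subscription status as of that check


def can_play_offline(lic: OfflineLicense, device_id: str, now: datetime) -> bool:
    """Apply the offline rules: same device, active subscription, recent check-in."""
    if lic.device_id != device_id:
        return False
    if not lic.subscription_active:
        return False
    return now - lic.last_online_check <= MAX_OFFLINE_PERIOD


if __name__ == "__main__":
    lic = OfflineLicense(
        device_id="phone-1",
        last_online_check=datetime(2024, 1, 1, tzinfo=timezone.utc),
        subscription_active=True,
    )
    print(can_play_offline(lic, "phone-1", datetime(2024, 1, 20, tzinfo=timezone.utc)))
    print(can_play_offline(lic, "phone-1", datetime(2024, 2, 15, tzinfo=timezone.utc)))
```

Passing `now` explicitly, rather than reading the system clock inside the function, keeps the rule easy to test.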
Data Model
┌─────────────────────────────────────────────────────────────┐
│ Core Data Models │
├─────────────────────────────────────────────────────────────┤
│ │
│ Songs (Cassandra - high read throughput) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ song_id | uuid (partition key) │ │
│ │ name | text │ │
│ │ artist_id | uuid │ │
│ │ album_id | uuid │ │
│ │ duration_ms | int │ │
│ │ audio_urls | map<quality, url> │ │
│ │ audio_features | map<feature, float> │ │
│ │ release_date | timestamp │ │
│ │ popularity | int │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ User Library (Cassandra) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ user_id | uuid (partition key) │ │
│ │ item_type | text (song|album|playlist|artist) │ │
│ │ item_id | uuid │ │
│ │ saved_at | timestamp (clustering key DESC) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ Playlists (PostgreSQL - complex queries) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ playlist_id | uuid │ │
│ │ owner_id | uuid │ │
│ │ name | text │ │
│ │ description | text │ │
│ │ is_public | boolean │ │
│ │ followers | int │ │
│ │ songs | uuid[] (ordered) │ │
│ │ collaborative| boolean │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ Listening History (Kafka → Data Lake) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ user_id | uuid │ │
│ │ song_id | uuid │ │
│ │ played_at | timestamp │ │
│ │ duration_ms | int (how long they listened) │ │
│ │ context | text (playlist, album, radio) │ │
│ │ device_type | text │ │
│ │ skipped | boolean │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
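The listening history record feeds both recommendations and royalty accounting, so it helps to see how events might be turned into play counts. The sketch below mirrors the fields of the Listening History model. The 30-second threshold and the deduplication key are assumptions made for illustration, not rules stated in this design.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable

MIN_COUNTED_MS = 30_000  # assumed minimum listen time for a play to count


@dataclass(frozen=True)
class PlayEvent:
    user_id: str
    song_id: str
    played_at: str    # ISO 8601 timestamp
    duration_ms: int  # how long they listened
    context: str      # playlist, album, radio
    device_type: str
    skipped: bool


def countable_plays(events: Iterable[PlayEvent]) -> Counter:
    """Count plays per song, ignoring duplicate deliveries and short listens."""
    counts: Counter = Counter()
    seen = set()
    for event in events:
        # A message queue may deliver the same event more than once
        key = (event.user_id, event.song_id, event.played_at)
        if key in seen:
            continue
        seen.add(key)
        if event.duration_ms >= MIN_COUNTED_MS:
            counts[event.song_id] += 1
    return counts
```

In a real pipeline the deduplication set would live in the stream processor's state store rather than in memory, but the counting rule stays the same.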
Key Design Decisions
CDN Strategy for Audio
Multi-CDN Strategy:
├── Primary: Fastly (low latency)
├── Secondary: Akamai (backup, different regions)
├── Edge caching for popular songs
└── Origin shield to protect storage
Caching Policy:
├── Popular songs (top 10K): Cache everywhere
├── Medium popularity: Cache in regional PoPs
├── Long tail: Cache on demand, lower TTL
└── Pre-warm caches before major releases
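The caching policy above amounts to a mapping from a song's popularity rank to a cache tier. A minimal sketch follows. Only the top-10K cutoff comes from the policy; the regional cutoff and the TTL values are placeholder assumptions that a real deployment would tune from traffic data.

```python
from enum import Enum


class CacheTier(Enum):
    EDGE_EVERYWHERE = "edge_everywhere"  # popular songs: cache at every PoP
    REGIONAL = "regional"                # medium popularity: regional PoPs
    ON_DEMAND = "on_demand"              # long tail: cache on first request


TOP_TIER_RANK = 10_000          # from the policy: top 10K cached everywhere
REGIONAL_TIER_RANK = 1_000_000  # assumed cutoff for "medium popularity"

# Assumed TTLs; the policy only says the long tail gets a lower TTL
TTL_SECONDS = {
    CacheTier.EDGE_EVERYWHERE: 7 * 24 * 3600,
    CacheTier.REGIONAL: 24 * 3600,
    CacheTier.ON_DEMAND: 3600,
}


def cache_tier(popularity_rank: int) -> CacheTier:
    """Map a 1-based popularity rank (1 = most played) to a cache tier."""
    if popularity_rank < 1:
        raise ValueError("popularity_rank must be >= 1")
    if popularity_rank <= TOP_TIER_RANK:
        return CacheTier.EDGE_EVERYWHERE
    if popularity_rank <= REGIONAL_TIER_RANK:
        return CacheTier.REGIONAL
    return CacheTier.ON_DEMAND
```

Pre-warming before a major release then reduces to treating the new tracks as `EDGE_EVERYWHERE` regardless of their current rank.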
Handling 1M Concurrent Streams
Scaling Strategy:
├── Stateless streaming servers
├── Audio served directly from CDN
├── Session state in Redis cluster
├── API servers behind load balancer
│
├── Geographic distribution:
│   ├── US: 3 regions
│   ├── EU: 2 regions
│   ├── APAC: 2 regions
│   └── LATAM: 1 region
│
└── Auto-scaling based on:
    ├── Active sessions
    ├── Bandwidth utilization
    └── Time of day patterns
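Auto-scaling on active sessions can be reduced to a target-utilization rule. The sketch below is illustrative only: the per-replica capacity, the 70% target, and the replica bounds are made-up parameters, and a production policy would also factor in bandwidth and time-of-day forecasts.

```python
import math


def desired_replicas(
    active_sessions: int,
    sessions_per_replica: int = 5_000,  # assumed capacity of one API server
    target_utilization: float = 0.7,    # assumed headroom for traffic spikes
    min_replicas: int = 2,
    max_replicas: int = 500,
) -> int:
    """Replica count that keeps each server near the target utilization."""
    needed = math.ceil(active_sessions / (sessions_per_replica * target_utilization))
    # Clamp to the configured bounds
    return max(min_replicas, min(max_replicas, needed))
```

With these placeholder values, one million active sessions would call for 286 replicas, since each replica is planned to carry about 3,500 sessions.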
Interview Tips
Key Discussion Points:
- Audio delivery: How to ensure no buffering?
- Personalization: How to build recommendations?
- Offline mode: How to handle DRM?
- Cross-device sync: How to sync playback state?
- Royalty tracking: How to track plays accurately?