Problem Statement

Design a Twitter-like social media platform in which:
  • Users can post tweets (up to 280 characters)
  • Users can follow other users
  • Users see a timeline of tweets from the people they follow
  • Tweets support likes, retweets, and replies

Step 1: Requirements Clarification

Functional Requirements

Core Features

  • Post tweets (text, images, videos)
  • Follow/unfollow users
  • View home timeline (feed)
  • Like, retweet, reply
  • User profiles

Extended Features

  • Search tweets
  • Trending topics
  • Notifications
  • Direct messages

Non-Functional Requirements

  • Low Latency: Timeline loads in <200ms
  • High Availability: 99.99% uptime
  • Eventual Consistency: Acceptable for feeds
  • Scale: 500M users, 200M DAU

Capacity Estimation

┌─────────────────────────────────────────────────────────────────┐
│                 Twitter Scale Estimation                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Users:                                                         │
│  • 500 million total users                                     │
│  • 200 million DAU                                             │
│  • Average: 200 followers per user                             │
│  • Power users: 10M+ followers (celebrities)                   │
│                                                                 │
│  Tweets:                                                        │
│  • 10% of DAU tweet daily = 20M tweets/day                     │
│  • Tweet QPS = 20M / 86,400 ≈ 230 QPS                          │
│  • Peak: 230 × 3 ≈ 700 QPS                                     │
│                                                                 │
│  Timeline reads:                                                │
│  • 200M DAU × 5 views/day = 1B views/day                       │
│  • Timeline QPS = 1B / 86,400 ≈ 11,600 QPS                     │
│  • Peak: 11,600 × 3 ≈ 35,000 QPS                               │
│                                                                 │
│  Read:Write ratio = 11,600:230 ≈ 50:1                          │
│                                                                 │
│  Storage:                                                       │
│  • Tweet: 280 chars + metadata = 500 bytes                     │
│  • Daily: 20M × 500 = 10 GB                                    │
│  • Yearly: 10 GB × 365 ≈ 3.65 TB                               │
│  • Media: 10% tweets with 1MB image = 2M × 1MB = 2 TB/day     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
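
The arithmetic in the box above can be reproduced with a few lines of Python. This is a minimal sketch: the function name, parameter names, and the decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) are assumptions made for illustration, while the input figures come from the estimate itself.

```python
SECONDS_PER_DAY = 86_400


def estimate_capacity(dau=200_000_000, tweeting_percent=10, views_per_user=5,
                      tweet_bytes=500, media_percent=10, media_bytes=1_000_000):
    """Back-of-envelope numbers for the tweet and timeline workloads."""
    tweets_per_day = dau * tweeting_percent // 100
    timeline_views_per_day = dau * views_per_user
    media_tweets_per_day = tweets_per_day * media_percent // 100
    text_bytes_per_day = tweets_per_day * tweet_bytes
    return {
        "tweets_per_day": tweets_per_day,
        "write_qps": tweets_per_day / SECONDS_PER_DAY,
        "read_qps": timeline_views_per_day / SECONDS_PER_DAY,
        "read_write_ratio": timeline_views_per_day / tweets_per_day,
        "text_gb_per_day": text_bytes_per_day / 1e9,
        "text_tb_per_year": text_bytes_per_day * 365 / 1e12,
        "media_tb_per_day": media_tweets_per_day * media_bytes / 1e12,
    }


if __name__ == "__main__":
    for name, value in estimate_capacity().items():
        print(f"{name}: {value:,.2f}")
```

Changing one input (for example `views_per_user`) shows quickly how sensitive the read QPS and the read:write ratio are to each assumption.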

Step 2: High-Level Design

┌─────────────────────────────────────────────────────────────────┐
│                    Twitter Architecture                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│                       ┌────────────┐                           │
│                       │   Client   │                           │
│                       └─────┬──────┘                           │
│                             │                                   │
│                       ┌─────▼──────┐                           │
│                       │    CDN     │                           │
│                       └─────┬──────┘                           │
│                             │                                   │
│                       ┌─────▼──────┐                           │
│                       │ API Gateway│                           │
│                       └─────┬──────┘                           │
│                             │                                   │
│    ┌────────────────────────┼────────────────────────┐         │
│    │                        │                        │          │
│    ▼                        ▼                        ▼          │
│ ┌──────────┐         ┌──────────────┐         ┌──────────┐    │
│ │  Tweet   │         │   Timeline   │         │  User    │    │
│ │ Service  │         │   Service    │         │ Service  │    │
│ └────┬─────┘         └──────┬───────┘         └────┬─────┘    │
│      │                      │                      │           │
│      │                      │                      │           │
│ ┌────▼─────┐         ┌──────▼───────┐         ┌────▼─────┐    │
│ │Tweet DB  │         │Timeline Cache│         │ User DB  │    │
│ │(Cassandra)│         │   (Redis)    │         │(Postgres)│    │
│ └──────────┘         └──────────────┘         └──────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Core Services

Service               Responsibility
Tweet Service         Create, read, delete tweets
Timeline Service      Build and serve user timelines
User Service          User profiles, follow relationships
Fan-out Service       Distribute tweets to followers
Search Service        Full-text search on tweets
Notification Service  Push notifications

Step 3: The Timeline Problem

The core challenge is: How do we show a user the latest tweets from everyone they follow?

Approach 1: Fan-out on Read (Pull)

┌─────────────────────────────────────────────────────────────────┐
│                    Fan-out on Read                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  When user opens timeline:                                      │
│  ────────────────────────                                       │
│  1. Get list of followees (200 users)                          │
│  2. For each followee, get recent tweets                       │
│  3. Merge and sort by timestamp                                │
│  4. Return top N tweets                                         │
│                                                                 │
│  User                                                           │
│    │                                                            │
│    ▼                                                            │
│  "Get timeline"                                                │
│    │                                                            │
│    ▼                                                            │
│  ┌─────────────────────────────────────────┐                   │
│  │  SELECT tweets FROM tweet_table         │                   │
│  │  WHERE user_id IN (followee_ids)        │                   │
│  │  ORDER BY created_at DESC               │                   │
│  │  LIMIT 100                              │                   │
│  └─────────────────────────────────────────┘                   │
│                                                                 │
│  + Simple to implement                                         │
│  + No extra storage                                            │
│  - Slow: 200 queries per timeline request                     │
│  - 11,600 QPS × 200 = 2.3M queries/second!                    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
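
The four pull steps above can be sketched in memory as a k-way merge. This is an illustration only: `pull_timeline` and the `tweets_by_user` mapping are hypothetical names, and each followee's tweets are assumed to be stored newest first as `(created_at, tweet_id)` pairs.

```python
import heapq
from itertools import islice


def pull_timeline(followee_ids, tweets_by_user, limit=100):
    """Merge each followee's newest-first tweet list and keep the top `limit`.

    tweets_by_user maps user_id -> list of (created_at, tweet_id) tuples,
    already sorted by created_at descending.
    """
    per_user_lists = [tweets_by_user.get(uid, []) for uid in followee_ids]
    # heapq.merge with reverse=True expects every input sorted descending
    # and yields one descending stream without sorting everything again.
    merged = heapq.merge(*per_user_lists, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]


if __name__ == "__main__":
    tweets = {
        1: [(30, "c"), (10, "a")],
        2: [(20, "b")],
    }
    print(pull_timeline([1, 2], tweets, limit=2))
```

The merge itself is cheap; the cost highlighted in the box comes from having to read one list per followee on every request.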

Approach 2: Fan-out on Write (Push)

┌─────────────────────────────────────────────────────────────────┐
│                    Fan-out on Write                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  When user posts a tweet:                                       │
│  ────────────────────────                                       │
│  1. Save tweet to database                                     │
│  2. Get all followers (could be millions)                      │
│  3. Push tweet ID to each follower's timeline cache            │
│                                                                 │
│  Tweet Posted                                                   │
│    │                                                            │
│    ▼                                                            │
│  ┌─────────────┐                                               │
│  │ Save Tweet  │                                               │
│  └──────┬──────┘                                               │
│         │                                                       │
│    ┌────┴────┐                                                 │
│    ▼         ▼                                                 │
│  Fan-out to each follower's timeline cache:                    │
│                                                                 │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐          │
│  │Follower 1│ │Follower 2│ │Follower 3│ │    ...   │          │
│  │ Timeline │ │ Timeline │ │ Timeline │ │          │          │
│  │[tweet_id]│ │[tweet_id]│ │[tweet_id]│ │[tweet_id]│          │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘          │
│                                                                 │
│  + Fast reads: O(1) cache lookup                              │
│  + Timeline is pre-computed                                    │
│  - Celebrity problem: 50M followers = 50M writes              │
│  - Wasted writes for inactive users                           │
│  - More storage (timeline caches)                             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
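
A toy in-memory version of the push model makes the write amplification visible. The class and method names here are hypothetical, a bounded `deque` stands in for the per-follower timeline cache, and tweets are assumed to arrive in chronological order.

```python
from collections import defaultdict, deque


class InMemoryFanout:
    """Push each new tweet ID into every follower's bounded timeline."""

    def __init__(self, max_len=800):
        self.followers = defaultdict(set)  # author_id -> set of follower IDs
        # appendleft on a full deque silently drops the oldest entry on the right
        self.timelines = defaultdict(lambda: deque(maxlen=max_len))

    def follow(self, follower_id, followee_id):
        self.followers[followee_id].add(follower_id)

    def post(self, author_id, tweet_id):
        """Return the number of timeline writes this single tweet caused."""
        writes = 0
        for follower_id in self.followers[author_id]:
            self.timelines[follower_id].appendleft(tweet_id)
            writes += 1
        return writes

    def read(self, user_id, count=50):
        """Reading is a single lookup; no merging is needed."""
        return list(self.timelines[user_id])[:count]


if __name__ == "__main__":
    fanout = InMemoryFanout(max_len=2)
    fanout.follow(2, 1)
    fanout.follow(3, 1)
    print(fanout.post(1, "t1"))  # one write per follower
    fanout.post(1, "t2")
    fanout.post(1, "t3")
    print(fanout.read(2))
```

The value returned by `post` grows linearly with the follower count, which is the celebrity problem listed above.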

Approach 3: Hybrid (What Twitter Uses)

┌─────────────────────────────────────────────────────────────────┐
│                    Hybrid Approach                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Regular Users (≤ 10K followers):                              │
│  ──────────────────────────────                                 │
│  → Fan-out on Write                                            │
│  → Push to all followers' timelines                            │
│                                                                 │
│  Celebrities (> 10K followers):                                 │
│  ─────────────────────────────                                  │
│  → Fan-out on Read                                             │
│  → Fetch at read time and merge                                │
│                                                                 │
│  Timeline Generation:                                           │
│  ────────────────────                                           │
│  1. Read user's pre-computed timeline (regular users' tweets)  │
│  2. Get list of celebrity followees                            │
│  3. Fetch recent tweets from each celebrity                    │
│  4. Merge both lists, sort by time                             │
│  5. Return top N                                                │
│                                                                 │
│         User's Timeline                                         │
│              │                                                  │
│    ┌─────────┴─────────┐                                       │
│    │                   │                                        │
│    ▼                   ▼                                        │
│  ┌───────────┐   ┌───────────────┐                             │
│  │ Timeline  │   │  Celebrity    │                             │
│  │  Cache    │   │   Tweets      │                             │
│  │ (pushed)  │   │  (pulled)     │                             │
│  └─────┬─────┘   └───────┬───────┘                             │
│        │                 │                                      │
│        └────────┬────────┘                                      │
│                 │                                               │
│           ┌─────▼─────┐                                        │
│           │   Merge   │                                        │
│           │   & Sort  │                                        │
│           └───────────┘                                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Step 4: Detailed Component Design

Data Models

-- Users table
CREATE TABLE users (
    user_id         BIGINT PRIMARY KEY,
    username        VARCHAR(50) UNIQUE,
    email           VARCHAR(255),
    display_name    VARCHAR(100),
    bio             TEXT,
    profile_image   VARCHAR(500),
    followers_count BIGINT DEFAULT 0,
    following_count BIGINT DEFAULT 0,
    created_at      TIMESTAMP
);

-- Tweets table (Cassandra)
CREATE TABLE tweets (
    tweet_id        BIGINT,         -- Snowflake ID
    user_id         BIGINT,
    content         TEXT,
    media_urls      LIST<TEXT>,
    created_at      TIMESTAMP,
    likes_count     BIGINT,
    retweets_count  BIGINT,
    replies_count   BIGINT,
    PRIMARY KEY (user_id, tweet_id)
) WITH CLUSTERING ORDER BY (tweet_id DESC);

-- Follow relationships (Graph or Cassandra)
CREATE TABLE followers (
    user_id         BIGINT,
    follower_id     BIGINT,
    created_at      TIMESTAMP,
    PRIMARY KEY (user_id, follower_id)
);

CREATE TABLE following (
    user_id         BIGINT,
    following_id    BIGINT,
    created_at      TIMESTAMP,
    PRIMARY KEY (user_id, following_id)
);
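
The tweets table labels tweet_id as a Snowflake ID, and clustering by tweet_id DESC only returns newest-first results if those IDs increase with time. A minimal sketch of such a generator follows, assuming the commonly described layout of 41 timestamp bits, 10 worker bits, and 12 sequence bits. The epoch constant, class name, and injectable `clock` parameter are illustrative assumptions, not details from this design.

```python
import threading
import time

WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1
CUSTOM_EPOCH_MS = 1_600_000_000_000  # arbitrary illustrative epoch (assumption)


class SnowflakeGenerator:
    """Time-ordered 64-bit IDs: timestamp | worker_id | sequence."""

    def __init__(self, worker_id, epoch_ms=CUSTOM_EPOCH_MS, clock=None):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms
        self.clock = clock or (lambda: int(time.time() * 1000))
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & SEQUENCE_MASK
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp = now - self.epoch_ms
            return ((timestamp << (WORKER_BITS + SEQUENCE_BITS))
                    | (self.worker_id << SEQUENCE_BITS)
                    | self.sequence)


if __name__ == "__main__":
    generator = SnowflakeGenerator(worker_id=1)
    print(generator.next_id())
```

Because the timestamp occupies the high bits, sorting by ID approximates sorting by creation time, so no separate created_at index is needed for ordering within a partition.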

Timeline Cache Structure

# Redis sorted set for each user's timeline
# Key: timeline:{user_id}
# Score: tweet timestamp
# Value: tweet_id

# Add tweet to timeline
ZADD timeline:123 1705312800 "tweet_456789"

# Get latest 50 tweets
ZREVRANGE timeline:123 0 49

# Trim old tweets (keep only 800)
ZREMRANGEBYRANK timeline:123 0 -801

# Timeline structure
{
    "timeline:123": {
        "tweet_999": 1705312800,  # Most recent
        "tweet_998": 1705312700,
        "tweet_997": 1705312600,
        ...
        # Store last 800 tweet IDs
    }
}

# Full tweet data cached separately
{
    "tweet:999": {
        "user_id": 456,
        "content": "Hello world!",
        "created_at": 1705312800,
        "likes": 42
    }
}
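
The `0 -801` range in the trim command is easy to misread, so here is a small pure-Python imitation of its rank semantics. It is only a sketch: a dict of member to score stands in for the Redis sorted set, and `trim_by_rank` is a hypothetical helper, not a Redis client call.

```python
def trim_by_rank(zset, start, stop):
    """Imitate ZREMRANGEBYRANK on a dict of member -> score.

    Ranks are positions in ascending score order; negative ranks count
    from the highest score (-1 is the newest entry). Both ends are inclusive.
    Returns the number of members removed.
    """
    ascending = sorted(zset, key=lambda member: (zset[member], member))
    size = len(ascending)
    if start < 0:
        start = max(size + start, 0)
    if stop < 0:
        stop = size + stop
    if stop < start:
        return 0  # range is empty, e.g. fewer than 801 members
    doomed = ascending[start:stop + 1]
    for member in doomed:
        del zset[member]
    return len(doomed)


if __name__ == "__main__":
    timeline = {f"tweet_{i}": 1705312800 + i for i in range(1000)}
    removed = trim_by_rank(timeline, 0, -801)
    print(removed, len(timeline))
```

Rank -801 is the 801st-newest entry, so removing ranks 0 through -801 leaves exactly the 800 highest-scored members and is a no-op on shorter timelines.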

Fan-out Service

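from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # followers; above this, tweets are pulled at read time
TIMELINE_MAX_LEN = 800        # tweet IDs kept per timeline
FANOUT_BATCH_SIZE = 1000      # follower IDs per queue message

class FanoutService:
    def __init__(self, follower_service, timeline_cache, message_queue):
        self.follower_service = follower_service
        self.timeline_cache = timeline_cache
        self.queue = message_queue

    @staticmethod
    def batch(items, size):
        """Yield successive lists of at most `size` items."""
        iterator = iter(items)
        while chunk := list(islice(iterator, size)):
            yield chunk

    def cache_celebrity_tweet(self, user_id, tweet):
        # Assumed key layout: one sorted set of recent tweet IDs per celebrity
        key = f"celebrity_tweets:{user_id}"
        self.timeline_cache.zadd(key, {tweet.id: tweet.created_at})
        self.timeline_cache.zremrangebyrank(key, 0, -(TIMELINE_MAX_LEN + 1))

    def fanout_tweet(self, tweet):
        user_id = tweet.user_id
        # Loading every follower just to count them is wasteful at scale;
        # users.followers_count could be consulted first
        followers = self.follower_service.get_followers(user_id)

        # Check if user is a celebrity
        if len(followers) > CELEBRITY_THRESHOLD:
            # Don't fan out for celebrities
            # Store tweet in celebrity tweets cache instead
            self.cache_celebrity_tweet(user_id, tweet)
            return

        # Fan-out to all followers; each batch is handled by a queue worker
        for batch in self.batch(followers, FANOUT_BATCH_SIZE):
            self.queue.publish("fanout", {
                "tweet_id": tweet.id,
                "tweet_time": tweet.created_at,  # Unix timestamp, used as score
                "follower_ids": batch
            })

    def process_fanout_batch(self, message):
        tweet_id = message["tweet_id"]
        tweet_time = message["tweet_time"]

        # One pipeline per batch keeps round trips to Redis low
        pipeline = self.timeline_cache.pipeline()
        for follower_id in message["follower_ids"]:
            # Add to timeline sorted set
            pipeline.zadd(
                f"timeline:{follower_id}",
                {tweet_id: tweet_time}
            )
            # Trim to the newest 800 tweets
            pipeline.zremrangebyrank(
                f"timeline:{follower_id}", 0, -(TIMELINE_MAX_LEN + 1)
            )

        pipeline.execute()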

Timeline Service

class TimelineService:
    def __init__(self, timeline_cache, tweet_service, follow_service):
        self.cache = timeline_cache
        self.tweet_service = tweet_service
        self.follow_service = follow_service
    
    def get_timeline(self, user_id, count=50, cursor=None):
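        # 1. Get pre-computed timeline (from regular users)
        if cursor:
            # cursor = score (timestamp) of the last tweet on the previous page.
            # The "(" prefix makes the bound exclusive so that tweet is not
            # returned again; other tweets with the same score are skipped too.
            timeline_ids = self.cache.zrevrangebyscore(
                f"timeline:{user_id}",
                f"({cursor}",
                "-inf",
                start=0,
                num=count
            )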
        else:
            timeline_ids = self.cache.zrevrange(
                f"timeline:{user_id}",
                0, 
                count - 1
            )
        
        # 2. Get celebrity tweets (fan-out on read)
        celebrity_followees = self.follow_service.get_celebrity_followees(user_id)
        celebrity_tweets = []
        
        for celeb_id in celebrity_followees:
            recent_tweets = self.tweet_service.get_recent_tweets(
                celeb_id, 
                count=10
            )
            celebrity_tweets.extend(recent_tweets)
        
        # 3. Merge and sort
        all_tweet_ids = list(timeline_ids) + [t.id for t in celebrity_tweets]
        
        # Fetch full tweet objects
        tweets = self.tweet_service.get_tweets_batch(all_tweet_ids)
        
        # Sort by time and take top N
        tweets.sort(key=lambda t: t.created_at, reverse=True)
        return tweets[:count]

Tweet Search with Elasticsearch

┌─────────────────────────────────────────────────────────────────┐
│                    Search Architecture                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Tweet Created                                                  │
│      │                                                          │
│      ▼                                                          │
│  ┌──────────┐     ┌──────────────┐     ┌───────────────┐       │
│  │  Kafka   │────►│   Consumer   │────►│ Elasticsearch │       │
│  │          │     │   Service    │     │               │       │
│  └──────────┘     └──────────────┘     └───────┬───────┘       │
│                                                │                │
│                                                ▼                │
│                              ┌─────────────────────────────┐   │
│                              │ GET /tweets/_search         │   │
│                              │ {                           │   │
│                              │   "query": {                │   │
│                              │     "match": {              │   │
│                              │       "content": "keyword"  │   │
│                              │     }                       │   │
│                              │   }                         │   │
│                              │ }                           │   │
│                              └─────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Trending Topics

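import time

WINDOW_SECONDS = 300  # 5-minute time windows

class TrendingService:
    def __init__(self, redis_client):
        self.redis = redis_client

    def track_hashtag(self, hashtag, location="global"):
        """Called when a tweet with hashtag is posted"""
        key = f"trending:{location}:{self.get_current_window()}"

        # Increment counter in current time window
        self.redis.zincrby(key, 1, hashtag)
        # Let old windows expire so they do not accumulate
        # (retention of 10 windows is an assumed safety margin)
        self.redis.expire(key, WINDOW_SECONDS * 10)

    def get_trending(self, location="global", count=10):
        """Get top trending hashtags"""
        # Get last 5 time windows (e.g., 5 minute windows)
        windows = self.get_recent_windows(5)
        merged_key = f"trending:{location}:merged"

        # Merge counts from recent windows (scores are summed by default)
        self.redis.zunionstore(
            merged_key,
            [f"trending:{location}:{w}" for w in windows]
        )

        # Get top hashtags
        return self.redis.zrevrange(
            merged_key,
            0,
            count - 1,
            withscores=True
        )

    def get_current_window(self):
        """Start timestamp of the current 5-minute window"""
        return int(time.time() // WINDOW_SECONDS) * WINDOW_SECONDS

    def get_recent_windows(self, n):
        """Start timestamps of the n most recent windows, newest first"""
        current = self.get_current_window()
        return [current - i * WINDOW_SECONDS for i in range(n)]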

Step 6: Additional Components

Media Upload

┌─────────────────────────────────────────────────────────────────┐
│                    Media Upload Flow                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. Client requests upload URL                                 │
│     POST /api/v1/media/upload-url                              │
│                                                                 │
│  2. Server returns pre-signed S3 URL                           │
│     { "upload_url": "https://s3.../presigned", "media_id": 123}│
│                                                                 │
│  3. Client uploads directly to S3                              │
│     PUT https://s3.../presigned                                │
│                                                                 │
│  4. S3 triggers processing lambda                              │
│     - Generate thumbnails                                      │
│     - Transcode video                                          │
│     - CDN propagation                                          │
│                                                                 │
│  5. Client includes media_id in tweet                          │
│     POST /api/v1/tweets                                        │
│     { "content": "Hello!", "media_ids": [123] }               │
│                                                                 │
│  ┌────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐      │
│  │ Client │───►│   API   │───►│   S3    │───►│   CDN   │      │
│  └────────┘    └─────────┘    └─────────┘    └─────────┘      │
│                    │              │                            │
│                    │              ▼                            │
│                    │         ┌─────────┐                       │
│                    │         │ Lambda  │                       │
│                    │         │ Process │                       │
│                    │         └─────────┘                       │
│                    │              │                            │
│                    │              ▼                            │
│                    │         ┌─────────┐                       │
│                    └────────►│ Media   │                       │
│                              │   DB    │                       │
│                              └─────────┘                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Notifications

class NotificationService:
    def __init__(self, push_service, websocket_service, email_service):
        self.push = push_service
        self.ws = websocket_service
        self.email = email_service
    
    def notify_mention(self, mentioned_user_id, tweet):
        notification = {
            "type": "mention",
            "from_user": tweet.user_id,
            "tweet_id": tweet.id,
            "content": f"@{tweet.user.username} mentioned you"
        }
        
        self.send_notification(mentioned_user_id, notification)
    
    def notify_like(self, tweet_owner_id, liker_id, tweet_id):
        # Batch likes to avoid notification spam
        # "5 people liked your tweet"
        pass
    
    def send_notification(self, user_id, notification):
        # 1. Store in notifications table
        self.store_notification(user_id, notification)
        
        # 2. Send real-time if user is online
        if self.ws.is_connected(user_id):
            self.ws.send(user_id, notification)
        
        # 3. Send push notification
        self.push.send(user_id, notification)
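
The notify_like method above is left as a stub with the comment "5 people liked your tweet". One way to get that behaviour is to buffer likes and flush them periodically as one notification per tweet. The sketch below is an assumption about how that could look: `LikeBatcher`, its in-memory buffer, and the notification fields are all hypothetical, and a real system would likely keep the buffer in a shared store and flush on a timer.

```python
from collections import defaultdict


class LikeBatcher:
    """Collect likes per (owner, tweet) and emit one summary notification each."""

    def __init__(self):
        # (tweet_owner_id, tweet_id) -> liker IDs in arrival order
        self.pending = defaultdict(list)

    def add_like(self, tweet_owner_id, liker_id, tweet_id):
        likers = self.pending[(tweet_owner_id, tweet_id)]
        if liker_id not in likers:  # ignore repeated likes from the same user
            likers.append(liker_id)

    def flush(self):
        """Return one notification per tweet and clear the buffer."""
        notifications = []
        for (owner_id, tweet_id), likers in self.pending.items():
            count = len(likers)
            if count == 1:
                text = "1 person liked your tweet"
            else:
                text = f"{count} people liked your tweet"
            notifications.append({
                "type": "like",
                "user_id": owner_id,
                "tweet_id": tweet_id,
                "from_users": list(likers),
                "content": text,
            })
        self.pending.clear()
        return notifications


if __name__ == "__main__":
    batcher = LikeBatcher()
    for liker in [1, 2, 3, 4, 5]:
        batcher.add_like(tweet_owner_id=10, liker_id=liker, tweet_id=99)
    for notification in batcher.flush():
        print(notification["content"])
```

Each dict returned by `flush` could then be passed to send_notification, so the storage, WebSocket, and push steps stay unchanged.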

Final Architecture

┌─────────────────────────────────────────────────────────────────┐
│                 Complete Twitter Architecture                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│                         ┌──────────────┐                       │
│                         │   Clients    │                       │
│                         └──────┬───────┘                       │
│                                │                                │
│                         ┌──────▼───────┐                       │
│                         │     CDN      │                       │
│                         └──────┬───────┘                       │
│                                │                                │
│                         ┌──────▼───────┐                       │
│                         │ API Gateway  │                       │
│                         └──────┬───────┘                       │
│                                │                                │
│    ┌───────────────────────────┼───────────────────────────┐   │
│    │            │              │              │            │    │
│    ▼            ▼              ▼              ▼            ▼    │
│ ┌──────┐   ┌──────┐      ┌──────────┐   ┌──────┐    ┌──────┐  │
│ │Tweet │   │ User │      │ Timeline │   │Search│    │Notif │  │
│ │ Svc  │   │ Svc  │      │   Svc    │   │ Svc  │    │ Svc  │  │
│ └──┬───┘   └──┬───┘      └────┬─────┘   └──┬───┘    └──┬───┘  │
│    │          │               │            │           │       │
│    │          │               │            │           │       │
│    ▼          ▼               ▼            ▼           ▼       │
│ ┌──────┐  ┌──────┐      ┌──────────┐  ┌───────┐  ┌─────────┐  │
│ │Tweet │  │ User │      │ Timeline │  │Elastic│  │  Push   │  │
│ │  DB  │  │  DB  │      │  Cache   │  │Search │  │ Service │  │
│ │      │  │      │      │ (Redis)  │  │       │  │         │  │
│ └──────┘  └──────┘      └──────────┘  └───────┘  └─────────┘  │
│                                                                 │
│    ┌───────────────────────────────────────────────────────┐   │
│    │                    Kafka                              │   │
│    │  (tweets, fanout, notifications, search indexing)     │   │
│    └───────────────────────────────────────────────────────┘   │
│                                                                 │
│    ┌───────────────────────────────────────────────────────┐   │
│    │                 S3 + CDN (Media)                      │   │
│    └───────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Key Design Decisions

Decision        Choice             Reasoning
Timeline        Hybrid fan-out     Balance between read/write costs
Tweet Storage   Cassandra          High write throughput, time-series friendly
User Data       PostgreSQL         ACID for user operations
Timeline Cache  Redis Sorted Sets  O(log n) insert, O(log n + m) range reads (m = items returned)
Search          Elasticsearch      Full-text search, near-real-time indexing
Media           S3 + CDN           Scalable object storage
Messaging       Kafka              Durable, high-throughput event streaming

Common Interview Questions

Q: How do you handle the celebrity problem?
Use a hybrid approach: fan-out on write for regular users, fan-out on read for celebrities (>10K followers). Celebrity tweets are fetched at read time and merged with the pre-computed timeline.

Q: How do you keep timelines fast and fresh?
  1. Pre-compute timelines via fan-out on write
  2. Short TTL on cache (5 minutes)
  3. Use WebSocket for real-time updates
  4. Periodic refresh on client side

Q: How do you handle tweet deletion?
  1. Mark tweet as deleted in DB (soft delete)
  2. Async job to remove from all timeline caches
  3. Client-side filtering as backup
  4. Accept eventual consistency (deleted tweets may briefly appear)

Q: How do you order the timeline?
  1. Chronological (simple, predictable)
  2. Algorithmic ranking (engagement prediction)
  3. Hybrid: recent tweets chronologically, older by relevance
  4. Store ranking score in timeline cache
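
The soft-delete answer above relies on filtering as a backup while the async cleanup catches up. A minimal sketch of that filter follows, assuming tweets are dicts carrying a `deleted` flag. The function name and field name are hypothetical, and the same check could run on the client or in the timeline service.

```python
def filter_deleted(timeline_ids, tweets_by_id):
    """Return tweets for the given IDs, skipping missing or soft-deleted ones.

    timeline_ids may still contain IDs of deleted tweets because the
    timeline caches are cleaned up asynchronously.
    """
    visible = []
    for tweet_id in timeline_ids:
        tweet = tweets_by_id.get(tweet_id)
        if tweet is None or tweet.get("deleted", False):
            continue  # stale cache entry: hide it from the reader
        visible.append(tweet)
    return visible


if __name__ == "__main__":
    store = {
        1: {"id": 1, "content": "kept"},
        2: {"id": 2, "content": "removed", "deleted": True},
    }
    print(filter_deleted([2, 1, 3], store))
```

With this check in the read path, a deleted tweet can linger in a timeline cache without being shown, which is what makes eventual consistency acceptable here.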