Skip to main content

Problem Statement

Design a ride-hailing service like Uber that:
  • Matches riders with nearby drivers in real-time
  • Tracks driver locations continuously
  • Calculates fare estimates and handles payments
  • Scales to millions of concurrent rides
This is a hard interview problem. Focus on real-time location tracking, efficient geo-matching, and the dispatch algorithm. The location update rate (every 4 seconds) creates significant scale challenges.

Step 1: Requirements

Functional Requirements

Rider Features

  • Request a ride
  • See nearby drivers
  • Track ride in real-time
  • Pay for ride
  • Rate driver

Driver Features

  • Go online/offline
  • Accept/reject rides
  • Navigate to pickup
  • Navigate to destination
  • View earnings

Non-Functional Requirements

  • Low Latency: Match rider in < 10 seconds
  • Real-time: Location updates every 4 seconds
  • High Availability: 99.99% uptime
  • Consistency: No double-booking of drivers

Capacity Estimation

┌─────────────────────────────────────────────────────────────────┐
│                       Uber Scale                                │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Users:                                                         │
│  • 100 million monthly active riders                           │
│  • 5 million drivers                                           │
│  • 20 million rides per day                                    │
│  • 230 rides per second (peak: 500+ RPS)                       │
│                                                                 │
│  Location Updates:                                              │
│  • 2 million active drivers (peak)                             │
│  • Update every 4 seconds                                      │
│  • 2M / 4 = 500,000 updates per second!                        │
│                                                                 │
│  Data per update:                                               │
│  • Driver ID (8 bytes)                                         │
│  • Latitude (8 bytes)                                          │
│  • Longitude (8 bytes)                                         │
│  • Timestamp (8 bytes)                                         │
│  • ~50 bytes per update                                        │
│  • 500K × 50 = 25 MB/second of location data                   │
│                                                                 │
│  Storage (per day):                                             │
│  • 25 MB × 60 × 60 × 24 = 2.16 TB/day of location data        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Step 2: High-Level Design

┌─────────────────────────────────────────────────────────────────┐
│                   Uber Architecture                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│        ┌─────────┐                    ┌─────────┐              │
│        │  Rider  │                    │ Driver  │              │
│        │   App   │                    │   App   │              │
│        └────┬────┘                    └────┬────┘              │
│             │                              │                    │
│             │         WebSocket            │                    │
│             │         connection           │                    │
│             ▼                              ▼                    │
│        ┌─────────────────────────────────────────┐             │
│        │            Load Balancer                │             │
│        └─────────────────────┬───────────────────┘             │
│                              │                                  │
│         ┌────────────────────┼────────────────────┐            │
│         │                    │                    │             │
│    ┌────▼────┐         ┌─────▼─────┐        ┌─────▼─────┐     │
│    │  Trip   │         │ Location  │        │ Matching  │     │
│    │ Service │         │  Service  │        │  Service  │     │
│    └────┬────┘         └─────┬─────┘        └─────┬─────┘     │
│         │                    │                    │             │
│         │                    │                    │             │
│    ┌────▼────┐         ┌─────▼─────┐        ┌─────▼─────┐     │
│    │ Trip DB │         │   Redis   │        │ Matching  │     │
│    │(Postgres│         │(Location) │        │   Queue   │     │
│    └─────────┘         └───────────┘        └───────────┘     │
│                                                                 │
│    Additional Services:                                        │
│    • Payment Service    • Notification Service                 │
│    • Pricing Service    • Maps/ETA Service                     │
│    • User Service       • Rating Service                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Step 3: Location Tracking

The Challenge

500,000 location updates per second! How do we:
  1. Store current locations efficiently
  2. Query “drivers near point X” quickly
  3. Handle the write throughput

Solution: Geospatial Indexing

┌─────────────────────────────────────────────────────────────────┐
│                   Location Storage                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Option 1: Geohash + Redis                                     │
│  ─────────────────────────                                      │
│                                                                 │
│  Geohash: Encode lat/lng into string                           │
│  • Nearby locations share prefix                               │
│  • "9q8yy" = San Francisco area                               │
│  • "9q8yyb" = smaller area within SF                          │
│                                                                 │
│  Redis Structure:                                               │
│  ┌──────────────────────────────────────────────┐              │
│  │  GEOADD drivers <lng> <lat> driver_id        │              │
│  │  GEOSEARCH drivers FROMMEMBER point          │              │
│  │           BYRADIUS 3 km                      │              │
│  └──────────────────────────────────────────────┘              │
│                                                                 │
│  Option 2: QuadTree (In-Memory)                                │
│  ──────────────────────────────                                 │
│                                                                 │
│         ┌─────────────────────┐                                │
│         │     World Map       │                                │
│         └─────────┬───────────┘                                │
│                   │                                             │
│       ┌───────────┼───────────┐                                │
│       │           │           │                                │
│    ┌──▼──┐     ┌──▼──┐     ┌──▼──┐                            │
│    │ NW  │     │ NE  │     │  ...│                            │
│    └──┬──┘     └─────┘     └─────┘                            │
│       │                                                         │
│   ┌───┼───┐                                                    │
│   │   │   │                                                    │
│  ┌▼┐ ┌▼┐ ┌▼┐                                                  │
│  │ │ │ │ │ │   Keep splitting until                           │
│  └─┘ └─┘ └─┘   few drivers per cell                           │
│                                                                 │
│  Search: Find cell, check neighboring cells                    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Location Service Implementation

class LocationService:
    """
    Handles 500K location updates per second
    Uses Redis GEO commands for efficient geospatial queries
    """
    
    def __init__(self, redis_cluster):
        self.redis = redis_cluster
        # Shard by city for scale
        
    def update_location(self, driver_id: str, lat: float, lng: float):
        """Update driver's current location"""
        city = self.get_city(lat, lng)  # e.g., "san_francisco"
        
        # Redis GEO command - O(log N)
        self.redis.geoadd(
            f"drivers:{city}:available",
            lng, lat, driver_id
        )
        
        # Also store in hash for quick lookup
        self.redis.hset(
            f"driver:{driver_id}",
            mapping={
                "lat": lat,
                "lng": lng,
                "last_updated": time.time()
            }
        )
        
    def find_nearby_drivers(
        self, lat: float, lng: float, 
        radius_km: float = 3.0, limit: int = 10
    ) -> List[Driver]:
        """Find available drivers within radius"""
        city = self.get_city(lat, lng)
        
        # Redis GEOSEARCH - O(N+log(M)) where N = elements returned
        drivers = self.redis.geosearch(
            f"drivers:{city}:available",
            longitude=lng,
            latitude=lat,
            radius=radius_km,
            unit="km",
            count=limit,
            sort="ASC",  # Closest first
            withdist=True,
            withcoord=True
        )
        
        return drivers
        
    def mark_driver_unavailable(self, driver_id: str, city: str):
        """Remove driver from available pool"""
        self.redis.zrem(f"drivers:{city}:available", driver_id)

Step 4: Matching Algorithm

The Dispatch Problem

┌─────────────────────────────────────────────────────────────────┐
│                   Matching Algorithm                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Goal: Match rider to best available driver                    │
│                                                                 │
│  Simple approach: Closest driver                               │
│  Better approach: Consider multiple factors                    │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                    Matching Score                        │   │
│  │                                                          │   │
│  │  Score = w₁ × Distance +                                │   │
│  │          w₂ × ETA +                                     │   │
│  │          w₃ × Driver Rating +                           │   │
│  │          w₄ × Driver Acceptance Rate +                  │   │
│  │          w₅ × Car Type Match +                          │   │
│  │          w₆ × Driver's Time Since Last Ride             │   │
│  │                                                          │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  Process:                                                       │
│  ─────────                                                      │
│  1. Rider requests ride                                        │
│  2. Find N nearest available drivers (N = 10-20)              │
│  3. Score each driver                                          │
│  4. Send request to top driver                                 │
│  5. Wait for response (15-20 seconds)                          │
│  6. If rejected/timeout → try next driver                      │
│  7. If all reject → expand radius and retry                    │
│                                                                 │
│  Uber's approach: Batched matching                             │
│  ─────────────────────────────────────                          │
│  Every 2 seconds, run matching for ALL pending requests        │
│  Optimize globally (Hungarian algorithm)                        │
│  Better matches overall, slight delay acceptable               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Matching Service

class MatchingService:
    """
    Matches riders with optimal drivers
    """
    
    def match_rider(self, ride_request: RideRequest) -> Optional[Driver]:
        # 1. Find nearby available drivers
        nearby_drivers = self.location_service.find_nearby_drivers(
            ride_request.pickup_lat,
            ride_request.pickup_lng,
            radius_km=5.0,
            limit=20
        )
        
        if not nearby_drivers:
            return None
            
        # 2. Score each driver
        scored_drivers = []
        for driver in nearby_drivers:
            score = self.calculate_match_score(ride_request, driver)
            scored_drivers.append((driver, score))
        
        # 3. Sort by score (higher is better)
        scored_drivers.sort(key=lambda x: x[1], reverse=True)
        
        # 4. Try each driver in order
        for driver, score in scored_drivers:
            accepted = self.send_ride_request(driver, ride_request)
            if accepted:
                return driver
                
        return None
    
    def calculate_match_score(
        self, request: RideRequest, driver: Driver
    ) -> float:
        # ETA from driver to pickup
        eta = self.maps_service.get_eta(
            driver.lat, driver.lng,
            request.pickup_lat, request.pickup_lng
        )
        
        # Scoring weights
        score = 0
        score += (1 / eta) * 30              # Lower ETA = higher score
        score += driver.rating * 20           # 4.9 rating = 98 points
        score += driver.acceptance_rate * 15  # 90% = 13.5 points
        score += driver.idle_time * 0.5       # Fair distribution
        
        return score

Step 5: Trip State Machine

┌─────────────────────────────────────────────────────────────────┐
│                    Trip State Machine                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────┐                                                  │
│  │REQUESTED │ ───Matched───► ┌──────────┐                     │
│  └──────────┘                │ ACCEPTED │                      │
│       │                      └────┬─────┘                      │
│       │                           │                             │
│  No match                    Driver arrives                     │
│       │                           │                             │
│       ▼                           ▼                             │
│  ┌──────────┐               ┌──────────┐                       │
│  │CANCELLED │               │ ARRIVED  │                       │
│  └──────────┘               └────┬─────┘                       │
│       ▲                          │                              │
│       │                     Rider boards                        │
│  Cancel                          │                              │
│       │                          ▼                              │
│       │                    ┌──────────┐                        │
│       ├────────────────────│IN_PROGRESS│                       │
│       │                    └────┬─────┘                        │
│       │                         │                               │
│       │                    Reach destination                    │
│       │                         │                               │
│       │                         ▼                               │
│       │                   ┌──────────┐                         │
│       │                   │COMPLETED │                         │
│       │                   └────┬─────┘                         │
│       │                        │                                │
│       │                   Payment processed                     │
│       │                        │                                │
│       │                        ▼                                │
│       │                   ┌──────────┐                         │
│       └───────────────────│   PAID   │                         │
│                           └──────────┘                         │
│                                                                 │
│  Each state change:                                            │
│  • Update database                                             │
│  • Notify rider & driver via WebSocket                         │
│  • Emit event for analytics                                    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Step 6: Pricing & Surge

Dynamic Pricing

┌─────────────────────────────────────────────────────────────────┐
│                    Surge Pricing                                │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Why surge? Balance supply and demand                          │
│                                                                 │
│  Demand > Supply → Increase prices → More drivers come         │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              Supply/Demand by Area                       │   │
│  │                                                          │   │
│  │  Area        Riders    Drivers    Ratio    Surge        │   │
│  │  ─────       ──────    ───────    ─────    ─────        │   │
│  │  Downtown    500       100        5.0      2.5x         │   │
│  │  Airport     200       80         2.5      1.5x         │   │
│  │  Suburbs     50        60         0.83     1.0x         │   │
│  │                                                          │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  Surge calculation:                                             │
│                                                                 │
│  demand_ratio = active_requests / available_drivers            │
│                                                                 │
│  if demand_ratio <= 1.0:                                       │
│      surge = 1.0                                               │
│  elif demand_ratio <= 2.0:                                     │
│      surge = 1.0 + (demand_ratio - 1) * 0.5  # 1.0 - 1.5x     │
│  elif demand_ratio <= 4.0:                                     │
│      surge = 1.5 + (demand_ratio - 2) * 0.5  # 1.5 - 2.5x     │
│  else:                                                         │
│      surge = min(demand_ratio * 0.5, 5.0)    # Cap at 5x      │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Fare Calculation

class PricingService:
    def calculate_fare(
        self, 
        pickup: Location, 
        dropoff: Location,
        car_type: str = "UberX"
    ) -> FareEstimate:
        
        # Get route details
        route = self.maps_service.get_route(pickup, dropoff)
        distance_km = route.distance_km
        duration_min = route.duration_min
        
        # Base fare components
        rates = self.get_rates(car_type)
        
        base_fare = rates.base_fare                    # $2.50
        distance_fare = distance_km * rates.per_km    # $1.50/km
        time_fare = duration_min * rates.per_min      # $0.20/min
        booking_fee = rates.booking_fee               # $2.00
        
        subtotal = base_fare + distance_fare + time_fare + booking_fee
        
        # Apply surge
        surge_multiplier = self.get_surge_multiplier(pickup)
        total = subtotal * surge_multiplier
        
        # Apply minimum fare
        total = max(total, rates.minimum_fare)
        
        return FareEstimate(
            low=total * 0.9,
            high=total * 1.1,
            surge=surge_multiplier,
            breakdown={
                "base": base_fare,
                "distance": distance_fare,
                "time": time_fare,
                "booking_fee": booking_fee,
                "surge": surge_multiplier
            }
        )

Step 7: Data Models

-- Users table
CREATE TABLE users (
    id              UUID PRIMARY KEY,
    type            VARCHAR(10),  -- 'rider' or 'driver'
    name            VARCHAR(100),
    email           VARCHAR(100) UNIQUE,
    phone           VARCHAR(20) UNIQUE,
    rating          DECIMAL(3,2) DEFAULT 5.0,
    created_at      TIMESTAMP
);

-- Drivers additional info
CREATE TABLE driver_profiles (
    user_id         UUID PRIMARY KEY REFERENCES users(id),
    license_number  VARCHAR(50),
    car_model       VARCHAR(100),
    car_plate       VARCHAR(20),
    car_type        VARCHAR(20),  -- 'UberX', 'UberXL', 'Black'
    is_online       BOOLEAN DEFAULT FALSE,
    acceptance_rate DECIMAL(3,2) DEFAULT 1.0
);

-- Trips table
CREATE TABLE trips (
    id              UUID PRIMARY KEY,
    rider_id        UUID REFERENCES users(id),
    driver_id       UUID REFERENCES users(id),
    status          VARCHAR(20),
    
    pickup_lat      DECIMAL(10, 7),
    pickup_lng      DECIMAL(10, 7),
    pickup_address  VARCHAR(255),
    
    dropoff_lat     DECIMAL(10, 7),
    dropoff_lng     DECIMAL(10, 7),
    dropoff_address VARCHAR(255),
    
    estimated_fare  DECIMAL(10, 2),
    actual_fare     DECIMAL(10, 2),
    surge_multiplier DECIMAL(3, 2) DEFAULT 1.0,
    
    distance_km     DECIMAL(8, 2),
    duration_min    INT,
    
    requested_at    TIMESTAMP,
    accepted_at     TIMESTAMP,
    started_at      TIMESTAMP,
    completed_at    TIMESTAMP,
    
    rider_rating    INT,
    driver_rating   INT
);

-- Location history (for route reconstruction, ETAs)
CREATE TABLE location_history (
    driver_id       UUID,
    timestamp       TIMESTAMP,
    lat             DECIMAL(10, 7),
    lng             DECIMAL(10, 7),
    PRIMARY KEY (driver_id, timestamp)
) WITH (timescaledb.continuous);
-- Use TimescaleDB for efficient time-series storage

Final Architecture

┌─────────────────────────────────────────────────────────────────┐
│                 Complete Uber Architecture                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│          ┌─────────────┐              ┌─────────────┐          │
│          │  Rider App  │              │ Driver App  │          │
│          └──────┬──────┘              └──────┬──────┘          │
│                 │                            │                  │
│                 │      WebSocket             │                  │
│                 └────────────┬───────────────┘                  │
│                              │                                  │
│                    ┌─────────▼─────────┐                       │
│                    │   Load Balancer   │                       │
│                    └─────────┬─────────┘                       │
│                              │                                  │
│    ┌─────────────────────────┼─────────────────────────┐       │
│    │          │         │    │    │         │          │        │
│    ▼          ▼         ▼    ▼    ▼         ▼          ▼        │
│ ┌──────┐ ┌────────┐ ┌──────┐│┌──────┐┌────────┐┌────────┐      │
│ │ Trip │ │Location│ │Match ││ │Price│ │Payment │ │  User  │     │
│ │ Svc  │ │  Svc   │ │ Svc  ││ │ Svc │ │  Svc   │ │  Svc   │     │
│ └──┬───┘ └───┬────┘ └──┬───┘│└──┬───┘ └───┬────┘ └───┬────┘     │
│    │         │         │    │    │         │          │         │
│    ▼         ▼         ▼    ▼    ▼         ▼          ▼         │
│ ┌──────┐ ┌────────┐ ┌──────┐│┌──────┐ ┌────────┐ ┌────────┐    │
│ │Postgres│ │ Redis  │ │Kafka ││ │Cache │ │ Stripe │ │Postgres│   │
│ │(Trips) │ │(Geo)   │ │      ││ │      │ │        │ │(Users) │   │
│ └────────┘ └────────┘ └──────┘│└──────┘ └────────┘ └────────┘   │
│                               │                                 │
│                    ┌──────────▼──────────┐                     │
│                    │    Maps Service     │                     │
│                    │  (ETA, Directions)  │                     │
│                    │  Google/Mapbox API  │                     │
│                    └─────────────────────┘                     │
│                                                                 │
│  Real-time Push:                                               │
│  • WebSocket for rider/driver updates                          │
│  • Location updates every 4 seconds                            │
│  • Trip status changes pushed instantly                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Key Design Decisions

DecisionChoiceReasoning
Location StoreRedis + Geo commandsFast writes, efficient geo queries
MatchingScore-based + batchBalance speed and optimality
Real-timeWebSocketsBidirectional, low latency
Trip DBPostgreSQLACID for payments, consistency
AnalyticsKafka + TimescaleDBTime-series location data
MapsExternal APIDon’t reinvent Google Maps

Common Interview Questions

  1. Use distributed lock when assigning (Redis SETNX)
  2. Mark driver unavailable atomically before sending request
  3. Use database transaction with driver status check
  4. If request times out, release lock and mark available again
  5. Idempotency key for each matching attempt
  1. Detect: No location updates for 30+ seconds
  2. Notify rider: “Driver connectivity issue”
  3. If prolonged: Auto-cancel and rematch
  4. Trip marked as “interrupted” - rider not charged
  5. Support ticket auto-created
  6. Driver rating affected if happens frequently
  1. Shard Redis by city/region
  2. Use Redis Cluster for horizontal scaling
  3. Client-side throttling (don’t update if < 10m moved)
  4. UDP for location updates (some loss OK)
  5. Batch updates on server side
  6. Drop stale updates (older than 10 seconds)
  1. Get route from Maps API (Google/Mapbox)
  2. Consider current traffic conditions
  3. Historical data for time-of-day patterns
  4. Adjust for driver behavior (driving style)
  5. Update ETA as driver moves
  6. Cache common routes
  1. GPS spoofing detection (impossible movements)
  2. Verify rider/driver at same location at pickup
  3. Monitor unusual patterns (circular routes)
  4. Photo verification at start
  5. ML model for anomaly detection
  6. Manual review queue for flagged rides