Staff+ Level: Global architecture is expected knowledge for Staff+ roles. You should be able to discuss multi-region trade-offs fluently.

Why Go Global?

┌─────────────────────────────────────────────────────────────────┐
│                   REASONS FOR GLOBAL SCALE                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. LATENCY                                                     │
│     US East → US West: ~70ms                                   │
│     US → Europe: ~100ms                                        │
│     US → Asia: ~200ms                                          │
│                                                                 │
│     Users expect <100ms latency. Physics limits us.            │
│     Solution: Bring compute closer to users.                   │
│                                                                 │
│  2. DISASTER RECOVERY                                           │
│     Single region can have:                                    │
│     • Power outages                                            │
│     • Network failures                                         │
│     • Natural disasters                                        │
│                                                                 │
│     Multi-region = survive regional failures                   │
│                                                                 │
│  3. DATA SOVEREIGNTY                                            │
│     • GDPR: EU data must stay in EU                           │
│     • China: Data must be stored locally                       │
│     • Russia, India: Various requirements                      │
│                                                                 │
│  4. BUSINESS CONTINUITY                                         │
│     • 99.99% availability needs multiple regions               │
│     • Single region max: ~99.9% (8.7 hours/year)              │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
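
The availability figure in point 4 can be checked with quick arithmetic: a single region at 99.9% allows roughly 8.7 hours of downtime per year, while two regions that fail independently combine to 1 - (1 - 0.999)^2, about 99.9999%. A minimal sketch of that math (independence of regional failures is an assumption that shared dependencies such as DNS or deploy pipelines can violate):

HOURS_PER_YEAR = 24 * 365

def downtime_hours(availability: float) -> float:
    """Hours of allowed downtime per year at a given availability."""
    return (1 - availability) * HOURS_PER_YEAR

single_region = 0.999                        # ~8.7 hours of downtime/year
two_regions = 1 - (1 - single_region) ** 2   # ~99.9999% if failures are independent

print(f"Single region: {downtime_hours(single_region):.1f} hours/year")
print(f"Two regions:   {downtime_hours(two_regions) * 3600:.0f} seconds/year")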

Multi-Region Architecture Patterns

Pattern 1: Active-Passive (Disaster Recovery)

┌─────────────────────────────────────────────────────────────────┐
│                    ACTIVE-PASSIVE                               │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│         Primary Region              Secondary Region            │
│         (Active)                    (Passive)                   │
│                                                                 │
│         ┌──────────┐                ┌──────────┐               │
│         │ Traffic  │                │  Standby │               │
│         │    ▼     │   Replication  │          │               │
│  Users ─► Services ├───────────────►│ Services │               │
│         │    ▼     │                │          │               │
│         │    DB    │───────────────►│    DB    │               │
│         └──────────┘    (async)     └──────────┘               │
│                                                                 │
│  RPO (Recovery Point Objective): Minutes (data loss)           │
│  RTO (Recovery Time Objective): Minutes to hours (downtime)    │
│                                                                 │
│  ✓ Pros: Simple, cost-effective                                │
│  ✗ Cons: Wasted passive capacity, failover complexity          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
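
With asynchronous replication the standby lags the primary, and that lag is effectively the RPO you realize at failover time. A minimal sketch of the check a failover runbook might make before promoting the standby (replication_lag_seconds is a hypothetical helper; in practice you would compare WAL/LSN positions in Postgres or binlog coordinates in MySQL):

MAX_ACCEPTABLE_LAG_SECONDS = 300   # RPO budget: up to 5 minutes of data loss

async def safe_to_promote(standby) -> bool:
    """Return True if promoting the standby stays within the RPO budget.

    replication_lag_seconds() is a hypothetical helper standing in for a
    real replication-position comparison against the failed primary.
    """
    lag = await standby.replication_lag_seconds()
    return lag <= MAX_ACCEPTABLE_LAG_SECONDS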

Pattern 2: Active-Active (Multi-Region)

┌─────────────────────────────────────────────────────────────────┐
│                    ACTIVE-ACTIVE                                │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│        US-EAST                         EU-WEST                  │
│        ┌──────────┐                   ┌──────────┐             │
│  US ───► Services │◄─────────────────►│ Services │◄─── EU      │
│  Users │    ▼     │  Bi-directional  │    ▼     │    Users    │
│        │    DB    │◄─────────────────►│    DB    │             │
│        └──────────┘   Replication    └──────────┘             │
│                                                                 │
│  RPO: Near-zero (bi-directional sync)                          │
│  RTO: Seconds (automatic failover)                             │
│                                                                 │
│  ✓ Pros: Better latency, no wasted capacity                   │
│  ✗ Cons: Conflict resolution, data consistency complexity     │
│                                                                 │
│  Challenge: What if US and EU both update the same record?    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Pattern 3: Read Local, Write Global

┌─────────────────────────────────────────────────────────────────┐
│                READ LOCAL, WRITE GLOBAL                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│        US-EAST                         EU-WEST                  │
│        ┌──────────┐                   ┌──────────┐             │
│  Local │  Read    │                   │  Read    │ Local       │
│  Reads │ Replicas │                   │ Replicas │ Reads       │
│        └────┬─────┘                   └────┬─────┘             │
│             │                              │                    │
│             └──────────┬───────────────────┘                   │
│                        ▼                                        │
│                   ┌────────────┐                               │
│                   │   Write    │ All writes                    │
│                   │   Leader   │ go here                       │
│                   │  (US-EAST) │                               │
│                   └────────────┘                               │
│                                                                 │
│  ✓ Pros: Simple consistency model, low read latency           │
│  ✗ Cons: High write latency for non-primary regions           │
│                                                                 │
│  Good for: Read-heavy workloads (social media, content)        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
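
The routing rule behind this pattern is small enough to sketch: reads go to the nearest replica, writes always cross to the single leader. A minimal sketch, assuming the replica and leader objects expose async read/write methods like the region connections used later on this page:

class ReadLocalWriteGlobalRouter:
    """Route reads to the local replica and every write to the leader.

    local_replica and leader are assumed to be connection objects with
    async read/write methods; this sketches the pattern, not a real client.
    """

    def __init__(self, local_replica, leader):
        self.local_replica = local_replica
        self.leader = leader

    async def read(self, key: str):
        # Fast but possibly stale: the replica lags the leader by the
        # asynchronous replication delay.
        return await self.local_replica.read(key)

    async def write(self, key: str, value: str):
        # Every write pays a cross-region round trip from non-primary regions.
        return await self.leader.write(key, value)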

Pattern 4: Geo-Partitioning

┌─────────────────────────────────────────────────────────────────┐
│                    GEO-PARTITIONING                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Route users to their home region based on data residency      │
│                                                                 │
│     US Users          EU Users          APAC Users             │
│         │                 │                 │                   │
│         ▼                 ▼                 ▼                   │
│    ┌─────────┐       ┌─────────┐       ┌─────────┐            │
│    │ US-EAST │       │ EU-WEST │       │ AP-SOUTH│            │
│    │ Region  │       │ Region  │       │ Region  │            │
│    │         │       │         │       │         │            │
│    │ US Data │       │ EU Data │       │APAC Data│            │
│    └─────────┘       └─────────┘       └─────────┘            │
│                                                                 │
│  ✓ Pros: Data sovereignty, locality, independence              │
│  ✗ Cons: Cross-region queries complex, no global view         │
│                                                                 │
│  Good for: Compliance-heavy industries (healthcare, finance)   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Data Replication Strategies

Synchronous vs Asynchronous Replication

from enum import Enum
from dataclasses import dataclass
from typing import List, Dict, Optional
import asyncio
import time

class ReplicationMode(Enum):
    SYNC = "synchronous"       # Wait for all replicas
    SEMI_SYNC = "semi-sync"    # Wait for N replicas
    ASYNC = "asynchronous"     # Fire and forget

@dataclass
class WriteResult:
    success: bool
    replicas_acked: int
    latency_ms: float

class MultiRegionDatabase:
    """
    Database with configurable replication strategies.
    """
    
    def __init__(
        self,
        local_region: str,
        regions: Dict[str, 'RegionConnection'],
        mode: ReplicationMode = ReplicationMode.ASYNC,
        min_replicas: int = 1
    ):
        self.local_region = local_region
        self.regions = regions
        self.mode = mode
        self.min_replicas = min_replicas  # For semi-sync
    
    async def write(self, key: str, value: str) -> WriteResult:
        """Write with configured replication strategy"""
        start = time.time()
        
        # Always write to local first
        await self.regions[self.local_region].write(key, value)
        
        remote_regions = [r for r in self.regions.values() 
                         if r.name != self.local_region]
        
        if self.mode == ReplicationMode.SYNC:
            # Wait for ALL replicas
            results = await asyncio.gather(
                *[r.write(key, value) for r in remote_regions],
                return_exceptions=True
            )
            acked = sum(1 for r in results if not isinstance(r, Exception))
            success = acked == len(remote_regions)
            
        elif self.mode == ReplicationMode.SEMI_SYNC:
            # Wait for N replicas. Wrap writes in tasks so the ones we
            # don't wait for keep running in the background.
            acked = 0
            pending = [asyncio.create_task(r.write(key, value))
                       for r in remote_regions]
            
            for task in asyncio.as_completed(pending):
                try:
                    await task
                    acked += 1
                    if acked >= self.min_replicas:
                        # Don't wait for remaining
                        break
                except Exception:
                    pass
            
            success = acked >= self.min_replicas
            
            # Let the remaining replications finish off the critical path
            asyncio.create_task(self._complete_async_replication(pending))
            
        else:  # ASYNC
            # Fire and forget
            for region in remote_regions:
                asyncio.create_task(region.write(key, value))
            acked = 0
            success = True
        
        latency = (time.time() - start) * 1000
        return WriteResult(success, acked + 1, latency)
    
    async def read(
        self, 
        key: str, 
        consistency: str = "local"
    ) -> Optional[str]:
        """
        Read with configurable consistency.
        
        - local: Read from local region (fast, possibly stale)
        - leader: Read from leader region (consistent, slower)
        - quorum: Read from majority (consistent, medium latency)
        """
        if consistency == "local":
            return await self.regions[self.local_region].read(key)
        
        elif consistency == "leader":
            # _get_leader_region(): config lookup / leader election (omitted here)
            leader = self._get_leader_region()
            return await self.regions[leader].read(key)
        
        elif consistency == "quorum":
            # Read from majority, take latest value
            results = await asyncio.gather(
                *[r.read(key) for r in self.regions.values()],
                return_exceptions=True
            )
            
            valid_results = [r for r in results if not isinstance(r, Exception)]
            if len(valid_results) <= len(self.regions) // 2:
                raise Exception("Quorum not reached")
            
            # Return most recent value (assumes reads return objects carrying
            # a .version, rather than the bare strings typed above)
            return max(valid_results, key=lambda x: x.version if x else 0)
    
    async def _complete_async_replication(self, pending):
        """Complete remaining async replications"""
        try:
            await asyncio.gather(*pending, return_exceptions=True)
        except Exception:
            pass  # Log and handle retry in production
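
A brief usage sketch of the class above, assuming a RegionConnection class with async read/write methods and a name attribute (only its interface is implied by the code):

async def demo_semi_sync_write():
    regions = {
        name: RegionConnection(name)   # hypothetical connection class
        for name in ("us-east-1", "eu-west-1", "ap-northeast-1")
    }
    db = MultiRegionDatabase(
        local_region="us-east-1",
        regions=regions,
        mode=ReplicationMode.SEMI_SYNC,
        min_replicas=1,                # return once one remote region has acked
    )
    result = await db.write("user:42", "profile-v2")
    print(result.success, result.replicas_acked, result.latency_ms)
    print(await db.read("user:42", consistency="local"))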


class CRDTCounter:
    """
    Conflict-free Replicated Data Type for multi-region counters.
    Never loses increments, even with concurrent updates.
    """
    
    def __init__(self, region_id: str):
        self.region_id = region_id
        self.counts: Dict[str, int] = {}  # region_id -> count
    
    def increment(self, amount: int = 1):
        """Local increment - conflict-free"""
        self.counts[self.region_id] = self.counts.get(self.region_id, 0) + amount
    
    def value(self) -> int:
        """Total value across all regions"""
        return sum(self.counts.values())
    
    def merge(self, other: 'CRDTCounter'):
        """Merge with another counter - takes max of each region's count"""
        for region, count in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), count)
    
    def to_dict(self) -> Dict[str, int]:
        return self.counts.copy()
    
    @classmethod
    def from_dict(cls, region_id: str, data: Dict[str, int]) -> 'CRDTCounter':
        counter = cls(region_id)
        counter.counts = data.copy()
        return counter
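
To make the merge behavior concrete, a short sketch of two regional counters converging after exchanging state in both directions:

us = CRDTCounter("us-east-1")
eu = CRDTCounter("eu-west-1")

us.increment(3)   # 3 likes recorded in the US region
eu.increment(2)   # 2 likes recorded in the EU region

us.merge(eu)      # replication in either direction, in any order
eu.merge(us)

assert us.value() == eu.value() == 5   # both replicas converge; no increments lost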


# Example: Global like counter
class GlobalLikeCounter:
    """
    Like counter that works across regions without conflicts.
    """
    
    def __init__(self, region: str, redis_client):
        self.region = region
        self.redis = redis_client
    
    async def like(self, post_id: str, user_id: str):
        """Increment like count and track user like"""
        # Track that user liked (for unlike functionality)
        await self.redis.sadd(f"likes:{post_id}:users", user_id)
        
        # Increment region counter
        await self.redis.hincrby(f"likes:{post_id}:counts", self.region, 1)
    
    async def unlike(self, post_id: str, user_id: str):
        """Decrement if user had liked"""
        removed = await self.redis.srem(f"likes:{post_id}:users", user_id)
        if removed:
            await self.redis.hincrby(f"likes:{post_id}:counts", self.region, -1)
    
    async def get_count(self, post_id: str) -> int:
        """Get total likes across all regions"""
        counts = await self.redis.hgetall(f"likes:{post_id}:counts")
        return sum(int(c) for c in counts.values())
    
    async def sync_with_region(self, post_id: str, remote_counts: Dict[str, int]):
        """Merge counts from another region"""
        local_counts = await self.redis.hgetall(f"likes:{post_id}:counts")
        
        # Take max for each region (G-Counter style merge).
        # Caveat: unlike() makes per-region counts non-monotonic, so a strict
        # implementation would track likes and unlikes in separate counters
        # (a PN-Counter) instead of relying on max().
        for region, count in remote_counts.items():
            local_count = int(local_counts.get(region, 0))
            if count > local_count:
                await self.redis.hset(
                    f"likes:{post_id}:counts", 
                    region, 
                    count
                )

Conflict Resolution Strategies

┌─────────────────────────────────────────────────────────────────┐
│                  CONFLICT RESOLUTION                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. LAST-WRITE-WINS (LWW)                                      │
│     • Use timestamp to pick winner                             │
│     • Simple but can lose data                                 │
│     • Good for: User preferences, settings                     │
│                                                                 │
│  2. FIRST-WRITE-WINS                                           │
│     • First value is kept, others discarded                    │
│     • Good for: Unique constraints, reservations               │
│                                                                 │
│  3. MERGE/CRDT                                                  │
│     • Mathematically combine values                            │
│     • Counters: Add together                                   │
│     • Sets: Union                                              │
│     • Good for: Counters, shopping carts                       │
│                                                                 │
│  4. APPLICATION RESOLUTION                                      │
│     • Store all versions                                       │
│     • Let application/user decide                              │
│     • Good for: Documents, complex merges                      │
│                                                                 │
│  5. REGION PRIORITY                                             │
│     • Designate "primary" region for conflicts                 │
│     • Simple, deterministic                                    │
│     • Good for: When one region is authoritative               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
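
Last-write-wins (strategy 1 above) is worth being able to sketch, together with its weakness: two writes inside the clock-skew window and one side is silently dropped. A minimal sketch, assuming values carry a wall-clock timestamp and the writing region's id:

from dataclasses import dataclass

@dataclass
class VersionedValue:
    value: str
    timestamp_ms: int   # wall-clock write time; clock skew is the weak point
    region: str

def lww_resolve(a: VersionedValue, b: VersionedValue) -> VersionedValue:
    """Keep the newer write; tie-break on region id so every replica
    deterministically picks the same winner."""
    if a.timestamp_ms != b.timestamp_ms:
        return a if a.timestamp_ms > b.timestamp_ms else b
    return a if a.region > b.region else b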

Global Traffic Management

DNS-Based Routing

┌─────────────────────────────────────────────────────────────────┐
│                    DNS GEOLOCATION                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  User in Tokyo queries: api.example.com                        │
│                                                                 │
│        Tokyo User                                               │
│            │                                                    │
│            ▼                                                    │
│      ┌──────────┐                                              │
│      │   DNS    │ "Where is user from?"                        │
│      │ GeoDNS   │ → Tokyo IP detected                          │
│      └────┬─────┘                                              │
│           │                                                     │
│           ▼                                                     │
│      Returns: 13.x.x.x (Tokyo region IP)                       │
│                                                                 │
│  Route53 Health Checks:                                        │
│  • Primary: ap-northeast-1 (Tokyo)                             │
│  • Failover: us-west-2 (Oregon)                                │
│                                                                 │
│  If Tokyo fails → DNS returns Oregon IP                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
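
The behavior in the diagram reduces to a small routing decision: answer with the user's nearest region while its health check passes, otherwise fall back. An illustrative sketch only; real GeoDNS (Route53, NS1, etc.) does this at the DNS layer, and the country-to-region table here is made up:

REGION_BY_COUNTRY = {          # illustrative mapping, not real configuration
    "JP": "ap-northeast-1",
    "US": "us-west-2",
    "DE": "eu-west-1",
}
FAILOVER_ORDER = ["ap-northeast-1", "us-west-2", "eu-west-1"]

def resolve_region(country: str, healthy_regions: set) -> str:
    """Return the nearest healthy region, else the first healthy fallback."""
    preferred = REGION_BY_COUNTRY.get(country, "us-west-2")
    if preferred in healthy_regions:
        return preferred
    for region in FAILOVER_ORDER:
        if region in healthy_regions:
            return region
    raise RuntimeError("No healthy regions available")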

Anycast Routing

┌─────────────────────────────────────────────────────────────────┐
│                       ANYCAST                                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Same IP address announced from multiple locations             │
│  BGP routes to nearest location automatically                  │
│                                                                 │
│      ┌─────────┐    ┌─────────┐    ┌─────────┐                │
│      │ US-EAST │    │ EU-WEST │    │ AP-SOUTH│                │
│      │1.2.3.4  │    │1.2.3.4  │    │1.2.3.4  │                │
│      └─────────┘    └─────────┘    └─────────┘                │
│           ▲              ▲              ▲                       │
│           │              │              │                       │
│      BGP routes traffic to nearest location                    │
│                                                                 │
│  Used by: CDNs, DNS providers, DDoS protection                 │
│                                                                 │
│  Cloudflare: 300+ locations, all with same IPs                 │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Disaster Recovery

RTO and RPO

┌─────────────────────────────────────────────────────────────────┐
│                    RTO vs RPO                                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Disaster occurs here                                           │
│           │                                                     │
│  ─────────●─────────────────────────────────────────►          │
│           │                                                     │
│    ◄──────┼──────►  ◄────────────────────────►                 │
│      RPO  │               RTO                                   │
│  (Data    │         (Time to                                   │
│   Loss)   │          Recover)                                   │
│                                                                 │
│  RPO (Recovery Point Objective):                               │
│  • How much data can you afford to lose?                       │
│  • 0 = No data loss (sync replication)                         │
│  • 1 hour = Last hour of data may be lost                      │
│                                                                 │
│  RTO (Recovery Time Objective):                                │
│  • How long until service is restored?                         │
│  • 0 = Automatic failover (active-active)                      │
│  • 4 hours = Manual intervention OK                            │
│                                                                 │
│  Trade-off: Lower RTO/RPO = Higher cost                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Disaster Recovery Tiers

┌─────────────────────────────────────────────────────────────────┐
│                    DR TIERS                                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Tier 1: Backup & Restore                                      │
│  ─────────────────────                                          │
│  • Daily/hourly backups to another region                      │
│  • RPO: Hours to days                                          │
│  • RTO: Hours to days                                          │
│  • Cost: $                                                     │
│                                                                 │
│  Tier 2: Pilot Light                                           │
│  ────────────────────                                           │
│  • Core components running (DB replica)                        │
│  • Spin up compute on failover                                 │
│  • RPO: Minutes                                                │
│  • RTO: 10+ minutes                                            │
│  • Cost: $$                                                    │
│                                                                 │
│  Tier 3: Warm Standby                                          │
│  ─────────────────────                                          │
│  • Scaled-down version always running                          │
│  • Scale up on failover                                        │
│  • RPO: Seconds to minutes                                     │
│  • RTO: Minutes                                                │
│  • Cost: $$$                                                   │
│                                                                 │
│  Tier 4: Multi-Site Active-Active                              │
│  ────────────────────────────────                               │
│  • Full production in multiple regions                         │
│  • Automatic failover                                          │
│  • RPO: Near zero                                              │
│  • RTO: Seconds                                                │
│  • Cost: $$$$                                                  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Failover Implementation

from enum import Enum
from dataclasses import dataclass
from typing import List, Dict, Optional, Callable
import asyncio
import time

class RegionStatus(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"

@dataclass
class HealthCheck:
    region: str
    status: RegionStatus
    latency_ms: float
    last_check: float
    consecutive_failures: int

class GlobalFailoverManager:
    """
    Manages failover between regions based on health checks.
    """
    
    def __init__(
        self,
        regions: List[str],
        primary_region: str,
        health_check_fn: Callable,
        failure_threshold: int = 3,
        check_interval: float = 5.0
    ):
        self.regions = regions
        self.primary_region = primary_region
        self.current_primary = primary_region
        self.health_check_fn = health_check_fn
        self.failure_threshold = failure_threshold
        self.check_interval = check_interval
        
        self.health_status: Dict[str, HealthCheck] = {}
        self.failover_history: List[Dict] = []
        
        # Callbacks
        self.on_failover: Optional[Callable] = None
        self.on_failback: Optional[Callable] = None
    
    async def start_monitoring(self):
        """Start continuous health monitoring"""
        while True:
            await self._check_all_regions()
            await self._evaluate_failover()
            await asyncio.sleep(self.check_interval)
    
    async def _check_all_regions(self):
        """Check health of all regions"""
        tasks = [self._check_region(region) for region in self.regions]
        await asyncio.gather(*tasks, return_exceptions=True)
    
    async def _check_region(self, region: str):
        """Check health of a single region"""
        start = time.time()
        
        try:
            is_healthy = await self.health_check_fn(region)
            latency = (time.time() - start) * 1000
            
            if is_healthy:
                status = RegionStatus.HEALTHY if latency < 100 else RegionStatus.DEGRADED
                consecutive_failures = 0
            else:
                status = RegionStatus.UNHEALTHY
                prev = self.health_status.get(region)
                consecutive_failures = (prev.consecutive_failures + 1) if prev else 1
                
        except Exception:
            status = RegionStatus.UNHEALTHY
            latency = -1
            prev = self.health_status.get(region)
            consecutive_failures = (prev.consecutive_failures + 1) if prev else 1
        
        self.health_status[region] = HealthCheck(
            region=region,
            status=status,
            latency_ms=latency,
            last_check=time.time(),
            consecutive_failures=consecutive_failures
        )
    
    async def _evaluate_failover(self):
        """Decide if failover is needed"""
        primary_health = self.health_status.get(self.current_primary)
        
        if not primary_health:
            return
        
        # Check if primary has exceeded failure threshold
        if primary_health.consecutive_failures >= self.failure_threshold:
            # Find best healthy region
            new_primary = self._select_new_primary()
            
            if new_primary and new_primary != self.current_primary:
                await self._perform_failover(new_primary)
        
        # Check if original primary recovered (failback)
        elif (self.current_primary != self.primary_region and
              self.health_status.get(self.primary_region) and
              self.health_status[self.primary_region].status == RegionStatus.HEALTHY):
            
            # Original primary is healthy, consider failback
            await self._perform_failback()
    
    def _select_new_primary(self) -> Optional[str]:
        """Select best available region as new primary"""
        candidates = []
        
        for region, health in self.health_status.items():
            if region == self.current_primary:
                continue
            if health.status in [RegionStatus.HEALTHY, RegionStatus.DEGRADED]:
                candidates.append((region, health.latency_ms))
        
        if not candidates:
            return None
        
        # Select region with lowest latency
        candidates.sort(key=lambda x: x[1])
        return candidates[0][0]
    
    async def _perform_failover(self, new_primary: str):
        """Execute failover to new region"""
        old_primary = self.current_primary
        
        self.failover_history.append({
            "timestamp": time.time(),
            "type": "failover",
            "from": old_primary,
            "to": new_primary
        })
        
        self.current_primary = new_primary
        
        if self.on_failover:
            await self.on_failover(old_primary, new_primary)
        
        print(f"FAILOVER: {old_primary}{new_primary}")
    
    async def _perform_failback(self):
        """Return to original primary region"""
        current = self.current_primary
        
        self.failover_history.append({
            "timestamp": time.time(),
            "type": "failback",
            "from": current,
            "to": self.primary_region
        })
        
        self.current_primary = self.primary_region
        
        if self.on_failback:
            await self.on_failback(current, self.primary_region)
        
        print(f"FAILBACK: {current}{self.primary_region}")
    
    def get_current_primary(self) -> str:
        """Get current active primary region"""
        return self.current_primary
    
    def get_health_summary(self) -> Dict:
        """Get health status of all regions"""
        return {
            region: {
                "status": health.status.value,
                "latency_ms": health.latency_ms,
                "failures": health.consecutive_failures
            }
            for region, health in self.health_status.items()
        }
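
Wiring the manager up only requires a health-check coroutine per region plus callbacks that actually move traffic. A usage sketch; the probe function and the idea of flipping DNS weights in the callback are illustrative, not a specific provider's API:

async def check_region(region: str) -> bool:
    # Hypothetical probe: hit the region's /health endpoint or run a
    # lightweight read query, returning True on success.
    return await probe_health_endpoint(region)

async def update_routing(old_primary: str, new_primary: str):
    # e.g. shift DNS weights or load-balancer targets to the new primary
    print(f"Routing traffic: {old_primary}{new_primary}")

manager = GlobalFailoverManager(
    regions=["us-east-1", "us-west-2", "eu-west-1"],
    primary_region="us-east-1",
    health_check_fn=check_region,
    failure_threshold=3,     # ~15s of failures at the default 5s check interval
)
manager.on_failover = update_routing

# asyncio.run(manager.start_monitoring())   # runs until cancelled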

Data Locality Patterns

User Home Region

class UserHomeRegion:
    """
    Route users to their 'home' region based on their account location.
    Data stays in region for compliance.
    """
    
    def __init__(self, db):
        self.db = db
        self.region_mapping = {
            "US": ["us-east-1", "us-west-2"],
            "EU": ["eu-west-1", "eu-central-1"],
            "APAC": ["ap-northeast-1", "ap-southeast-1"],
        }
    
    async def get_user_region(self, user_id: str) -> str:
        """Get user's home region"""
        user = await self.db.get_user(user_id)
        
        # Determine region from user's country
        country = user.get("country", "US")
        
        if country in ["US", "CA", "MX"]:
            return self.region_mapping["US"][0]
        elif country in ["GB", "DE", "FR", "IT", "ES"]:
            return self.region_mapping["EU"][0]
        else:
            return self.region_mapping["APAC"][0]
    
    async def route_request(self, user_id: str, request):
        """Route request to user's home region"""
        home_region = await self.get_user_region(user_id)
        
        # _is_local_region / _handle_locally / _forward_to_region are
        # placeholders for routing plumbing (API gateway, service mesh, etc.)
        if self._is_local_region(home_region):
            # Process locally
            return await self._handle_locally(request)
        else:
            # Forward to the home region (cross-region proxying adds latency)
            return await self._forward_to_region(home_region, request)

Interview Tips

What interviewers expect for global architecture:
  1. Trade-offs awareness: “Active-active gives lower latency but introduces conflict complexity”
  2. Consistency models: “We’ll use eventual consistency with CRDT counters for likes”
  3. Failure handling: “If US-EAST fails, DNS routes to US-WEST in 30 seconds”
  4. Data sovereignty: “EU user data stays in EU for GDPR compliance”
  5. Cost awareness: “Multi-region doubles infrastructure cost”

Common Questions

Q: How do you handle a user who travels between regions?
A: Use sticky sessions or a user home region: route all of their data operations to their home region regardless of where they are currently connecting from.

Q: What consistency model would you use for a global e-commerce cart?
A: A CRDT-based cart that merges additions, since "add item" operations are commutative. For checkout, route to a single region for strong consistency. A sketch of the merge idea follows.
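
"Merges additions" usually means something like an add-wins set: each region keeps the set of items it has seen added, and merging takes the union. A minimal sketch (removals are deliberately ignored; a production cart would use an OR-Set or similar to also handle concurrent removes):

class AddWinsCart:
    """Toy add-wins cart: concurrent additions from different regions
    never conflict because merging is a set union."""

    def __init__(self):
        self.items = set()

    def add(self, item_id: str):
        self.items.add(item_id)

    def merge(self, other: "AddWinsCart"):
        self.items |= other.items   # union: commutative and idempotent

# us_cart.add("sku-1"); eu_cart.add("sku-2")
# after merging in both directions, both carts contain {"sku-1", "sku-2"}
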
Q: How do you test multi-region failover?
A: Chaos engineering: regularly kill regions in production, run quarterly GameDay exercises, and automate failover tests in staging.