Introduction

Content Delivery Networks (CDNs) and edge computing are critical for delivering fast, reliable experiences to users worldwide. Understanding these concepts is essential for designing systems at scale.
Interview Context: Questions about CDNs often come up when designing content-heavy systems (Netflix, YouTube) or discussing latency optimization strategies.

CDN Fundamentals

How CDNs Work

┌─────────────────────────────────────────────────────────────┐
│                    CDN Architecture                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   User in Tokyo                     User in New York        │
│        │                                   │                 │
│        ▼                                   ▼                 │
│   ┌─────────┐                         ┌─────────┐           │
│   │  Edge   │                         │  Edge   │           │
│   │  PoP    │                         │  PoP    │           │
│   │ Tokyo   │                         │ New York│           │
│   └────┬────┘                         └────┬────┘           │
│        │                                   │                 │
│        │     Cache Miss?                   │                 │
│        │          │                        │                 │
│        └──────────┼────────────────────────┘                 │
│                   │                                          │
│                   ▼                                          │
│            ┌─────────────┐                                   │
│            │   Origin    │                                   │
│            │   Server    │                                   │
│            │ (US-West)   │                                   │
│            └─────────────┘                                   │
│                                                              │
└─────────────────────────────────────────────────────────────┘

PoP = Point of Presence (edge location)

CDN Request Flow

1. User requests: https://cdn.example.com/video.mp4

2. DNS Resolution:
   cdn.example.com → Anycast IP → Nearest PoP

3. Edge Check:
   ┌─────────────────────────────────────────────┐
   │ Edge Server                                 │
   ├─────────────────────────────────────────────┤
   │ if cache_hit:                               │
   │     return cached_content  (< 50ms)        │
   │ else:                                       │
   │     fetch_from_origin()    (200-500ms)     │
   │     cache_locally()                         │
   │     return content                          │
   └─────────────────────────────────────────────┘

4. Headers Returned:
   X-Cache: HIT (or MISS)
   X-Edge-Location: NRT52 (Tokyo)
   Cache-Control: max-age=86400
   Age: 3600 (seconds since cached)
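
The edge check in step 3 and the headers in step 4 can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not how any particular CDN implements it: `EdgeCache`, `fetch_from_origin`, and the injectable `clock` are hypothetical names, and the TTL is passed in directly rather than parsed from `Cache-Control`.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one edge server: serve from cache or pull from origin."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: int = 86400,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch_from_origin  # hypothetical origin client
        self._ttl = ttl_seconds
        self._clock = clock
        # path -> (body, time the object was stored)
        self._store: Dict[str, Tuple[bytes, float]] = {}

    def get(self, path: str) -> Tuple[bytes, Dict[str, str]]:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now - entry[1] < self._ttl:
            body, stored_at = entry
            x_cache, age = "HIT", int(now - stored_at)
        else:
            # Miss or expired: go to origin and cache the result locally.
            body = self._fetch(path)
            self._store[path] = (body, now)
            x_cache, age = "MISS", 0
        headers = {
            "X-Cache": x_cache,
            "Age": str(age),
            "Cache-Control": f"max-age={self._ttl}",
        }
        return body, headers
```

Injecting the clock keeps the sketch testable: advancing a fake clock past the TTL turns the next request back into a MISS.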

Caching Strategies

Cache Control Headers

# Cache for 1 day, allow CDN caching
Cache-Control: public, max-age=86400

# Cache for 1 hour; once stale, revalidate with the origin before reuse
Cache-Control: public, max-age=3600, must-revalidate

# Don't cache at all
Cache-Control: no-store, no-cache

# Private (browser only, not CDN)
Cache-Control: private, max-age=3600

# Stale-while-revalidate (serve stale while fetching fresh)
Cache-Control: max-age=3600, stale-while-revalidate=86400
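
To make the effect of these directives concrete, here is a rough sketch of how a shared cache such as a CDN might turn a `Cache-Control` value into a storage decision. The function names are hypothetical, only a handful of directives are handled, and treating a response with no explicit lifetime as immediately stale is a conservative assumption (real caches may apply heuristics).

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into lowercase directive -> argument."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_ttl(header: str) -> Optional[int]:
    """Seconds a CDN may serve the response without revalidating.

    Returns None when the CDN must not store the response at all.
    """
    d = parse_cache_control(header)
    if "no-store" in d or "private" in d:
        return None
    if "no-cache" in d:
        return 0  # storable, but must be revalidated before every reuse
    # s-maxage applies only to shared caches and overrides max-age there.
    for name in ("s-maxage", "max-age"):
        value = d.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return 0  # assumption: no explicit lifetime means immediately stale
```

For the examples above, `shared_cache_ttl("public, max-age=86400")` yields 86400, while the `private` and `no-store` variants yield `None`.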

Cache Key Design

from hashlib import sha256
from urllib.parse import urlparse, parse_qs
from typing import Dict, List, Optional

class CacheKeyGenerator:
    """
    Generate cache keys for CDN.
    
    Key considerations:
    - URL path
    - Query parameters (some, not all)
    - Headers (Accept-Encoding, Accept-Language)
    - Cookies (for personalization)
    - Device type
    """
    
    def __init__(
        self,
        include_query_params: Optional[List[str]] = None,
        exclude_query_params: Optional[List[str]] = None,
        vary_headers: Optional[List[str]] = None
    ):
        self.include_params = include_query_params or []
        self.exclude_params = exclude_query_params or [
            "utm_source", "utm_medium", "utm_campaign",  # Analytics
            "fbclid", "gclid",  # Tracking
            "_", "timestamp"   # Cache busters
        ]
        self.vary_headers = vary_headers or [
            "Accept-Encoding",
            "Accept-Language"
        ]
    
    def generate_key(
        self,
        url: str,
        headers: Optional[Dict[str, str]] = None,
        cookies: Optional[Dict[str, str]] = None
    ) -> str:
        """Generate a cache key for the request."""
        parsed = urlparse(url)
        query_params = parse_qs(parsed.query)
        
        # Filter query parameters
        filtered_params = self._filter_query_params(query_params)
        
        # Build key components
        components = [
            parsed.netloc,
            parsed.path,
            self._normalize_params(filtered_params)
        ]
        
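        # Add vary header values (header names are case-insensitive, so
        # normalize before looking them up)
        if headers:
            normalized = {name.lower(): value for name, value in headers.items()}
            for header in self.vary_headers:
                value = normalized.get(header.lower())
                if value is not None:
                    components.append(f"{header}={value}")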
        
        # Add personalization segment (optional)
        if cookies and "segment" in cookies:
            components.append(f"segment={cookies['segment']}")
        
        # Generate hash
        key_string = ":".join(components)
        return sha256(key_string.encode()).hexdigest()[:32]
    
    def _filter_query_params(
        self, 
        params: Dict[str, List[str]]
    ) -> Dict[str, List[str]]:
        """Filter out excluded parameters."""
        if self.include_params:
            return {
                k: v for k, v in params.items() 
                if k in self.include_params
            }
        return {
            k: v for k, v in params.items() 
            if k not in self.exclude_params
        }
    
    def _normalize_params(
        self, 
        params: Dict[str, List[str]]
    ) -> str:
        """Sort and normalize parameters for consistent keys."""
        sorted_params = sorted(params.items())
        return "&".join(
            f"{k}={','.join(sorted(v))}" 
            for k, v in sorted_params
        )


class CacheTTLPolicy:
    """Determine TTL based on content type."""
    
    DEFAULT_TTL = 3600  # 1 hour
    
    TTL_BY_EXTENSION = {
        # Static assets - long TTL
        ".js": 31536000,    # 1 year
        ".css": 31536000,
        ".woff2": 31536000,
        ".woff": 31536000,
        
        # Images - medium TTL
        ".jpg": 86400,      # 1 day
        ".jpeg": 86400,
        ".png": 86400,
        ".gif": 86400,
        ".webp": 86400,
        
        # Video - long TTL
        ".mp4": 604800,     # 1 week
        ".webm": 604800,
        
        # HTML - short TTL
        ".html": 300,       # 5 minutes
        
        # API responses
        ".json": 60,        # 1 minute
    }
    
    @classmethod
    def get_ttl(cls, path: str, content_type: Optional[str] = None) -> int:
        """Get appropriate TTL for content."""
        # Check by extension
        for ext, ttl in cls.TTL_BY_EXTENSION.items():
            if path.endswith(ext):
                return ttl
        
        # Check by content type
        if content_type:
            if "image" in content_type:
                return 86400
            if "video" in content_type:
                return 604800
            if "javascript" in content_type or "css" in content_type:
                return 31536000
        
        return cls.DEFAULT_TTL

Cache Invalidation

┌─────────────────────────────────────────────────────────────┐
│               Cache Invalidation Strategies                  │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. TTL-Based (Time-to-Live)                                │
│     Cache-Control: max-age=3600                             │
│     • Simple, predictable                                   │
│     • May serve stale content                               │
│                                                              │
│  2. Purge (Immediate)                                       │
│     POST /purge/path/to/content                             │
│     • Instant invalidation                                   │
│     • Requires purge API access                              │
│                                                              │
│  3. Soft Purge (Stale-while-revalidate)                     │
│     Mark as stale, serve while fetching fresh               │
│     • Better availability                                    │
│     • Brief inconsistency                                    │
│                                                              │
│  4. Versioned URLs                                          │
│     /assets/app.abc123.js                                   │
│     • Never invalidate, use new URL                          │
│     • Best for static assets                                 │
│                                                              │
│  5. Tag-Based Invalidation                                  │
│     Surrogate-Key: product-123, category-shoes              │
│     Purge by tag: all pages with product-123                │
│     • Efficient bulk invalidation                            │
│     • Requires CDN support                                   │
│                                                              │
└─────────────────────────────────────────────────────────────┘
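
Strategy 5 is the least obvious of the five, so a small model may help. The sketch below keeps a reverse index from tag to cached URLs so that one purge call removes every page carrying that tag. `TaggedCache` and its methods are hypothetical names for illustration; real CDNs expose this through their own purge APIs and header formats.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    """In-memory cache that can purge every entry carrying a given tag."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}
        self._tags_by_url: Dict[str, Set[str]] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def put(self, url: str, body: bytes, surrogate_keys: Iterable[str]) -> None:
        self.purge_url(url)  # drop stale tag links if the URL is re-cached
        tags = set(surrogate_keys)
        self._entries[url] = body
        self._tags_by_url[url] = tags
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._entries.get(url)

    def purge_url(self, url: str) -> None:
        """Remove one entry and unlink it from all of its tags."""
        self._entries.pop(url, None)
        for tag in self._tags_by_url.pop(url, set()):
            self._urls_by_tag[tag].discard(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every entry tagged with `tag`; return how many were purged."""
        urls = list(self._urls_by_tag.pop(tag, set()))
        for url in urls:
            self.purge_url(url)
        return len(urls)
```

With a product page tagged `product-123` and `category-shoes`, purging `product-123` removes that page but leaves a category page tagged only `category-shoes` in place.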

Edge Computing

Edge Functions/Workers

┌─────────────────────────────────────────────────────────────┐
│                    Edge Computing                            │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Traditional:                                               │
│  User ───► CDN (static only) ───► Origin Server            │
│                                                              │
│  Edge Computing:                                            │
│  User ───► Edge (compute + cache) ───► Origin (optional)   │
│                                                              │
│  Edge capabilities:                                         │
│  • Run JavaScript/WASM at edge                              │
│  • Modify requests/responses                                │
│  • Access edge KV storage                                    │
│  • Make subrequests                                          │
│  • A/B testing, personalization                             │
│  • Authentication/authorization                              │
│  • API routing                                               │
│                                                              │
└─────────────────────────────────────────────────────────────┘

// Cloudflare Worker example (Service Worker syntax).
// TOKENS and RATELIMITS are assumed to be Workers KV namespace bindings.

addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
    const url = new URL(request.url);
    
    // 1. A/B Testing
    const variant = getABVariant(request);
    
    // 2. Geographic routing
    const country = request.cf.country;
    const region = getRegion(country);
    
    // 3. Authentication at edge
    const authResult = await validateToken(request);
    if (!authResult.valid) {
        return new Response('Unauthorized', { status: 401 });
    }
    
    // 4. Rate limiting
    const rateLimited = await checkRateLimit(
        request.headers.get('CF-Connecting-IP')
    );
    if (rateLimited) {
        return new Response('Too Many Requests', { status: 429 });
    }
    
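    // 5. Modify request before sending to origin
    // Copy the incoming headers, then overwrite the edge-controlled ones so
    // a client cannot spoof them.
    const requestHeaders = new Headers(request.headers);
    requestHeaders.set('X-AB-Variant', variant);
    requestHeaders.set('X-User-Region', region);
    requestHeaders.set('X-User-ID', authResult.userId);
    const modifiedRequest = new Request(request, { headers: requestHeaders });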
    
    // 6. Fetch from origin (or cache)
    const response = await fetch(modifiedRequest);
    
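    // 7. Modify response before returning
    // Copy headers with the Headers constructor so repeated headers such as
    // Set-Cookie survive (Object.fromEntries keeps only one value per name).
    const responseHeaders = new Headers(response.headers);
    responseHeaders.set('X-Edge-Location', request.cf.colo);
    responseHeaders.set(
        'X-Cache-Status',
        response.headers.get('CF-Cache-Status') || 'UNKNOWN'
    );
    const modifiedResponse = new Response(response.body, {
        status: response.status,
        statusText: response.statusText,
        headers: responseHeaders
    });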
    
    return modifiedResponse;
}

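function getABVariant(request) {
    // Deterministic bucketing: the same user cookie (or IP) always maps
    // to the same variant
    const cookie = request.headers.get('Cookie') || '';
    const userId = extractUserId(cookie) ||
                   request.headers.get('CF-Connecting-IP') || '';

    // Simple hash for variant selection
    const hash = simpleHash(userId);
    return hash % 2 === 0 ? 'A' : 'B';
}

// Illustrative helper: read a "user_id" cookie (the cookie name is an assumption)
function extractUserId(cookie) {
    const match = cookie.match(/(?:^|;\s*)user_id=([^;]+)/);
    return match ? match[1] : null;
}

// Illustrative helper: 32-bit FNV-1a hash, returned as an unsigned integer
function simpleHash(str) {
    let hash = 0x811c9dc5;
    for (let i = 0; i < str.length; i++) {
        hash ^= str.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    return hash >>> 0;
}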

function getRegion(country) {
    const regions = {
        'US': 'na',
        'CA': 'na',
        'GB': 'eu',
        'DE': 'eu',
        'FR': 'eu',
        'JP': 'apac',
        'SG': 'apac',
        'AU': 'apac'
    };
    return regions[country] || 'default';
}

async function validateToken(request) {
    const authHeader = request.headers.get('Authorization');
    if (!authHeader?.startsWith('Bearer ')) {
        return { valid: false };
    }
    
    const token = authHeader.slice(7);
    
    // Check edge KV for token (or validate JWT)
    const userData = await TOKENS.get(token);
    if (!userData) {
        return { valid: false };
    }
    
    return { valid: true, userId: JSON.parse(userData).userId };
}

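async function checkRateLimit(ip) {
    // KV is eventually consistent and this read-modify-write is not atomic,
    // so the count is approximate under concurrent requests. Each put() also
    // renews the 60-second expiry, so the counter only resets after 60 seconds
    // without a counted request.
    const key = `ratelimit:${ip}`;
    const current = parseInt(await RATELIMITS.get(key) || '0', 10);

    if (current >= 100) {  // 100 requests per window
        return true;
    }

    await RATELIMITS.put(key, String(current + 1), {
        expirationTtl: 60
    });

    return false;
}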
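
The counting logic behind `checkRateLimit` is easier to reason about outside the Worker runtime. Below is a minimal Python sketch of a fixed-window counter with the same limit of 100 requests per 60 seconds. It keeps state in process memory, so it models a single edge node only; `FixedWindowRateLimiter` and the injectable `clock` are hypothetical names.

```python
import time
from typing import Callable, Dict


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client in each fixed time window."""

    def __init__(
        self,
        limit: int = 100,
        window_seconds: int = 60,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        self._window_id = -1
        self._counts: Dict[str, int] = {}

    def allow(self, client_ip: str) -> bool:
        window_id = int(self._clock() // self.window_seconds)
        if window_id != self._window_id:
            # A new window started: forget all counters from the old one.
            self._window_id = window_id
            self._counts = {}
        count = self._counts.get(client_ip, 0)
        if count >= self.limit:
            return False
        self._counts[client_ip] = count + 1
        return True
```

A fixed window is simple but lets a client send up to twice the limit across a window boundary; sliding-window or token-bucket variants trade a little state for smoother behavior.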

CDN Architecture Patterns

Multi-Tier Caching

┌─────────────────────────────────────────────────────────────┐
│                  Multi-Tier Cache Architecture               │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   Tier 1: Edge PoPs (200+ locations)                        │
│   ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐            │
│   │Tokyo │ │NYC   │ │London│ │Sydney│ │ ...  │            │
│   │ 10ms │ │ 10ms │ │ 10ms │ │ 10ms │ │      │            │
│   └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ └──────┘            │
│      │        │        │        │                          │
│   Tier 2: Regional Shields (5-10 locations)                │
│   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐       │
│   │   US-East   │  │    EU       │  │    APAC     │       │
│   │    50ms     │  │    50ms     │  │    50ms     │       │
│   └──────┬──────┘  └──────┬──────┘  └──────┬──────┘       │
│          │                │                │               │
│   Tier 3: Origin Shield (1-2 locations)                    │
│   ┌─────────────────────────────────────────────┐          │
│   │              Origin Shield                   │          │
│   │               100ms                          │          │
│   └────────────────────┬────────────────────────┘          │
│                        │                                    │
│   Origin Servers                                           │
│   ┌─────────────────────────────────────────────┐          │
│   │         Origin (database, compute)           │          │
│   └─────────────────────────────────────────────┘          │
│                                                              │
│   Benefits:                                                 │
│   • Reduce origin load (cache hit at each tier)            │
│   • Faster cache fills (from nearest tier)                 │
│   • Better availability (tier isolation)                   │
│                                                              │
└─────────────────────────────────────────────────────────────┘
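
The lookup path through these tiers can be sketched as a walk from the nearest cache to the farthest, filling every tier that missed on the way back. This is a simplified model with no TTLs or eviction; `CacheTier`, `tiered_get`, and the tier names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class CacheTier:
    """One cache layer (edge PoP, regional shield, or origin shield)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}


def tiered_get(
    path: str,
    tiers: List[CacheTier],
    fetch_from_origin: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Return (body, name of the layer that served it).

    `tiers` is ordered nearest-first, e.g. [edge, regional, origin_shield].
    """
    missed: List[CacheTier] = []
    for tier in tiers:
        if path in tier.store:
            body, source = tier.store[path], tier.name
            break
        missed.append(tier)
    else:
        # Every tier missed, so the request finally reaches the origin.
        body, source = fetch_from_origin(path), "origin"
    for tier in missed:
        tier.store[path] = body  # fill the tiers that missed
    return body, source
```

Once one edge has pulled an object through a shared regional tier, a second edge in the same region is filled from that tier and the origin sees no additional request.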

Pull vs Push CDN

Pull CDN (Lazy Loading):
┌─────────────────────────────────────────────────────────────┐
│ 1. User requests content                                    │
│ 2. CDN checks cache → MISS                                  │
│ 3. CDN fetches from origin                                  │
│ 4. CDN caches and returns                                   │
│ 5. Subsequent requests → HIT                                │
│                                                              │
│ Pros: Simple, automatic, no pre-warming needed              │
│ Cons: First request slow (cold cache)                       │
│ Best for: Websites, APIs, unpredictable access patterns     │
└─────────────────────────────────────────────────────────────┘

Push CDN (Pre-Loading):
┌─────────────────────────────────────────────────────────────┐
│ 1. Origin pushes content to CDN                             │
│ 2. CDN distributes to edge locations                        │
│ 3. Users always get cache HIT                               │
│                                                              │
│ Pros: No cold cache, predictable latency                    │
│ Cons: Manual management, storage costs                      │
│ Best for: Video streaming, software downloads               │
└─────────────────────────────────────────────────────────────┘

Video Streaming Architecture

Adaptive Bitrate Streaming

┌─────────────────────────────────────────────────────────────┐
│               Video Delivery Architecture                    │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   Source Video                                              │
│   ┌──────────────┐                                          │
│   │ master.mp4   │                                          │
│   │   4K/60fps   │                                          │
│   └──────┬───────┘                                          │
│          │                                                   │
│   Transcoding Pipeline                                      │
│   ┌──────▼───────┐                                          │
│   │  Transcoder  │                                          │
│   └──────┬───────┘                                          │
│          │                                                   │
│   ┌──────┴──────────────────────────────────────┐           │
│   │                                             │           │
│   ▼           ▼           ▼           ▼         ▼           │
│ 240p       480p       720p       1080p       4K           │
│ 400kbps   1Mbps      3Mbps      6Mbps      15Mbps        │
│                                                              │
│   Each quality → Segments (2-10 seconds each)               │
│                                                              │
│   Manifest (HLS/DASH)                                       │
│   ┌──────────────────────────────────────────┐              │
│   │ #EXTM3U                                  │              │
│   │ #EXT-X-STREAM-INF:BANDWIDTH=400000       │              │
│   │ 240p/playlist.m3u8                       │              │
│   │ #EXT-X-STREAM-INF:BANDWIDTH=1000000      │              │
│   │ 480p/playlist.m3u8                       │              │
│   │ ...                                       │              │
│   └──────────────────────────────────────────┘              │
│                                                              │
└─────────────────────────────────────────────────────────────┘
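
As an illustration, the manifest excerpt in the diagram could be generated from the bitrate ladder like this. The function name and the `4k` directory name are assumptions; real master playlists usually also carry attributes such as `RESOLUTION` and `CODECS`, which are omitted here to match the excerpt.

```python
from typing import List, Tuple

# (rendition directory, peak bandwidth in bits per second), from the ladder above.
# The "4k" directory name is an assumption; the diagram only shows 240p and 480p paths.
LADDER: List[Tuple[str, int]] = [
    ("240p", 400_000),
    ("480p", 1_000_000),
    ("720p", 3_000_000),
    ("1080p", 6_000_000),
    ("4k", 15_000_000),
]


def build_master_playlist(ladder: List[Tuple[str, int]]) -> str:
    """Render a minimal HLS master playlist with one variant per rendition."""
    lines = ["#EXTM3U"]
    for name, bandwidth in ladder:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth}")
        lines.append(f"{name}/playlist.m3u8")
    return "\n".join(lines) + "\n"
```

Each variant entry points at a per-quality media playlist, which in turn lists the 2-10 second segments the CDN caches individually.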

Client Adaptive Logic:
┌─────────────────────────────────────────────────────────────┐
│ while playing:                                              │
│   bandwidth = measure_current_bandwidth()                   │
│   buffer_level = get_buffer_level()                         │
│                                                              │
│   if buffer_level < 5s:                                     │
│       switch_to_lower_quality()  # Prevent rebuffering     │
│   elif bandwidth > current_bitrate * 1.5:                   │
│       switch_to_higher_quality() # Improve experience       │
│                                                              │
│   fetch_next_segment(selected_quality)                      │
└─────────────────────────────────────────────────────────────┘
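
The client loop above can be written as a small pure function. This is a direct translation of the pseudocode, assuming the player steps one rung at a time and that "quality" in the comparison means the current rendition's bitrate; the function and parameter names are hypothetical, and production players use more elaborate heuristics.

```python
from typing import Sequence

# Bitrate ladder from the diagram above, in bits per second.
BITRATES_BPS = (400_000, 1_000_000, 3_000_000, 6_000_000, 15_000_000)


def next_quality_index(
    current: int,
    bandwidth_bps: float,
    buffer_seconds: float,
    bitrates: Sequence[int] = BITRATES_BPS,
    low_buffer_seconds: float = 5.0,
    headroom: float = 1.5,
) -> int:
    """Pick the ladder index to use for the next segment."""
    if buffer_seconds < low_buffer_seconds:
        # Buffer is nearly empty: step down to avoid rebuffering.
        return max(current - 1, 0)
    if bandwidth_bps > bitrates[current] * headroom:
        # Comfortable headroom over the current bitrate: step up one rung.
        return min(current + 1, len(bitrates) - 1)
    return current
```

Checking the buffer before the bandwidth means a player with a draining buffer steps down even if its latest bandwidth sample looks healthy.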

Performance Optimization

Edge Performance Techniques

1. Connection Optimization:
   • HTTP/2 or HTTP/3 (QUIC)
   • Connection pooling to origin
   • TLS session resumption
   • 0-RTT connections

2. Compression:
   • Brotli compression (better than gzip)
   • WebP/AVIF for images
   • Minification at edge

3. Request Collapsing:
   Multiple simultaneous requests for same resource
   → Single origin request
   → Response distributed to all waiters

4. Predictive Prefetching:
   • Analyze user behavior
   • Pre-warm cache for likely next requests
   • Hint resources via preload or 103 Early Hints (HTTP/2 server push has been removed from Chrome)

5. Image Optimization:
   • On-the-fly resizing
   • Format conversion (WebP/AVIF)
   • Quality adjustment based on network
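
Of the techniques above, request collapsing (item 3) is the one that most benefits from seeing code. The sketch below uses `asyncio` so that concurrent requests for the same key await a single in-flight origin fetch. `RequestCollapser` and `fetch` are hypothetical names, and the sketch only collapses requests: it does not cache results or handle cancellation.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class RequestCollapser:
    """Share one in-flight origin fetch among concurrent requests for a key."""

    def __init__(self, fetch: Callable[[str], Awaitable[bytes]]) -> None:
        self._fetch = fetch  # hypothetical async origin client
        self._in_flight: Dict[str, "asyncio.Future[bytes]"] = {}

    async def get(self, key: str) -> bytes:
        existing = self._in_flight.get(key)
        if existing is not None:
            # Another request is already fetching this key: wait for it.
            return await existing
        task = asyncio.ensure_future(self._fetch(key))
        self._in_flight[key] = task
        try:
            return await task
        finally:
            # Later requests start a fresh fetch (or hit a cache layer).
            self._in_flight.pop(key, None)
```

This is the usual defense against a thundering herd on a cold or just-expired object: five simultaneous misses produce one origin request instead of five.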

Latency Breakdown

Total Latency (without CDN, illustrative figures):
┌──────────────────────────────────────────────────────────┐
│ DNS (50ms) + TCP (100ms) + TLS (50ms) + TTFB (200ms)    │
│                                                          │
│ Total: ~400ms minimum                                    │
└──────────────────────────────────────────────────────────┘

With CDN (cache hit):
┌──────────────────────────────────────────────────────────┐
│ DNS (5ms) + TCP (10ms) + TLS (5ms) + Edge (5ms)         │
│                                                          │
│ Total: ~25ms                                             │
└──────────────────────────────────────────────────────────┘

Optimization impact:
• Anycast DNS: 50ms → 5ms
• Edge proximity: 100ms TCP → 10ms
• TLS resumption: 50ms → 5ms
• Cache hit: 200ms → 5ms
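
The totals above are simple sums, and the same arithmetic extends to the average latency users see at a given cache hit ratio. In the sketch below the component figures come from the breakdown above, while the miss latency passed to `expected_latency_ms` is an assumed input, since the breakdown only covers the cache-hit case.

```python
from typing import Dict

# Component latencies in milliseconds, taken from the breakdown above.
WITHOUT_CDN_MS: Dict[str, int] = {"dns": 50, "tcp": 100, "tls": 50, "ttfb": 200}
CDN_HIT_MS: Dict[str, int] = {"dns": 5, "tcp": 10, "tls": 5, "edge": 5}


def total_ms(components: Dict[str, int]) -> int:
    """Sum the per-phase latencies of one request."""
    return sum(components.values())


def expected_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Average latency when a fraction `hit_ratio` of requests hit the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms
```

For example, with a 90% hit ratio and an assumed 300 ms miss, `expected_latency_ms(0.9, total_ms(CDN_HIT_MS), 300)` comes to about 52.5 ms, which shows how strongly the miss path dominates the average.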

Security at the Edge

DDoS Protection

┌─────────────────────────────────────────────────────────────┐
│               Edge DDoS Protection Layers                    │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Layer 3/4: Network Level                                   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ • Anycast distribution (absorb volume)              │   │
│  │ • SYN flood protection                              │   │
│  │ • UDP amplification filtering                        │   │
│  │ • IP reputation blocking                             │   │
│  └─────────────────────────────────────────────────────┘   │
│                         │                                    │
│                         ▼                                    │
│  Layer 7: Application Level                                 │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ • Rate limiting per IP/path                         │   │
│  │ • Bot detection (CAPTCHA, JS challenge)             │   │
│  │ • WAF rules (SQL injection, XSS)                    │   │
│  │ • Behavioral analysis                                │   │
│  └─────────────────────────────────────────────────────┘   │
│                         │                                    │
│                         ▼                                    │
│  Origin Protection                                          │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ • Origin IP hidden behind CDN                        │   │
│  │ • Authentication headers to origin                   │   │
│  │ • Allowlist only CDN IPs                             │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Cost Optimization

CDN Pricing Model

Cost Components (illustrative prices; actual rates vary by provider):
┌────────────────────────────────────────────────────────────┐
│                                                            │
│  1. Bandwidth (egress from edge)                          │
│     $0.05 - $0.15 per GB (varies by region)               │
│     Committed use discounts available                      │
│                                                            │
│  2. Requests                                               │
│     $0.01 per 10,000 requests (HTTP)                      │
│     Higher for HTTPS                                       │
│                                                            │
│  3. Edge Compute                                           │
│     $0.50 per million invocations                          │
│     Plus duration charges                                  │
│                                                            │
│  4. Storage (origin shield, KV)                           │
│     $0.02 per GB/month                                     │
│                                                            │
└────────────────────────────────────────────────────────────┘

Optimization Strategies:
• Higher cache hit ratio = lower origin bandwidth
• Compress content = lower edge bandwidth
• Consolidate requests = lower request count
• Use tiered caching = fewer origin fetches

Interview Tips

Common CDN Interview Questions:
  1. “How would you design a video streaming service?”
    • Discuss transcoding, ABR, CDN caching, chunk size
  2. “How do you handle cache invalidation?”
    • Versioned URLs, purge APIs, TTL strategies
  3. “What happens on a cache miss?”
    • Request flow, origin shield, thundering herd
  4. “How do you optimize for global users?”
    • PoP distribution, DNS routing, multi-tier caching
  5. “How do you secure content at the edge?”
    • Signed URLs, token auth, WAF, DDoS protection
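
For question 5, it helps to be able to sketch a signed URL on the whiteboard. The example below is one common shape, assuming an HMAC-SHA256 signature over the path and an expiry timestamp; the parameter names `expires` and `sig` and both function names are hypothetical, and each CDN defines its own format.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def sign_url(path: str, secret: bytes, expires_at: int) -> str:
    """Append an expiry and an HMAC signature to a path."""
    message = f"{path}?expires={expires_at}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires_at, 'sig': signature})}"


def verify_url(url: str, secret: bytes, now: Optional[float] = None) -> bool:
    """Check at the edge that the URL is unexpired and untampered."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if (time.time() if now is None else now) >= expires_at:
        return False
    message = f"{parsed.path}?expires={expires_at}".encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the signature via timing.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Because the edge only needs the shared secret to verify, no origin round trip is required, which is why signed URLs pair well with cached video segments and downloads.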

Practice Problem

Design a Global Image Delivery Service

Requirements:
  • Serve 1 billion images per day
  • Support on-the-fly resizing and format conversion
  • Sub-100ms latency globally
  • Cost-effective storage and delivery
Consider:
  1. How would you structure the URL scheme?
  2. Where would you do image transformations?
  3. How would you handle cache invalidation?
  4. What’s your strategy for unpopular images?