Content Delivery Networks (CDNs) and edge computing are critical for delivering fast, reliable experiences to users worldwide. Understanding these concepts is essential for designing systems at scale. Here is the core insight: no matter how fast your code runs, physics imposes a hard floor on latency. Light in a fiber optic cable travels at roughly 200,000 km/s, which means a round trip from New York to Singapore takes at minimum ~160ms — and that is before your server does any work. CDNs attack this problem by moving data and compute closer to users, turning 200ms responses into 20ms responses for the vast majority of requests.
Interview Context: Questions about CDNs often come up when designing content-heavy systems (Netflix, YouTube) or discussing latency optimization strategies.
Edge computing represents a fundamental shift: instead of the edge being a “dumb cache” that only stores static files, it becomes a programmable layer that can run your code. This means decisions that used to require a round trip to your origin server — authentication checks, A/B test assignments, geographic routing, personalization — can now happen in 5ms at the edge instead of 200ms at the origin. The trade-off is that edge environments are constrained: limited CPU time (typically 10-50ms), limited memory, no persistent connections, and cold starts can add latency. Design your edge logic to be small, fast, and stateless.
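To make the shape of this concrete, here is a minimal sketch of a stateless edge handler in the Cloudflare-Workers style; the origin URL, cookie name, and bucketing rule are illustrative, not from any particular product:

```ts
// Minimal Cloudflare-Workers-style handler: stateless decisions at the edge.
// ORIGIN_URL and the bucketing rule are illustrative.
const ORIGIN_URL = "https://origin.example.com";

export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);

    // A/B assignment: hash a stable identifier (here, a cookie) into a bucket.
    // No origin round trip, no stored state -- the same input always maps
    // to the same variant.
    const cookie = request.headers.get("Cookie") ?? "";
    const userId = /uid=([^;]+)/.exec(cookie)?.[1] ?? "anonymous";
    const bucket = hash(userId) % 100 < 50 ? "control" : "treatment";

    // Geographic routing: CDNs expose the client's country code as a request
    // property or header (Cloudflare's CF-IPCountry, CloudFront-Viewer-Country).
    const country = request.headers.get("CF-IPCountry") ?? "US";

    // Forward to origin with the decisions attached; the origin stays simple.
    const upstream = new Request(ORIGIN_URL + url.pathname, request);
    upstream.headers.set("X-AB-Bucket", bucket);
    upstream.headers.set("X-Country", country);
    return fetch(upstream);
  },
};

// FNV-1a: a tiny, stable, non-cryptographic hash -- fine for bucketing.
function hash(s: string): number {
  let h = 2166136261;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return h >>> 0;
}
```

Note that everything here is derivable from the request itself, which is exactly what "small, fast, and stateless" means in practice.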
Common CDN Interview Questions and Strong Answer Patterns:
“How would you design a video streaming service?”
Start with the transcoding pipeline (source to multiple resolutions), then discuss HLS/DASH adaptive bitrate streaming, CDN distribution with multi-tier caching, and segment size trade-offs (smaller segments = faster quality switching but more requests; larger segments = fewer requests but slower adaptation).
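The player-side half of adaptive bitrate is worth being able to sketch in an interview. A minimal rendition-selection loop, with an illustrative bitrate ladder and safety factor (real ladders come from the transcoding pipeline and are advertised in the HLS/DASH master manifest):

```ts
// Pick the highest rendition whose bitrate fits within measured bandwidth,
// with a safety margin so a small dip does not immediately cause a stall.
interface Rendition {
  name: string;
  bitrateKbps: number;
}

const LADDER: Rendition[] = [
  { name: "240p", bitrateKbps: 400 },
  { name: "480p", bitrateKbps: 1_200 },
  { name: "720p", bitrateKbps: 3_000 },
  { name: "1080p", bitrateKbps: 6_000 },
  { name: "4K", bitrateKbps: 15_000 },
];

function pickRendition(measuredKbps: number, safetyFactor = 0.8): Rendition {
  const budget = measuredKbps * safetyFactor;
  // Scan from highest to lowest; fall back to the lowest rung if nothing fits.
  for (let i = LADDER.length - 1; i >= 0; i--) {
    if (LADDER[i].bitrateKbps <= budget) return LADDER[i];
  }
  return LADDER[0];
}

// After each downloaded segment the player re-measures throughput and re-picks,
// which is why smaller segments adapt faster (more measurement points).
console.log(pickRendition(8_000).name); // "1080p" (8000 * 0.8 = 6400 >= 6000)
```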
“How do you handle cache invalidation?”
Lead with versioned URLs for static assets (the cache never needs invalidation because the URL changes), then discuss purge APIs for dynamic content, and stale-while-revalidate as a pattern that preserves availability during revalidation. The key insight: “There are only two hard things in computer science — cache invalidation and naming things.”
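As a concrete reference, here is what those two header patterns might look like; the values are typical choices, not prescriptive:

```ts
// Illustrative payloads so the snippet stands alone.
const cssBytes = "body { margin: 0 }";
const html = "<!doctype html><title>Product</title>";

// 1. Versioned static asset: the URL embeds a version or content hash, so the
//    cached copy can never be wrong -- a new deploy produces a new URL.
const staticAsset = new Response(cssBytes, {
  headers: {
    "Content-Type": "text/css",
    // "immutable" tells browsers not to revalidate even on reload.
    "Cache-Control": "public, max-age=31536000, immutable",
  },
});

// 2. Dynamic content: serve a possibly stale copy instantly while the CDN
//    refreshes it in the background -- availability is preserved during
//    revalidation, and origin load is smoothed.
const productPage = new Response(html, {
  headers: {
    "Content-Type": "text/html",
    "Cache-Control": "public, max-age=60, stale-while-revalidate=300",
  },
});
```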
“What happens on a cache miss?”
Walk through the full request flow: edge checks local cache, misses, checks regional shield (if present), misses again, goes to origin. Mention the thundering herd problem (1,000 simultaneous requests for the same uncached resource all hit origin) and how request collapsing solves it (one request to origin, response fanned out to all waiters).
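Request collapsing is easy to sketch and worth having at your fingertips. A minimal single-flight map, with fetchFromOrigin standing in for the real origin call:

```ts
// Request collapsing (a.k.a. single-flight): concurrent misses for the same
// key share one origin fetch instead of stampeding the origin.
const inflight = new Map<string, Promise<string>>();

async function collapsedFetch(
  key: string,
  fetchFromOrigin: (key: string) => Promise<string>,
): Promise<string> {
  const pending = inflight.get(key);
  if (pending) return pending; // someone is already fetching: wait on them

  const p = fetchFromOrigin(key).finally(() => inflight.delete(key));
  inflight.set(key, p);
  return p;
}

// 1,000 concurrent misses for "/video/seg42" -> exactly one origin request;
// the single response (or failure) is fanned out to all 1,000 waiters.
```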
“How do you optimize for global users?”
Discuss Anycast DNS for routing to nearest PoP, multi-tier caching to reduce origin load, HTTP/2 or QUIC for connection efficiency, and consider whether the use case needs read-your-writes consistency (which complicates caching) or can tolerate brief staleness.
“How do you secure content at the edge?”
Signed URLs with expiration for access control, token-based authentication at edge workers, WAF rules for injection attacks, and DDoS protection via Anycast distribution that absorbs volumetric attacks across the entire edge network.
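A hedged sketch of HMAC-signed URLs using Web Crypto (available in edge runtimes and Node 18+); the query-parameter names and format are illustrative, since real CDNs (CloudFront signed URLs, Cloudflare token auth) each define their own:

```ts
// Sign path + expiry so neither can be tampered with independently.
async function signUrl(path: string, secret: string, ttlSeconds: number): Promise<string> {
  const expires = Math.floor(Date.now() / 1000) + ttlSeconds;
  const key = await crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["sign"],
  );
  const sig = await crypto.subtle.sign(
    "HMAC",
    key,
    new TextEncoder().encode(`${path}:${expires}`),
  );
  const hex = [...new Uint8Array(sig)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
  return `${path}?expires=${expires}&sig=${hex}`;
}

// The edge worker recomputes the HMAC and rejects the request if the signature
// mismatches or expires is in the past -- no origin round trip needed.
```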
Design a Global Image Delivery Service
Requirements:
Serve 1 billion images per day
Support on-the-fly resizing and format conversion
Sub-100ms latency globally
Cost-effective storage and delivery
Consider:
How would you structure the URL scheme? (Hint: encode dimensions and format in the URL path, e.g., /images/abc123/400x300.webp, so each variant has a unique cacheable URL; a parsing sketch follows this list)
Where would you do image transformations? (Trade-off: at the edge means faster first response but limited compute; at the origin means more powerful processing but higher latency on cache miss. The sweet spot is usually transform-on-first-request-then-cache at the edge, with an origin shield to prevent duplicate work)
How would you handle cache invalidation? (Versioned source URLs mean transformed variants auto-invalidate when the source changes)
What is your strategy for unpopular images? (The long tail problem: 80% of images are rarely accessed. Do not pre-generate all variants. Generate on demand, cache with a shorter TTL, and let the CDN evict cold entries naturally via LRU)
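Here is the URL-scheme parsing sketch referenced above; the regex, size allowlist, and type names are illustrative. A request for an uncached variant triggers transform-on-first-request: fetch the source, resize, respond, and let the CDN cache the result.

```ts
// Parse the variant URL scheme /images/{id}/{width}x{height}.{format}.
interface ImageVariant {
  id: string;
  width: number;
  height: number;
  format: "webp" | "avif" | "jpeg";
}

function parseVariantPath(pathname: string): ImageVariant | null {
  const m = /^\/images\/([a-z0-9]+)\/(\d+)x(\d+)\.(webp|avif|jpeg)$/.exec(pathname);
  if (!m) return null;
  const [, id, w, h, format] = m;
  const width = Number(w);
  const height = Number(h);
  // Restrict width to an allowlist so attackers cannot request unbounded
  // variants and blow out the cache (and the transform bill).
  const allowed = new Set([100, 200, 400, 800, 1200, 2000]);
  if (!allowed.has(width)) return null;
  return { id, width, height, format: format as ImageVariant["format"] };
}

console.log(parseVariantPath("/images/abc123/400x300.webp"));
// { id: "abc123", width: 400, height: 300, format: "webp" }
```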
Scalability Analysis: At 1 billion images per day, you are looking at roughly 11,500 image requests per second on average, with peaks of 30,000-50,000 QPS. If each image averages 200KB, your daily egress is approximately 200TB. At $0.05/GB for CDN bandwidth, that is $10,000/day in delivery costs alone — which is why cache hit ratio is the single most important metric. Moving from 85% to 95% cache hit ratio cuts the miss rate from 15% to 5%, reducing origin bandwidth (and its cost) to a third.
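The arithmetic behind those numbers, spelled out as executable back-of-envelope math (prices illustrative):

```ts
const requestsPerDay = 1e9;
const avgImageBytes = 200 * 1e3;   // 200 KB
const cdnPricePerGB = 0.05;        // illustrative $/GB

const avgQps = requestsPerDay / 86_400;                       // ~11,574 req/s
const dailyEgressGB = (requestsPerDay * avgImageBytes) / 1e9; // 200,000 GB = 200 TB
const dailyCost = dailyEgressGB * cdnPricePerGB;              // $10,000/day

// Hit-ratio sensitivity: origin traffic scales with the MISS rate.
const originAt85 = dailyEgressGB * 0.15; // 30,000 GB/day to origin
const originAt95 = dailyEgressGB * 0.05; // 10,000 GB/day -- a 3x reduction
console.log({ avgQps, dailyEgressGB, dailyCost, originAt85, originAt95 });
```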
You are designing a video streaming platform like Netflix that serves 200M subscribers globally. Walk me through the CDN architecture and estimate the bandwidth costs.
Strong Answer: Let me start with the numbers, because they drive every architectural decision.
Traffic estimation: 200M subscribers, assume 30% are active daily = 60M DAU. Average session: 2 hours of video. At an average bitrate of 5 Mbps (a mix of mobile at 2 Mbps and 4K TVs at 15 Mbps), each user consumes 5 Mbps * 7,200 seconds = 4.5 GB per session. Total daily egress: 60M * 4.5 GB = 270 PB/day. Peak concurrent viewers (assume 10% of DAU): 6M * 5 Mbps = 30 Tbps.
CDN architecture:
Tier 1: Embedded caches inside ISPs (Open Connect Appliances, Netflix’s approach). Place physical servers inside the top 100 ISPs worldwide. Each appliance stores the top 5,000 most popular titles pre-loaded overnight during off-peak hours. This handles 90%+ of traffic without crossing the ISP’s peering boundary. This is the single biggest cost optimization — transit costs $0 because the data never leaves the ISP’s network.
Tier 2: Regional PoPs (20-50 locations). For the long-tail content that ISP caches do not have. These serve cache misses from Tier 1 and also handle content for smaller ISPs without embedded appliances.
Tier 3: Origin (2-3 locations). Stores the master copy of all content across all resolutions. Only serves ~1-2% of total traffic (cache misses from Tier 2).
Cost estimation: If you do NOT have ISP-embedded caches and rely entirely on a commercial CDN at $0.02/GB (volume pricing), 270 PB/day = $5.4M/day = $162M/month. This is why Netflix built Open Connect — their actual cost is estimated at $0.005/GB or less, saving over $120M/month compared to commercial CDN pricing.
Follow-up: A new season of your most popular show drops at midnight. 20M users start streaming simultaneously. Walk me through what happens at the CDN layer.
This is pre-planned. The content was transcoded into all resolutions/bitrates days in advance and pre-pushed to every ISP cache and regional PoP during the prior 48 hours. At midnight, 20M simultaneous requests hit Tier 1 ISP caches — 95%+ cache hit ratio because the content is already there. The remaining 5% (smaller ISPs, less popular resolutions) hit Tier 2, which was also pre-warmed. Origin sees almost zero traffic. The challenge is not bandwidth but connection concurrency — 20M TCP/QUIC connections establishing in a 60-second window. The adaptive bitrate algorithm starts everyone at 480p for the first 10 seconds, then ramps up quality based on measured bandwidth, preventing a thundering herd of 20M 4K requests hitting the caches simultaneously.
Your e-commerce site serves product images globally. You currently have an 85% CDN cache hit ratio and your CDN bill is $50,000/month. The VP of engineering says to cut it in half. What do you do?
Strong Answer: $50,000/month at roughly $0.05/GB means ~1 PB/month of edge bandwidth. With an 85% cache hit ratio, the origin is serving 15% of requests = 150 TB/month to refill caches. The cost breakdown is roughly $42,500 for edge bandwidth (serving users) + $7,500 for origin-to-edge bandwidth (cache fills). The lever that matters most is cache hit ratio.
Step 1: Improve cache hit ratio from 85% to 95% (biggest impact):
Normalize URLs: UTM parameters, session tokens, and tracking IDs in the URL create unique cache keys for the same image. Strip all non-essential query parameters from the cache key (a sketch follows this list). I have seen this alone improve hit ratio by 5-10%.
Increase TTLs: If product images change infrequently (most do not change after upload), set Cache-Control to 1 year with versioned filenames (product-123-v2.webp). Invalidation happens by changing the filename, not purging the cache.
Add an origin shield: Without it, a cache miss at the Tokyo PoP and a cache miss at the London PoP both independently fetch from origin. With an origin shield (a single intermediate cache), the second miss is served from the shield. This reduces origin fetches by 60-80% for globally distributed traffic.
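Here is the cache-key normalization sketch referenced in the first bullet; the parameter allowlist is illustrative — keep only parameters that actually change the response bytes:

```ts
// Strip tracking parameters so the same image maps to one cache entry.
const MEANINGFUL_PARAMS = new Set(["w", "h", "format", "quality"]);

function normalizeCacheKey(rawUrl: string): string {
  const url = new URL(rawUrl);
  const kept = new URLSearchParams();
  // Sort for a canonical order: ?w=400&h=300 and ?h=300&w=400 become one key.
  for (const name of [...url.searchParams.keys()].sort()) {
    if (MEANINGFUL_PARAMS.has(name)) {
      kept.set(name, url.searchParams.get(name)!);
    }
  }
  url.search = kept.toString();
  return url.toString();
}

console.log(
  normalizeCacheKey("https://cdn.example.com/p/123.jpg?w=400&utm_source=email&sessionid=abc"),
);
// https://cdn.example.com/p/123.jpg?w=400
```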
Step 2: Reduce bytes per request (image optimization):
Serve WebP/AVIF instead of JPEG/PNG. WebP is 25-35% smaller than JPEG at equivalent quality. At 1 PB/month, a 30% reduction saves 300 TB = $15,000/month.
Responsive images: Serve 400px wide images to mobile, 1200px to desktop. Most e-commerce sites serve the same 2000px image to everyone. Serving appropriately sized images saves 40-60% bandwidth on mobile traffic (typically 60% of total).
Quality tuning: Most product images are saved at quality 90-95. Reducing to quality 80-85 is visually imperceptible but reduces file size by 20-30%.
Combined impact: a 95% cache hit ratio + 30% smaller images could reduce the bill from $50,000 to $20,000-25,000/month — meeting the VP’s target.
Follow-up: After optimizing, you are at 96% cache hit ratio. The remaining 4% cache misses are from the “long tail” — millions of obscure product images each accessed once a month. How do you handle them?
The long tail is inherently uncacheable at the edge — there are too many unique URLs accessed too infrequently. Accept that these will always be cache misses. The optimization is on the origin side: serve long-tail images from S3 directly (cheap storage, high availability) with a thin Lambda@Edge function that does on-the-fly resizing and format conversion. Do not pre-generate resized variants for the long tail — generate on demand, serve, and let the CDN cache it with a short TTL (1 hour). If it is accessed again within that hour, great. If not, the CDN evicts it and the small storage cost is negligible. The key metric shifts from cache hit ratio (diminishing returns past 96%) to origin response time for cache misses (target under 500ms including image transformation).
You need to deploy edge functions that validate JWTs and make authorization decisions for 500K requests/sec globally. Estimate the cost and identify the architectural risks.
Strong Answer: Cost estimation for Cloudflare Workers (the most common edge compute platform):
Cloudflare Workers pricing: $0.50 per million requests (paid plan). At 500K req/sec, that is ~1.3 trillion requests/month; 1.3 trillion / 1M * $0.50 = $650,000/month. That is expensive. But the bundled plan gives 10M requests/month included with the $5/month plan, and enterprise pricing with volume commitments drops to roughly $0.15-0.30 per million at this scale. Realistic cost: $200,000-400,000/month.
CPU time: JWT validation takes ~0.5ms. At Cloudflare’s duration charge of roughly $0.02 per million CPU-milliseconds, 0.5ms per request works out to about $0.01 per million requests ≈ $13,000/month for CPU. Negligible compared to the per-request charge.
Comparison, doing this at origin instead: 500K req/sec at 5ms per auth check = 2,500 CPU-seconds per second, requiring roughly 50-80 c6g.2xlarge instances at ~$0.27/hr = $10,000-15,000/month. Significantly cheaper in raw compute, but you lose the latency benefit (adding 50-150ms of round trip from edge to origin for every request).
Architectural risks:
Key rotation: JWT validation requires the public key. Edge functions need access to the JWKS (JSON Web Key Set). If you cache the JWKS at the edge with a 5-minute TTL and rotate keys, there is a 5-minute window where requests with the new key fail validation at edges that still have the old JWKS cached. Solution: always validate against both current and previous keys during rotation (see the sketch after this list).
Cold starts: Cloudflare Workers have minimal cold starts (~0ms, V8 isolates), but Lambda@Edge has 5-50ms cold starts. At 500K req/sec the cold start rate is near zero (all instances stay warm), but during a traffic dip followed by a spike, cold starts can cause a latency bump.
No database access: Edge functions cannot query your user database to check permissions. You must encode all authorization data in the JWT claims or use an edge KV store (which adds 10-50ms per lookup). Design your auth model to be self-contained in the token.
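Here is the rotation-safe validation sketch referenced under key rotation. The JWKS shape is simplified (a real JWKS carries multiple keys with `kid`s, matched against the token header), and only the RS256 signature check is shown:

```ts
// Cache the JWKS briefly, verify against the current key, and fall back to
// the previous key so tokens are not rejected during the rotation window.
interface Jwks {
  current: CryptoKey;
  previous?: CryptoKey;
}

let cached: { jwks: Jwks; fetchedAt: number } | null = null;
const JWKS_TTL_MS = 5 * 60 * 1000; // the 5-minute edge cache discussed above

async function isAuthorized(token: string, fetchJwks: () => Promise<Jwks>): Promise<boolean> {
  if (!cached || Date.now() - cached.fetchedAt > JWKS_TTL_MS) {
    cached = { jwks: await fetchJwks(), fetchedAt: Date.now() };
  }
  if (await verifyJwt(token, cached.jwks.current)) return true;
  // During rotation, accept tokens signed by the previous key instead of
  // hard-failing until every edge's JWKS cache expires.
  const prev = cached.jwks.previous;
  return prev ? verifyJwt(token, prev) : false;
}

// Minimal RS256 signature check with Web Crypto. No claim validation shown;
// a real implementation must also check exp, aud, iss, and the alg header.
async function verifyJwt(token: string, key: CryptoKey): Promise<boolean> {
  const [header, payload, sig] = token.split(".");
  if (!header || !payload || !sig) return false;
  const data = new TextEncoder().encode(`${header}.${payload}`);
  const sigBytes = Uint8Array.from(
    atob(sig.replace(/-/g, "+").replace(/_/g, "/")),
    (c) => c.charCodeAt(0),
  );
  return crypto.subtle.verify("RSASSA-PKCS1-v1_5", key, sigBytes, data);
}
```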
Follow-up: You discover that 30% of your requests are from bots that are wasting your edge compute budget. How do you handle this?
Layer bot detection before JWT validation. First, use the CDN’s built-in bot score (Cloudflare Bot Management, AWS WAF Bot Control) — this runs at the network layer before your edge function executes, so you do not pay compute costs for detected bots. Set thresholds on the score: requests classified as definite bots get a 403, borderline scores get a JS challenge (which most bots fail), and likely-human traffic passes through to your edge function. This eliminates 30% of your edge compute costs = $60,000-120,000/month in savings. The remaining sophisticated bots that pass the challenge are handled by rate limiting at the edge (a token bucket per IP or per API key; a sketch follows).
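A minimal token-bucket sketch for that last layer, assuming per-isolate in-memory state (each PoP sees only its own traffic; a shared limit across PoPs would need an edge KV or Durable-Object-style store). Capacity and refill rate are illustrative:

```ts
interface Bucket {
  tokens: number;
  lastRefill: number;
}

const buckets = new Map<string, Bucket>();
const CAPACITY = 20;       // burst allowance
const REFILL_PER_SEC = 5;  // sustained requests/sec per client

function allowRequest(clientKey: string, now = Date.now()): boolean {
  const b = buckets.get(clientKey) ?? { tokens: CAPACITY, lastRefill: now };
  // Refill proportionally to elapsed time, capped at capacity.
  const elapsedSec = (now - b.lastRefill) / 1000;
  b.tokens = Math.min(CAPACITY, b.tokens + elapsedSec * REFILL_PER_SEC);
  b.lastRefill = now;
  if (b.tokens < 1) {
    buckets.set(clientKey, b);
    return false; // caller responds 429 Too Many Requests
  }
  b.tokens -= 1;
  buckets.set(clientKey, b);
  return true;
}
```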