Module Overview
Estimated Time: 3-4 hours | Difficulty: Intermediate | Prerequisites: Networking, Storage
- CloudFront CDN configuration and optimization
- Origin types and behaviors
- Cache invalidation strategies
- Lambda@Edge and CloudFront Functions
- Global Accelerator for non-HTTP workloads
- Edge security with WAF and Shield
CloudFront Overview
Amazon CloudFront is a global CDN with 450+ edge locations worldwide.
┌────────────────────────────────────────────────────────────────────────┐
│ CloudFront Architecture │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ User in Tokyo User in London │
│ │ │ │
│ │ 20ms │ 15ms │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Edge: Tokyo │ │ Edge: London │ │
│ │ (Cache HIT) │ │ (Cache HIT) │ │
│ └──────────────┘ └──────────────┘ │
│ │
│ Cache MISS? Request goes to Regional Edge Cache, then Origin │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Regional Edge Cache │ │
│ │ (Larger cache, fewer locations) │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Asia-Pacific│ │ Europe │ │ Americas │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Cache MISS │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ ORIGIN │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ S3 Bucket │ │ ALB │ │ Custom │ │ │
│ │ │ (static) │ │ (API) │ │ Origin │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ KEY BENEFITS: │
│ • 450+ edge locations globally │
│ • Sub-50ms latency for cached content │
│ • DDoS protection included │
│ • Origin protection (reduces load) │
│ │
└────────────────────────────────────────────────────────────────────────┘
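The lookup order in the diagram (edge, then regional edge cache, then origin) can be modeled in a few lines. This is a toy sketch only: the `CacheTier` class, the tier names, and the `fetch` helper are hypothetical, and TTLs and eviction are ignored.

```python
from typing import Callable, Dict, Tuple


class CacheTier:
    """A toy in-memory cache standing in for one CloudFront cache layer."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}


def fetch(path: str, edge: CacheTier, regional: CacheTier,
          origin: Callable[[str], bytes]) -> Tuple[bytes, str]:
    """Return (body, served_by), filling the caches on the way back."""
    if path in edge.store:
        return edge.store[path], edge.name
    if path in regional.store:
        body = regional.store[path]
        edge.store[path] = body  # populate the edge for the next request
        return body, regional.name
    body = origin(path)
    regional.store[path] = body
    edge.store[path] = body
    return body, "origin"


if __name__ == "__main__":
    tokyo = CacheTier("edge:tokyo")
    osaka = CacheTier("edge:osaka")
    apac = CacheTier("regional:apac")

    def origin(path: str) -> bytes:
        return f"content of {path}".encode()

    print(fetch("/logo.png", tokyo, apac, origin)[1])  # origin
    print(fetch("/logo.png", tokyo, apac, origin)[1])  # edge:tokyo
    print(fetch("/logo.png", osaka, apac, origin)[1])  # regional:apac
```

The third request shows why the regional layer matters: a different edge location misses locally but is still served without touching the origin.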
CloudFront Distribution Setup
Origin Types
┌────────────────────────────────────────────────────────────────────────┐
│ CloudFront Origin Types │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ S3 BUCKET (Static Content) │
│ ───────────────────────────── │
│ • Best for: Images, CSS, JS, videos │
│ • Use Origin Access Control (OAC) - NOT public bucket! │
│ • Optionally restrict to CloudFront only │
│ │
│ ALB/ELB (Dynamic Content) │
│ ───────────────────────────── │
│ • Best for: APIs, dynamic HTML │
│ • Must be public (or use VPC origins) │
│ • Forward headers, cookies, query strings as needed │
│ │
│ CUSTOM ORIGIN (Any HTTP Server) │
│ ───────────────────────────────── │
│ • Best for: On-prem, other clouds │
│ • Supports HTTP/HTTPS │
│ • Set timeouts, keep-alive connections │
│ │
│ MEDIA STORE / MEDIA PACKAGE │
│ ───────────────────────────── │
│ • Best for: Live/VOD video streaming │
│ • HLS, DASH, CMAF support │
│ │
└────────────────────────────────────────────────────────────────────────┘
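Origin Access Control only takes effect once the bucket policy allows the CloudFront service principal to read objects on behalf of one specific distribution. The sketch below builds such a policy document; the bucket name, account ID, and distribution ID in the usage lines are placeholders, not values from this module.

```python
import json


def oac_bucket_policy(bucket_name: str, distribution_arn: str) -> str:
    """Build an S3 bucket policy that lets one CloudFront distribution read objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCloudFrontServicePrincipalReadOnly",
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket_name}/*",
            # Restrict access to a single distribution, not all of CloudFront
            "Condition": {"StringEquals": {"AWS:SourceArn": distribution_arn}},
        }],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder identifiers for illustration only
    print(oac_bucket_policy(
        "my-website-assets",
        "arn:aws:cloudfront::111122223333:distribution/EDFDVBD6EXAMPLE",
    ))
```

The `AWS:SourceArn` condition is what keeps other distributions from reading the bucket through the same service principal.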
CloudFront with Terraform
# S3 bucket for static content
resource "aws_s3_bucket" "website" {
  bucket = "my-website-assets-${random_id.suffix.hex}"
}

# Origin Access Control (modern, replaces OAI)
resource "aws_cloudfront_origin_access_control" "website" {
  name                              = "website-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}
# CloudFront Distribution
resource "aws_cloudfront_distribution" "website" {
  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"
  price_class         = "PriceClass_100" # US, Canada, Europe only

  # Aliases (custom domains)
  aliases = ["www.example.com", "example.com"]

  # S3 Origin (static content)
  origin {
    domain_name              = aws_s3_bucket.website.bucket_regional_domain_name
    origin_id                = "S3-Website"
    origin_access_control_id = aws_cloudfront_origin_access_control.website.id
  }

  # ALB Origin (API)
  origin {
    domain_name = aws_lb.api.dns_name
    origin_id   = "ALB-API"

    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }

    # Custom headers to verify origin requests
    custom_header {
      name  = "X-Origin-Verify"
      value = var.origin_secret
    }
  }

  # Default behavior (static content from S3)
  default_cache_behavior {
    allowed_methods          = ["GET", "HEAD", "OPTIONS"]
    cached_methods           = ["GET", "HEAD"]
    target_origin_id         = "S3-Website"
    viewer_protocol_policy   = "redirect-to-https"
    compress                 = true
    cache_policy_id          = aws_cloudfront_cache_policy.static.id
    origin_request_policy_id = aws_cloudfront_origin_request_policy.cors.id

    # Lambda@Edge for SEO/redirects
    lambda_function_association {
      event_type   = "origin-request"
      lambda_arn   = aws_lambda_function.edge_redirect.qualified_arn
      include_body = false
    }
  }

  # API behavior (forward to ALB)
  ordered_cache_behavior {
    path_pattern           = "/api/*"
    allowed_methods        = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "ALB-API"
    viewer_protocol_policy = "https-only"
    compress               = true

    # Don't cache API responses (or use short TTL)
    cache_policy_id          = aws_cloudfront_cache_policy.api.id
    origin_request_policy_id = aws_cloudfront_origin_request_policy.api.id
  }

  # SSL Certificate
  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate.website.arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }

  # Geo restrictions (optional)
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  # Custom error responses
  custom_error_response {
    error_code            = 404
    response_code         = 200
    response_page_path    = "/index.html"
    error_caching_min_ttl = 300
  }

  # WAF integration
  web_acl_id = aws_wafv2_web_acl.cloudfront.arn

  tags = {
    Name = "website-distribution"
  }
}
# Cache Policy for Static Content
resource "aws_cloudfront_cache_policy" "static" {
  name        = "static-content"
  min_ttl     = 86400    # 1 day minimum
  default_ttl = 604800   # 7 days default
  max_ttl     = 31536000 # 1 year maximum

  parameters_in_cache_key_and_forwarded_to_origin {
    cookies_config {
      cookie_behavior = "none"
    }
    headers_config {
      header_behavior = "none"
    }
    query_strings_config {
      query_string_behavior = "none"
    }
    enable_accept_encoding_brotli = true
    enable_accept_encoding_gzip   = true
  }
}

# Cache Policy for API (short cache or no cache)
resource "aws_cloudfront_cache_policy" "api" {
  name        = "api-cache"
  min_ttl     = 0
  default_ttl = 0    # Don't cache by default
  max_ttl     = 3600 # Honor Cache-Control up to 1 hour

  parameters_in_cache_key_and_forwarded_to_origin {
    cookies_config {
      cookie_behavior = "all"
    }
    headers_config {
      header_behavior = "whitelist"
      headers {
        items = ["Authorization", "Accept-Language"]
      }
    }
    query_strings_config {
      query_string_behavior = "all"
    }
  }
}
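The `X-Origin-Verify` custom header in the distribution only protects the ALB origin if something behind it rejects requests that lack the shared secret. A minimal sketch of that check on the application side follows; the function name is hypothetical, and an ALB listener rule that matches on the header is another common place to enforce it.

```python
import hmac
from typing import Mapping


def is_from_cloudfront(headers: Mapping[str, str], expected_secret: str) -> bool:
    """Return True only if the request carries the shared X-Origin-Verify secret."""
    supplied = ""
    for name, value in headers.items():
        if name.lower() == "x-origin-verify":  # header names are case-insensitive
            supplied = value
            break
    # Constant-time comparison avoids leaking the secret through timing
    return hmac.compare_digest(supplied.encode(), expected_secret.encode())


if __name__ == "__main__":
    secret = "example-secret"  # placeholder; load from a secret store in practice
    print(is_from_cloudfront({"X-Origin-Verify": "example-secret"}, secret))
    print(is_from_cloudfront({"Host": "api.example.com"}, secret))
```

Rotating the secret requires updating the distribution and the origin together, so it helps to accept both the old and new values for a short overlap.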
Cache Optimization
Cache Key Design
┌────────────────────────────────────────────────────────────────────────┐
│ Cache Key Best Practices │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ DEFAULT CACHE KEY: Distribution domain name + URL path │
│ Example: d111111abcdef8.cloudfront.net/images/logo.png │
│ │
│ ADDING TO CACHE KEY (reduces cache hit ratio): │
│ ─────────────────────────────────────────────── │
│ • Query strings: ?version=2 → Different cache entry │
│ • Headers: Accept-Language → Varies by language │
│ • Cookies: session_id → DON'T (one per user = no caching!) │
│ │
│ BEST PRACTICES: │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ STATIC ASSETS (/images/*, /css/*, /js/*) │ │
│ │ • No query strings, headers, or cookies in cache key │ │
│ │ • Use versioned URLs: /js/app.v2.3.1.js │ │
│ │ • Long TTL (1 year) │ │
│ │ │ │
│ │ API ENDPOINTS (/api/*) │ │
│ │ • Forward Authorization header │ │
│ │ • Forward relevant query strings │ │
│ │ • Short or no TTL │ │
│ │ │ │
│ │ PERSONALIZED CONTENT │ │
│ │ • Don't cache (TTL=0) │ │
│ │ • Or use Lambda@Edge to personalize at edge │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
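To see why extra query strings lower the hit ratio, it can help to model the cache key directly. The sketch below is a Python model, not CloudFront configuration: it keeps only an allowlist of query parameters and sorts them, so URLs that differ only in tracking parameters or parameter order collapse to one key. The function name and the default allowlist are assumptions; in CloudFront the same effect comes from a cache policy allowlist or a viewer-request function.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def normalized_cache_key(url: str,
                         allowed_params: frozenset = frozenset({"version"})) -> str:
    """Model a cache key built from host, path, and allowlisted query parameters."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Drop everything not on the allowlist, then sort for a stable order
    kept = sorted((k, v) for k, v in pairs if k in allowed_params)
    query = urlencode(kept)
    key = f"{parts.netloc.lower()}{parts.path}"
    return f"{key}?{query}" if query else key


if __name__ == "__main__":
    a = normalized_cache_key("https://Example.com/img/logo.png?utm_source=x&version=2")
    b = normalized_cache_key("https://example.com/img/logo.png?version=2&fbclid=abc")
    print(a)
    print(a == b)
```

Both URLs map to the same key, so the second request would be a cache hit instead of a new entry.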
Cache Invalidation
import time

import boto3

cloudfront = boto3.client('cloudfront')


def invalidate_paths(distribution_id: str, paths: list) -> str:
    """
    Invalidate specific paths in the CloudFront cache and return the invalidation ID.

    Note: The first 1,000 invalidation paths per month are free,
    then $0.005 per path.
    """
    response = cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            'Paths': {
                'Quantity': len(paths),
                'Items': paths  # e.g., ['/images/*', '/index.html']
            },
            # Must be unique per request; a timestamp is sufficient here
            'CallerReference': str(time.time())
        }
    )
    return response['Invalidation']['Id']


# Invalidate a specific file
invalidate_paths('E1234567890', ['/css/styles.css'])

# Invalidate everything. The wildcard is billed as a single path, but every
# object must then be refetched from the origin, so avoid it in production.
invalidate_paths('E1234567890', ['/*'])

# Better: use versioned URLs
# /js/app.js → /js/app.v2.3.1.js (no invalidation needed)
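Versioned URLs are usually generated by the build step rather than by hand. One common variant embeds a content hash instead of a version number, so the name changes exactly when the file changes. A minimal sketch, where the helper name and the 8-character digest length are arbitrary choices:

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


if __name__ == "__main__":
    # Different content yields a different URL, so no invalidation is needed
    print(hashed_name("/js/app.js", b"console.log('v1');"))
    print(hashed_name("/js/app.js", b"console.log('v2');"))
```

The HTML that references these files still needs a short TTL (or an invalidation), since its own URL does not change between releases.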
Edge Computing
Lambda@Edge vs CloudFront Functions
┌────────────────────────────────────────────────────────────────────────┐
│ Edge Compute Comparison │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ Feature │ CloudFront Functions │ Lambda@Edge │
│ ─────────────────┼──────────────────────┼────────────────────────── │
│ Runtime │ JavaScript only │ Node.js, Python │
│ Execution Time │ < 1 ms │ < 5 sec (viewer) │
│ │ │ < 30 sec (origin) │
│ Memory │ 2 MB │ 128 MB (viewer) │
│ │ │ up to 10 GB (origin) │
│ Network Access │ No │ Yes │
│ File System │ No │ Yes (writable /tmp) │
│ Request Body │ No │ Yes (request events) │
│ Pricing │ $0.10 per million │ $0.60 per million + │
│ │ │ duration │
│ Deploy Location │ All edge locations │ Regional edge caches │
│ Cold Start │ None │ Possible │
│ │
│ USE CASES: │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ CloudFront Functions: │ │
│ │ • URL rewrites/redirects │ │
│ │ • Header manipulation │ │
│ │ • Cache key normalization │ │
│ │ • Simple A/B testing │ │
│ │ • JWT validation (simple) │ │
│ │ │ │
│ │ Lambda@Edge: │ │
│ │ • Complex authentication │ │
│ │ • Dynamic image resizing │ │
│ │ • Server-side rendering │ │
│ │ • Personalization │ │
│ │ • Bot detection/mitigation │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
CloudFront Function Example
// URL Rewrite: Add index.html for directory requests
function handler(event) {
    var request = event.request;
    var uri = request.uri;

    // Check if URI is missing a file extension
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    } else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }

    return request;
}
// A/B Testing: Route 20% of traffic to new version (viewer-request event)
function handler(event) {
    var request = event.request;

    // Check for existing cookie
    var cookies = request.cookies;
    var experimentGroup = cookies['experiment-group']
        ? cookies['experiment-group'].value
        : null;

    // Assign to group if not already assigned
    if (!experimentGroup) {
        experimentGroup = Math.random() < 0.2 ? 'B' : 'A';
    }

    // Route to appropriate origin path
    if (experimentGroup === 'B') {
        request.uri = '/v2' + request.uri;
    }

    // Forward the assignment to the origin. A cookie set on the request is not
    // sent back to the viewer; a viewer-response function must also emit
    // Set-Cookie for the assignment to persist across requests.
    request.cookies['experiment-group'] = { value: experimentGroup };

    return request;
}
// Security Headers (viewer-response event)
function handler(event) {
    var response = event.response;
    var headers = response.headers;

    // Add security headers
    headers['strict-transport-security'] = {
        value: 'max-age=31536000; includeSubDomains; preload'
    };
    headers['x-content-type-options'] = { value: 'nosniff' };
    headers['x-frame-options'] = { value: 'DENY' };
    headers['x-xss-protection'] = { value: '1; mode=block' };
    headers['content-security-policy'] = {
        value: "default-src 'self'; script-src 'self' 'unsafe-inline'"
    };

    return response;
}
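Because CloudFront Functions are hard to debug once deployed, it can be useful to check a rewrite rule offline first. The sketch below is a Python port of the URL-rewrite function above, written only to tabulate its behavior; the `rewrite_uri` name is hypothetical.

```python
def rewrite_uri(uri: str) -> str:
    """Python port of the URL-rewrite rule, for offline checks."""
    if uri.endswith("/"):
        return uri + "index.html"
    if "." not in uri:
        return uri + "/index.html"
    return uri


if __name__ == "__main__":
    for uri in ["/", "/docs/", "/about", "/css/site.css", "/v1.2/guide"]:
        print(f"{uri} -> {rewrite_uri(uri)}")
```

The last case is worth noting: a dot anywhere in the path, not only in the final segment, makes the rule treat the request as a file, so `/v1.2/guide` is passed through unchanged.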
Lambda@Edge Example
# Dynamic Image Resizing at Edge (origin-request trigger)
import base64
import io

import boto3
from PIL import Image

s3 = boto3.client('s3')  # Created once and reused across warm invocations
BUCKET = 'my-images-bucket'


def lambda_handler(event, context):
    request = event['Records'][0]['cf']['request']

    # Parse query parameters (w=<width>, h=<height>)
    width = None
    height = None
    try:
        for param in request.get('querystring', '').split('&'):
            if param.startswith('w='):
                width = int(param[2:])
            elif param.startswith('h='):
                height = int(param[2:])
    except ValueError:
        return request  # Malformed size: pass through unchanged

    # If no resize requested, pass through
    if not width and not height:
        return request

    # Fetch original image from S3
    key = request['uri'].lstrip('/')
    try:
        response = s3.get_object(Bucket=BUCKET, Key=key)
        image_data = response['Body'].read()

        # Resize image, preserving aspect ratio when only one side is given
        img = Image.open(io.BytesIO(image_data))
        if width and height:
            img = img.resize((width, height), Image.LANCZOS)
        elif width:
            ratio = width / img.width
            img = img.resize((width, int(img.height * ratio)), Image.LANCZOS)
        elif height:
            ratio = height / img.height
            img = img.resize((int(img.width * ratio), height), Image.LANCZOS)

        # Convert to bytes
        buffer = io.BytesIO()
        img.save(buffer, format='WEBP', quality=85)
        buffer.seek(0)

        # Return resized image. Generated responses are size-limited
        # (about 1 MB for origin events), so large outputs need another path.
        return {
            'status': '200',
            'statusDescription': 'OK',
            'headers': {
                'content-type': [{'value': 'image/webp'}],
                'cache-control': [{'value': 'public, max-age=31536000'}]
            },
            'body': base64.b64encode(buffer.read()).decode('utf-8'),
            'bodyEncoding': 'base64'
        }
    except Exception as e:
        print(f"Error: {e}")
        return request  # Fall back to origin
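The resize handler accepts whatever `w` and `h` values arrive in the query string, which lets a caller request enormous images or send non-numeric values. A stricter parser is sketched below using only the standard library; the `parse_resize_params` name and the `MAX_DIMENSION` bound are assumptions chosen for illustration.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs

MAX_DIMENSION = 4096  # hypothetical upper bound; tune to your image sizes


def parse_resize_params(querystring: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (width, height); each is None when missing, malformed, or out of range."""
    params = parse_qs(querystring)

    def dimension(name: str) -> Optional[int]:
        values = params.get(name)
        if not values:
            return None
        try:
            value = int(values[0])
        except ValueError:
            return None
        return value if 1 <= value <= MAX_DIMENSION else None

    return dimension("w"), dimension("h")


if __name__ == "__main__":
    print(parse_resize_params("w=200&h=100"))
    print(parse_resize_params("w=abc&h=99999"))
```

Bounding the dimensions also limits how many distinct cache entries a single image can produce, provided the query string is part of the cache key for the image path.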
Global Accelerator
For non-HTTP workloads or when you need static IPs.
┌────────────────────────────────────────────────────────────────────────┐
│ Global Accelerator Architecture │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ CloudFront vs Global Accelerator: │
│ ───────────────────────────────── │
│ │
│ CloudFront: │
│ • HTTP/HTTPS only │
│ • Content caching │
│ • Edge processing (Lambda@Edge) │
│ • Dynamic content acceleration │
│ │
│ Global Accelerator: │
│ • TCP/UDP traffic │
│ • No caching (proxy only) │
│ • Static anycast IPs │
│ • Health-based failover │
│ • Gaming, IoT, VoIP │
│ │
│ Architecture: │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Users │ │
│ │ │ │ │
│ │ │ Static Anycast IPs: 1.2.3.4, 5.6.7.8 │ │
│ │ ▼ │ │
│ │ ┌────────────────────────────────────────────────────────┐ │ │
│ │ │ AWS Global Network │ │ │
│ │ │ (Edge to origin over AWS backbone) │ │ │
│ │ └────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌───────────────┼───────────────┐ │ │
│ │ ▼ ▼ ▼ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │Endpoint │ │Endpoint │ │Endpoint │ │ │
│ │ │Group 1 │ │Group 2 │ │Group 3 │ │ │
│ │ │us-east-1│ │eu-west-1│ │ap-north │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │EC2, ALB │ │EC2, ALB │ │EC2, ALB │ │ │
│ │ │NLB, EIP │ │NLB, EIP │ │NLB, EIP │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ USE CASES: │
│ • Gaming: Low-latency UDP │
│ • VoIP: Real-time audio/video │
│ • IoT: MQTT over TCP │
│ • Static IP requirements │
│ • Multi-region failover │
│ │
└────────────────────────────────────────────────────────────────────────┘
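The health-based failover in the diagram amounts to "send the user to the best endpoint group that is still healthy". A deliberately simplified model is sketched below; the `EndpointGroup` fields and latency numbers are invented for illustration, and the real service also takes traffic dials and endpoint weights into account.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EndpointGroup:
    region: str
    latency_ms: float  # hypothetical latency from the user's nearest edge
    healthy: bool


def pick_endpoint_group(groups: List[EndpointGroup]) -> Optional[EndpointGroup]:
    """Choose the lowest-latency healthy group, or None if all are down."""
    healthy = [g for g in groups if g.healthy]
    return min(healthy, key=lambda g: g.latency_ms) if healthy else None


if __name__ == "__main__":
    groups = [
        EndpointGroup("us-east-1", 180.0, True),
        EndpointGroup("eu-west-1", 25.0, True),
    ]
    print(pick_endpoint_group(groups).region)
    groups[1].healthy = False  # simulate a failed health check
    print(pick_endpoint_group(groups).region)
```

Because clients keep using the same anycast IPs, the switch happens inside the AWS network and does not depend on DNS caches expiring.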
Edge Security
WAF Integration
# WAFv2 Web ACL for CloudFront
resource "aws_wafv2_web_acl" "cloudfront" {
  name     = "cloudfront-waf"
  scope    = "CLOUDFRONT"  # Must be CLOUDFRONT for CloudFront
  provider = aws.us_east_1 # WAF for CloudFront must be in us-east-1

  default_action {
    allow {}
  }

  # AWS Managed Rules - Common threats
  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 1

    # Nested blocks must be written on multiple lines in HCL
    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSet"
    }
  }

  # Rate limiting
  rule {
    name     = "RateLimitRule"
    priority = 2

    action {
      block {}
    }

    statement {
      rate_based_statement {
        limit              = 2000 # requests per 5 minutes per IP
        aggregate_key_type = "IP"
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitRule"
    }
  }

  # Geo blocking (optional)
  rule {
    name     = "GeoBlockRule"
    priority = 3

    action {
      block {}
    }

    statement {
      geo_match_statement {
        country_codes = ["RU", "CN", "KP"]
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "GeoBlockRule"
    }
  }

  visibility_config {
    sampled_requests_enabled   = true
    cloudwatch_metrics_enabled = true
    metric_name                = "CloudFrontWAF"
  }
}
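The rate-based rule above blocks an IP once it exceeds 2,000 requests in a 5-minute window. The idea can be modeled with a per-IP sliding window, as in the sketch below. This is a conceptual model with a hypothetical `RateLimiter` class; WAF's own counting is approximate and is not claimed to work this way internally.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class RateLimiter:
    """Per-IP sliding-window limiter mirroring the rule's limit and window."""

    def __init__(self, limit: int = 2000, window_seconds: int = 300) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Discard requests that have aged out of the window
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = RateLimiter(limit=3, window_seconds=10)
    print([limiter.allow("203.0.113.7", t) for t in (0, 1, 2, 3)])
    print(limiter.allow("203.0.113.7", 11))
```

Keying on the client IP is the simplest choice; clients behind a shared NAT will share one budget, which is one reason to keep the limit generous.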
🎯 Interview Questions
Q1: How do you optimize CloudFront cache hit ratio?
Strategies:
- Normalize cache key:
  - Remove unnecessary query strings
  - Normalize header values
  - Don’t include cookies for static content
- Use versioned URLs:
  - /js/app.v2.3.1.js instead of /js/app.js?v=2.3.1
  - Allows long TTL without invalidation
- Configure appropriate TTLs:
  - Static assets: 1 year
  - HTML: 1 hour to 1 day
  - API: 0 or short TTL
- Use Origin Shield:
  - Additional caching layer
  - Reduces origin requests
- Monitor metrics:
  - Cache hit ratio in CloudWatch
  - Analyze popular request reports
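The TTL guidance above is typically applied by having the origin emit a `Cache-Control` header per content type. A small sketch of that mapping follows; the path conventions (`/api/` prefix, `.html` suffix) and the exact header values are assumptions picked to match the list, not a prescribed policy.

```python
def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value following the TTL guidance above."""
    if path.startswith("/api/"):
        return "no-store"  # API: do not cache
    if path.endswith(".html") or path.endswith("/"):
        return "public, max-age=3600"  # HTML: 1 hour
    # Versioned static assets: 1 year, safe because the URL changes per release
    return "public, max-age=31536000, immutable"


if __name__ == "__main__":
    for path in ["/api/users", "/index.html", "/js/app.v2.3.1.js"]:
        print(path, "=>", cache_control_for(path))
```

The cache policy's min and max TTL still clamp whatever the origin sends, so the two need to be set consistently.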
Q2: When would you use Global Accelerator over CloudFront?
Use Global Accelerator when:
- TCP/UDP traffic (not HTTP)
- Gaming, IoT, VoIP applications
- Need static anycast IPs
- Need fast failover between regions
- No caching needed
Use CloudFront when:
- HTTP/HTTPS traffic
- Need content caching
- Edge compute (Lambda@Edge)
- Static website hosting
Use both together:
- CloudFront for web traffic
- Global Accelerator for WebSocket/gaming
Q3: How do you secure content on CloudFront?
Security layers:
- HTTPS only: Redirect HTTP to HTTPS
- Origin Access Control: S3 only accessible via CloudFront
- Signed URLs/Cookies: Restrict access to authenticated users
- WAF: Block common attacks, rate limiting
- Shield: DDoS protection (Standard free, Advanced paid)
- Field-Level Encryption: Encrypt sensitive form data
- Geo-restrictions: Block/allow by country
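Signed URLs work by attaching a signed policy statement that names the resource and an expiry time. The sketch below builds only the canned policy document with the standard library; the signing step (an RSA signature made with the private key of a trusted key pair) is omitted, and botocore's `CloudFrontSigner` helper is one way to do it. The function name and example URL are placeholders.

```python
import json
from datetime import datetime, timezone


def canned_policy(url: str, expires_at: datetime) -> str:
    """Build the canned policy that a CloudFront signed URL is signed over."""
    policy = {
        "Statement": [{
            "Resource": url,
            "Condition": {
                "DateLessThan": {"AWS:EpochTime": int(expires_at.timestamp())}
            },
        }]
    }
    # The policy must be compact JSON with no extra whitespace
    return json.dumps(policy, separators=(",", ":"))


if __name__ == "__main__":
    expiry = datetime(2030, 1, 1, tzinfo=timezone.utc)
    print(canned_policy("https://www.example.com/private/report.pdf", expiry))
```

A custom policy can add an IP range or a start time; the canned form shown here supports only a single resource and an expiry.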
Q4: CloudFront Functions vs Lambda@Edge - when to use each?
CloudFront Functions:
- Simple, sub-millisecond operations
- URL rewrites, header manipulation
- Cost-effective at high scale
- No network access needed
Lambda@Edge:
- Complex logic, >1ms execution
- Need to access external services
- Image processing, personalization
- Request body access needed
Cost comparison (request charges for 1 billion requests, at the per-million prices in the comparison table):
- CloudFront Functions: ~$100
- Lambda@Edge: ~$600+
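The ~$100 and ~$600 figures follow from the per-million request prices in the comparison table if you assume about 1 billion requests. The sketch below reproduces that arithmetic. The request prices come from the table; the duration price is left as a parameter because it is not stated in this module, and the value in the usage lines is a made-up placeholder.

```python
CF_FUNCTIONS_PER_MILLION = 0.10  # USD, from the comparison table
LAMBDA_EDGE_PER_MILLION = 0.60   # USD, from the comparison table


def request_charges(requests: int, price_per_million: float) -> float:
    """Request-count charge in USD."""
    return requests / 1_000_000 * price_per_million


def duration_charge(requests: int, avg_ms: float, memory_mb: int,
                    price_per_gb_second: float) -> float:
    """Lambda@Edge duration charge; look up the GB-second price on the pricing page."""
    gb_seconds = requests * (avg_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second


if __name__ == "__main__":
    monthly = 1_000_000_000  # assumed volume that reproduces the figures above
    print(f"CloudFront Functions: ${request_charges(monthly, CF_FUNCTIONS_PER_MILLION):,.2f}")
    print(f"Lambda@Edge requests: ${request_charges(monthly, LAMBDA_EDGE_PER_MILLION):,.2f}")
    # Placeholder GB-second price, for illustration only
    print(f"Lambda@Edge duration: ${duration_charge(monthly, 5, 128, 0.00005):,.2f}")
```

The "+" on the Lambda@Edge figure is the duration term, which grows with both execution time and configured memory.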
Q5: How do you handle cache invalidation at scale?
Best practices:
- Avoid invalidation: Use versioned URLs
  - /js/app.abc123.js (content hash in filename)
- Invalidate smartly:
  - Specific paths, not wildcards
  - Batch invalidations
  - First 1000/month free
- Short TTL for dynamic content:
  - Let cache expire naturally
  - Use Cache-Control headers
- Origin Shield:
  - Extra cache layer in front of the origin
  - Absorbs the burst of refetches that follows an invalidation, reducing origin load
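Batching is easier when the deploy pipeline collects changed paths first and submits them in chunks. A minimal sketch that de-duplicates and chunks paths before they are passed to an invalidation call; the helper name and the default batch size are assumptions, so check the current CloudFront quotas before choosing one.

```python
from typing import Iterable, Iterator, List


def invalidation_batches(paths: Iterable[str],
                         batch_size: int = 1000) -> Iterator[List[str]]:
    """Yield de-duplicated, sorted paths in chunks of at most batch_size."""
    # Invalidation paths must start with a slash
    unique = sorted({p if p.startswith("/") else "/" + p for p in paths})
    for start in range(0, len(unique), batch_size):
        yield unique[start:start + batch_size]


if __name__ == "__main__":
    changed = ["/css/site.css", "css/site.css", "/index.html", "/js/app.js"]
    for batch in invalidation_batches(changed, batch_size=2):
        print(batch)
```

Each yielded list can be passed as the `paths` argument of an invalidation helper, with a fresh caller reference per batch.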
🧪 Hands-On Lab: Deploy Static Site with CloudFront
1. Create S3 Bucket: Private bucket with static website files
2. Create CloudFront Distribution: S3 origin with Origin Access Control
3. Configure Custom Domain: ACM certificate in us-east-1, Route 53 alias
4. Add CloudFront Function: URL rewrite for SPA routing
5. Enable WAF: AWS managed rules + rate limiting
6. Monitor Performance: CloudWatch metrics, cache hit ratio