Module Overview
Estimated Time: 3-4 hours | Difficulty: Intermediate | Prerequisites: Networking, Storage
- CloudFront CDN configuration and optimization
- Origin types and behaviors
- Cache invalidation strategies
- Lambda@Edge and CloudFront Functions
- Global Accelerator for non-HTTP workloads
- Edge security with WAF and Shield
CloudFront Overview
Amazon CloudFront is a global CDN with 450+ edge locations worldwide.

CloudFront Architecture

  User in Tokyo  --(~20 ms)-->  Edge: Tokyo   (Cache HIT)
  User in London --(~15 ms)-->  Edge: London  (Cache HIT)

  Cache MISS? Request goes to Regional Edge Cache, then Origin:

  Edge location
      |  Cache MISS
      v
  Regional Edge Cache (larger cache, fewer locations)
      Asia-Pacific | Europe | Americas
      |  Cache MISS
      v
  ORIGIN
      S3 Bucket (static) | ALB (API) | Custom Origin

KEY BENEFITS:
• 450+ edge locations globally
• Sub-50ms latency for cached content
• DDoS protection included
• Origin protection (reduces load)
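The lookup order in the diagram above can be sketched as a toy model. This is only an illustration of the edge, regional cache, origin sequence; the class name `TieredCache` and the `fake_origin` function are hypothetical, and real CloudFront caches also apply TTLs and eviction that are not modeled here.

```python
from typing import Callable, Dict, List, Tuple


class TieredCache:
    """Toy model of edge -> regional edge cache -> origin lookups."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self.edge: Dict[str, bytes] = {}
        self.regional: Dict[str, bytes] = {}
        self.fetch_from_origin = fetch_from_origin

    def get(self, path: str) -> Tuple[bytes, str]:
        """Return (body, name of the tier that served the request)."""
        if path in self.edge:
            return self.edge[path], "edge"
        if path in self.regional:
            body = self.regional[path]
            self.edge[path] = body  # populate the edge on the way back
            return body, "regional"
        body = self.fetch_from_origin(path)
        self.regional[path] = body
        self.edge[path] = body
        return body, "origin"


origin_calls: List[str] = []


def fake_origin(path: str) -> bytes:
    """Hypothetical origin that records every request it receives."""
    origin_calls.append(path)
    return f"content of {path}".encode()


cache = TieredCache(fake_origin)
first = cache.get("/images/logo.png")   # miss everywhere: served by the origin
second = cache.get("/images/logo.png")  # now cached: served by the edge
```

The second request never reaches the origin, which is the "origin protection" benefit listed above.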
CloudFront Distribution Setup
Origin Types
CloudFront Origin Types

S3 BUCKET (Static Content)
• Best for: Images, CSS, JS, videos
• Use Origin Access Control (OAC) - NOT a public bucket!
• Optionally restrict to CloudFront only

ALB/ELB (Dynamic Content)
• Best for: APIs, dynamic HTML
• Must be public (or use VPC origins)
• Forward headers, cookies, query strings as needed

CUSTOM ORIGIN (Any HTTP Server)
• Best for: On-prem, other clouds
• Supports HTTP/HTTPS
• Set timeouts, keep-alive connections

MEDIA STORE / MEDIA PACKAGE
• Best for: Live/VOD video streaming
• HLS, DASH, CMAF support
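With OAC, the bucket stays private and a bucket policy grants read access to the CloudFront service principal for one specific distribution. The sketch below builds such a policy document; the function name `oac_bucket_policy` and the bucket name, account ID, and distribution ID in the usage line are placeholders, not values from this module.

```python
import json


def oac_bucket_policy(bucket: str, account_id: str, distribution_id: str) -> str:
    """Build an S3 bucket policy letting one CloudFront distribution read objects."""
    distribution_arn = f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Only requests signed on behalf of this distribution are allowed
                "Condition": {"StringEquals": {"AWS:SourceArn": distribution_arn}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder identifiers for illustration only
policy_json = oac_bucket_policy("my-website-assets", "111122223333", "EDFDVBD6EXAMPLE")
```

The `AWS:SourceArn` condition is what keeps other distributions, including ones in other accounts, from reading the bucket.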
CloudFront with Terraform
# S3 bucket for static content
resource "aws_s3_bucket" "website" {
  bucket = "my-website-assets-${random_id.suffix.hex}"
}

# Origin Access Control (modern, replaces OAI)
resource "aws_cloudfront_origin_access_control" "website" {
  name                              = "website-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

# CloudFront Distribution
resource "aws_cloudfront_distribution" "website" {
  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"
  price_class         = "PriceClass_100" # Cheapest class: North America, Europe, Israel

  # Aliases (custom domains)
  aliases = ["www.example.com", "example.com"]

  # S3 Origin (static content)
  origin {
    domain_name              = aws_s3_bucket.website.bucket_regional_domain_name
    origin_id                = "S3-Website"
    origin_access_control_id = aws_cloudfront_origin_access_control.website.id
  }

  # ALB Origin (API)
  origin {
    domain_name = aws_lb.api.dns_name
    origin_id   = "ALB-API"

    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }

    # Custom header the ALB can check to reject requests that bypass CloudFront
    custom_header {
      name  = "X-Origin-Verify"
      value = var.origin_secret
    }
  }

  # Default behavior (static content from S3)
  default_cache_behavior {
    allowed_methods          = ["GET", "HEAD", "OPTIONS"]
    cached_methods           = ["GET", "HEAD"]
    target_origin_id         = "S3-Website"
    viewer_protocol_policy   = "redirect-to-https"
    compress                 = true
    cache_policy_id          = aws_cloudfront_cache_policy.static.id
    origin_request_policy_id = aws_cloudfront_origin_request_policy.cors.id

    # Lambda@Edge for SEO/redirects (requires a published version, hence qualified_arn)
    lambda_function_association {
      event_type   = "origin-request"
      lambda_arn   = aws_lambda_function.edge_redirect.qualified_arn
      include_body = false
    }
  }

  # API behavior (forward to ALB)
  ordered_cache_behavior {
    path_pattern           = "/api/*"
    allowed_methods        = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "ALB-API"
    viewer_protocol_policy = "https-only"
    compress               = true

    # Don't cache API responses (or use short TTL)
    cache_policy_id          = aws_cloudfront_cache_policy.api.id
    origin_request_policy_id = aws_cloudfront_origin_request_policy.api.id
  }

  # SSL Certificate (the ACM certificate for CloudFront must be in us-east-1)
  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate.website.arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }

  # Geo restrictions (optional)
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  # Custom error responses (serve the SPA entry point for unknown paths)
  custom_error_response {
    error_code            = 404
    response_code         = 200
    response_page_path    = "/index.html"
    error_caching_min_ttl = 300
  }

  # WAF integration (WAFv2 expects the web ACL ARN here)
  web_acl_id = aws_wafv2_web_acl.cloudfront.arn

  tags = {
    Name = "website-distribution"
  }
}

# Cache Policy for Static Content
resource "aws_cloudfront_cache_policy" "static" {
  name        = "static-content"
  min_ttl     = 86400    # 1 day minimum
  default_ttl = 604800   # 7 days default
  max_ttl     = 31536000 # 1 year maximum

  parameters_in_cache_key_and_forwarded_to_origin {
    cookies_config {
      cookie_behavior = "none"
    }
    headers_config {
      header_behavior = "none"
    }
    query_strings_config {
      query_string_behavior = "none"
    }
    enable_accept_encoding_brotli = true
    enable_accept_encoding_gzip   = true
  }
}

# Cache Policy for API (short cache or no cache)
resource "aws_cloudfront_cache_policy" "api" {
  name        = "api-cache"
  min_ttl     = 0
  default_ttl = 0    # Don't cache by default
  max_ttl     = 3600 # Honor Cache-Control up to 1 hour

  parameters_in_cache_key_and_forwarded_to_origin {
    cookies_config {
      cookie_behavior = "all"
    }
    headers_config {
      header_behavior = "whitelist"
      headers {
        items = ["Authorization", "Accept-Language"]
      }
    }
    query_strings_config {
      query_string_behavior = "all"
    }
  }
}
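How the three TTL values in a cache policy interact with the origin's `Cache-Control: max-age` can be approximated as a clamp. The function below is a simplified model, assuming only `max-age` matters; it ignores `s-maxage`, `Expires`, `no-store`, and similar directives. The name `effective_ttl` is made up for this sketch.

```python
from typing import Optional


def effective_ttl(origin_max_age: Optional[int], min_ttl: int,
                  default_ttl: int, max_ttl: int) -> int:
    """Approximate how long an object stays cached, in seconds."""
    if origin_max_age is None:
        # Origin sent no caching header: fall back to the default TTL
        return max(min_ttl, default_ttl)
    # Otherwise the origin's value is clamped into [min_ttl, max_ttl]
    return max(min_ttl, min(origin_max_age, max_ttl))


# "static" policy from above: a 60-second max-age is raised to the 1-day minimum
static_short = effective_ttl(60, 86400, 604800, 31536000)
# "api" policy from above: nothing is cached unless the origin opts in
api_default = effective_ttl(None, 0, 0, 3600)
api_opt_in = effective_ttl(7200, 0, 0, 3600)
```

Under this model the API policy caches only responses whose origin explicitly sets `max-age`, and never for longer than one hour.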
Cache Optimization
Cache Key Design
Cache Key Best Practices

DEFAULT CACHE KEY: Distribution domain name + URL path
Example: example.com/images/logo.png

ADDING TO CACHE KEY (reduces cache hit ratio):
• Query strings: ?version=2 → Different cache entry
• Headers: Accept-Language → Varies by language
• Cookies: session_id → DON'T (one per user = no caching!)

BEST PRACTICES:

STATIC ASSETS (/images/*, /css/*, /js/*)
• No query strings, headers, or cookies in cache key
• Use versioned URLs: /js/app.v2.3.1.js
• Long TTL (1 year)

API ENDPOINTS (/api/*)
• Forward Authorization header
• Forward relevant query strings
• Short or no TTL

PERSONALIZED CONTENT
• Don't cache (TTL=0)
• Or use Lambda@Edge to personalize at edge
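To see why extra cache key dimensions fragment the cache, it helps to compute keys by hand. The sketch below keeps only an allow-list of query parameters and sorts them, so equivalent URLs collapse to one key. The function `cache_key` and the parameter names are illustrative assumptions, not CloudFront's internal algorithm.

```python
from typing import FrozenSet
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url: str, allowed_query_params: FrozenSet[str] = frozenset()) -> str:
    """Build a normalized key from host, path, and allow-listed query params."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in allowed_query_params
    )
    query = urlencode(kept)
    return parts.netloc + parts.path + ("?" + query if query else "")


allowed = frozenset({"w", "h"})
key_a = cache_key("https://example.com/img/cat.png?w=100&h=50&utm_source=mail", allowed)
key_b = cache_key("https://example.com/img/cat.png?h=50&w=100", allowed)
```

Both URLs map to the same key, so the second request can be a cache hit even though the raw URLs differ in parameter order and tracking noise.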
Cache Invalidation
import time

import boto3

cloudfront = boto3.client('cloudfront')


def invalidate_paths(distribution_id: str, paths: list):
    """
    Invalidate specific paths in CloudFront cache.

    Note: First 1,000 invalidation paths/month are free.
    Then $0.005 per path.
    """
    response = cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            'Paths': {
                'Quantity': len(paths),
                'Items': paths  # e.g., ['/images/*', '/index.html']
            },
            # Must be unique per request; a timestamp is enough here
            'CallerReference': str(time.time())
        }
    )
    return response['Invalidation']['Id']


# Invalidate specific file
invalidate_paths('E1234567890', ['/css/styles.css'])

# Invalidate everything. A wildcard is billed as a single path, but it evicts
# the whole cache, so the origin absorbs the refill traffic. Avoid in production.
invalidate_paths('E1234567890', ['/*'])

# Better: Use versioned URLs
# /js/app.js → /js/app.v2.3.1.js (no invalidation needed)
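One common way to get versioned URLs without tracking version numbers is to embed a content hash in the filename at build time. A minimal sketch, assuming SHA-256 and an 8-character digest prefix (both arbitrary choices; `hashed_name` is a hypothetical helper):

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = hashed_name("/js/app.js", b"console.log('v1');")
new_url = hashed_name("/js/app.js", b"console.log('v2');")
```

Because the URL changes whenever the content changes, old and new versions never share a cache entry, and the asset can carry a one-year TTL with no invalidation.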
Edge Computing
Lambda@Edge vs CloudFront Functions
Edge Compute Comparison

Feature         | CloudFront Functions | Lambda@Edge
----------------|----------------------|--------------------------------------
Runtime         | JavaScript only      | Node.js, Python
Execution Time  | < 1 ms               | < 5 sec (viewer), < 30 sec (origin)
Memory          | 2 MB                 | 128 MB (viewer), up to 10 GB (origin)
Network Access  | No                   | Yes
File System     | No                   | Yes (/tmp scratch space)
Request Body    | No                   | Yes (request events, when enabled)
Pricing         | $0.10 per million    | $0.60 per million + duration
Deploy Location | All edge locations   | Regional edge caches
Cold Start      | None                 | Possible

USE CASES:

CloudFront Functions:
• URL rewrites/redirects
• Header manipulation
• Cache key normalization
• Simple A/B testing
• JWT validation (simple)

Lambda@Edge:
• Complex authentication
• Dynamic image resizing
• Server-side rendering
• Personalization
• Bot detection/mitigation
CloudFront Function Example
// URL Rewrite: Add index.html for directory requests (viewer-request event)
function handler(event) {
  var request = event.request;
  var uri = request.uri;

  if (uri.endsWith('/')) {
    // Directory-style URI: serve its index document
    request.uri += 'index.html';
  } else if (!uri.includes('.')) {
    // No file extension: treat the URI as a directory
    request.uri += '/index.html';
  }
  return request;
}

// A/B Testing: Route 20% of traffic to new version (viewer-request event)
function handler(event) {
  var request = event.request;

  // Check for existing cookie
  var cookies = request.cookies;
  var experimentGroup = cookies['experiment-group']
    ? cookies['experiment-group'].value
    : null;

  // Assign to group if not already assigned
  if (!experimentGroup) {
    experimentGroup = Math.random() < 0.2 ? 'B' : 'A';
  }

  // Route to appropriate origin path
  if (experimentGroup === 'B') {
    request.uri = '/v2' + request.uri;
  }

  // Pass the group along with the request. This alone does not store the
  // cookie in the browser: to keep a viewer in the same group, also send a
  // Set-Cookie header from a viewer-response function.
  request.cookies['experiment-group'] = { value: experimentGroup };
  return request;
}

// Security Headers (viewer-response event)
function handler(event) {
  var response = event.response;
  var headers = response.headers;

  // Add security headers
  headers['strict-transport-security'] = {
    value: 'max-age=31536000; includeSubDomains; preload'
  };
  headers['x-content-type-options'] = { value: 'nosniff' };
  headers['x-frame-options'] = { value: 'DENY' };
  // Legacy header; modern browsers rely on Content-Security-Policy instead
  headers['x-xss-protection'] = { value: '1; mode=block' };
  headers['content-security-policy'] = {
    value: "default-src 'self'; script-src 'self' 'unsafe-inline'"
  };
  return response;
}
Lambda@Edge Example
# Dynamic Image Resizing at Edge (origin-request trigger)
# Pillow is not part of the Lambda runtime and must be bundled with the function.
# The w/h query strings must be part of the cache key, otherwise every size
# shares one cache entry.
import base64
import io

import boto3
from PIL import Image

s3 = boto3.client('s3')  # Created once and reused across invocations
BUCKET = 'my-images-bucket'


def lambda_handler(event, context):
    request = event['Records'][0]['cf']['request']

    # Parse query parameters
    params = request.get('querystring', '')
    width = None
    height = None
    try:
        for param in params.split('&'):
            if param.startswith('w='):
                width = int(param[2:])
            elif param.startswith('h='):
                height = int(param[2:])
    except ValueError:
        return request  # Malformed size: pass through unchanged

    # If no resize requested, pass through
    if not width and not height:
        return request

    # Fetch original image from S3
    key = request['uri'].lstrip('/')
    try:
        response = s3.get_object(Bucket=BUCKET, Key=key)
        image_data = response['Body'].read()

        # Resize image, preserving aspect ratio when only one side is given
        img = Image.open(io.BytesIO(image_data))
        if width and height:
            img = img.resize((width, height), Image.LANCZOS)
        elif width:
            ratio = width / img.width
            img = img.resize((width, int(img.height * ratio)), Image.LANCZOS)
        elif height:
            ratio = height / img.height
            img = img.resize((int(img.width * ratio), height), Image.LANCZOS)

        # Convert to bytes
        buffer = io.BytesIO()
        img.save(buffer, format='WEBP', quality=85)
        buffer.seek(0)

        # Return resized image. Generated responses are size-limited
        # (about 1 MB for origin-request triggers, including base64 overhead).
        return {
            'status': '200',
            'statusDescription': 'OK',
            'headers': {
                'content-type': [{'value': 'image/webp'}],
                'cache-control': [{'value': 'public, max-age=31536000'}]
            },
            'body': base64.b64encode(buffer.read()).decode('utf-8'),
            'bodyEncoding': 'base64'
        }
    except Exception as e:
        print(f"Error: {e}")
        return request  # Fall back to origin
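The query-string handling in the handler above is the part most exposed to untrusted input, so it is worth isolating and bounding. A possible sketch using the standard library; the helper `parse_resize_params` and the `MAX_DIMENSION` limit of 2000 pixels are assumptions for illustration, not values from this module:

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs

MAX_DIMENSION = 2000  # hypothetical upper bound to cap resize cost


def parse_resize_params(querystring: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (width, height); a value is None when missing or invalid."""
    params = parse_qs(querystring)

    def read(name: str) -> Optional[int]:
        values = params.get(name)
        if not values:
            return None
        try:
            value = int(values[0])
        except ValueError:
            return None
        # Reject zero, negative, and oversized dimensions
        return value if 1 <= value <= MAX_DIMENSION else None

    return read("w"), read("h")


size = parse_resize_params("w=300&h=200")
```

Bounding the dimensions also limits how many distinct cache entries a client can create for a single image.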
Global Accelerator
For non-HTTP workloads or when you need static IPs.

Global Accelerator Architecture

CloudFront vs Global Accelerator:

CloudFront:
• HTTP/HTTPS only
• Content caching
• Edge processing (Lambda@Edge)
• Dynamic content acceleration

Global Accelerator:
• TCP/UDP traffic
• No caching (proxy only)
• Static anycast IPs
• Health-based failover
• Gaming, IoT, VoIP

Architecture:

  Users
    |  Static Anycast IPs: 1.2.3.4, 5.6.7.8
    v
  AWS Global Network (edge to origin over AWS backbone)
    |
    +--> Endpoint Group 1 (us-east-1): EC2, ALB, NLB, EIP
    +--> Endpoint Group 2 (eu-west-1): EC2, ALB, NLB, EIP
    +--> Endpoint Group 3 (ap-north):  EC2, ALB, NLB, EIP

USE CASES:
• Gaming: Low-latency UDP
• VoIP: Real-time audio/video
• IoT: MQTT over TCP
• Static IP requirements
• Multi-region failover
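Health-based failover can be pictured as "send traffic to the closest endpoint group that is still healthy". The toy model below captures only that idea; the real service also applies traffic dials and endpoint weights. The `EndpointGroup` type, the region names, and the latency figures are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EndpointGroup:
    region: str
    latency_ms: float  # measured from the user's entry point (made-up numbers)
    healthy: bool


def pick_endpoint_group(groups: List[EndpointGroup]) -> Optional[EndpointGroup]:
    """Choose the lowest-latency healthy group, or None if all are down."""
    healthy = [g for g in groups if g.healthy]
    return min(healthy, key=lambda g: g.latency_ms) if healthy else None


groups = [
    EndpointGroup("us-east-1", 80.0, healthy=True),
    EndpointGroup("eu-west-1", 20.0, healthy=True),
    EndpointGroup("ap-south-1", 150.0, healthy=True),
]
normal = pick_endpoint_group(groups)    # nearest group wins
groups[1].healthy = False               # simulate a regional outage
failover = pick_endpoint_group(groups)  # traffic shifts to the next best group
```

Clients keep using the same static anycast IPs throughout; only the routing decision behind them changes.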
Edge Security
WAF Integration
# WAFv2 Web ACL for CloudFront
resource "aws_wafv2_web_acl" "cloudfront" {
  name     = "cloudfront-waf"
  scope    = "CLOUDFRONT"  # Must be CLOUDFRONT for CloudFront
  provider = aws.us_east_1 # WAF for CloudFront must be in us-east-1

  default_action {
    allow {}
  }

  # AWS Managed Rules - Common threats
  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 1

    # Nested blocks cannot be written in single-line form in HCL
    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSet"
    }
  }

  # Rate limiting
  rule {
    name     = "RateLimitRule"
    priority = 2

    action {
      block {}
    }

    statement {
      rate_based_statement {
        limit              = 2000 # requests per 5 minutes per IP
        aggregate_key_type = "IP"
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitRule"
    }
  }

  # Geo blocking (optional)
  rule {
    name     = "GeoBlockRule"
    priority = 3

    action {
      block {}
    }

    statement {
      geo_match_statement {
        country_codes = ["RU", "CN", "KP"]
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "GeoBlockRule"
    }
  }

  visibility_config {
    sampled_requests_enabled   = true
    cloudwatch_metrics_enabled = true
    metric_name                = "CloudFrontWAF"
  }
}
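The rate-based rule above blocks an IP once it exceeds 2000 requests in a 5-minute window. A conceptual model of that behavior is a per-IP sliding window, sketched below. This is not how AWS WAF is implemented internally (its counting is periodic and approximate); the `RateLimiter` class is purely illustrative.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class RateLimiter:
    """Per-IP sliding-window counter: allow up to `limit` requests per window."""

    def __init__(self, limit: int = 2000, window_seconds: int = 300) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        window = self.hits[ip]
        # Drop timestamps that have aged out of the window
        while window and window[0] <= now - self.window_seconds:
            window.popleft()
        window.append(now)  # blocked requests still count toward the rate
        return len(window) <= self.limit


limiter = RateLimiter(limit=3, window_seconds=10)
results = [limiter.allow("203.0.113.7", t) for t in (0, 1, 2, 3)]
after_window = limiter.allow("203.0.113.7", 20)
```

With a limit of 3, the fourth request inside the window is rejected, and the client is allowed again once its earlier requests age out.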
🎯 Interview Questions
Q1: How do you optimize CloudFront cache hit ratio?
Strategies:
- Normalize cache key:
  - Remove unnecessary query strings
  - Normalize header values
  - Don't include cookies for static content
- Use versioned URLs:
  - /js/app.v2.3.1.js instead of /js/app.js?v=2.3.1
  - Allows long TTL without invalidation
- Configure appropriate TTLs:
  - Static assets: 1 year
  - HTML: 1 hour to 1 day
  - API: 0 or short TTL
- Use Origin Shield:
  - Additional caching layer
  - Reduces origin requests
- Monitor metrics:
  - Cache hit ratio in CloudWatch
  - Analyze popular request reports
Q2: When would you use Global Accelerator over CloudFront?
Use Global Accelerator when:
- TCP/UDP traffic (not HTTP)
- Gaming, IoT, VoIP applications
- Need static anycast IPs
- Need fast, health-based failover between regions
- No caching needed
Use CloudFront when:
- HTTP/HTTPS traffic
- Need content caching
- Edge compute (Lambda@Edge)
- Static website hosting
Use both:
- CloudFront for web traffic
- Global Accelerator for WebSocket/gaming
Q3: How do you secure content on CloudFront?
Security layers:
- HTTPS only: Redirect HTTP to HTTPS
- Origin Access Control: S3 only accessible via CloudFront
- Signed URLs/Cookies: Restrict access to authenticated users
- WAF: Block common attacks, rate limiting
- Shield: DDoS protection (Standard free, Advanced paid)
- Field-Level Encryption: Encrypt sensitive form data
- Geo-restrictions: Block/allow by country
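Signed URLs work by attaching a signed policy statement to the URL. As a rough illustration, the sketch below builds the "canned policy" JSON that grants access to one URL until an expiry time. The signing step itself (an RSA signature with the private key of a trusted key pair, for which botocore ships a `CloudFrontSigner` helper) is deliberately left out; `canned_policy` and the example URL are hypothetical.

```python
import json
from datetime import datetime, timezone


def canned_policy(url: str, expires_at: datetime) -> str:
    """Build the compact canned-policy JSON for a single resource and expiry."""
    policy = {
        "Statement": [
            {
                "Resource": url,
                "Condition": {
                    "DateLessThan": {"AWS:EpochTime": int(expires_at.timestamp())}
                },
            }
        ]
    }
    # The policy is serialized without whitespace before it is signed
    return json.dumps(policy, separators=(",", ":"))


policy = canned_policy(
    "https://d111111abcdef8.cloudfront.net/private/report.pdf",
    datetime(2030, 1, 1, tzinfo=timezone.utc),
)
```

A short expiry keeps a leaked URL useful only briefly, which is the main design lever when restricting content to authenticated users.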
Q4: CloudFront Functions vs Lambda@Edge - when to use each?
CloudFront Functions:
- Simple, sub-millisecond operations
- URL rewrites, header manipulation
- Cost-effective at high scale
- No network access needed
Lambda@Edge:
- Complex logic, >1ms execution
- Need to access external services
- Image processing, personalization
- Request body access needed
Cost comparison (request charges for about 1 billion requests at the per-million prices in the comparison table):
- CloudFront Functions: ~$100
- Lambda@Edge: ~$600+
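The cost figures above follow from the per-million request prices in the comparison table. A quick check, assuming 1 billion requests per month and counting request charges only (Lambda@Edge also bills for duration, which is why its figure carries a "+"):

```python
def request_cost_usd(requests: int, price_per_million_usd: float) -> float:
    """Request charges only; Lambda@Edge duration charges are not included."""
    return requests / 1_000_000 * price_per_million_usd


monthly_requests = 1_000_000_000  # assumed volume for the comparison
cf_functions_cost = request_cost_usd(monthly_requests, 0.10)
lambda_edge_cost = request_cost_usd(monthly_requests, 0.60)
```

At this volume the request charges alone differ by a factor of six, before any Lambda@Edge duration cost is added.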
Q5: How do you handle cache invalidation at scale?
Best practices:
- Avoid invalidation: Use versioned URLs
  - /js/app.abc123.js (content hash in filename)
- Invalidate smartly:
  - Specific paths, not wildcards
  - Batch invalidations
  - First 1000/month free
- Short TTL for dynamic content:
  - Let cache expire naturally
  - Use Cache-Control headers
- Origin Shield:
  - Central cache layer in front of the origin
  - Limits origin load while edge caches refill after an invalidation
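Batching and cost can both be estimated before any invalidation is submitted. The helpers below deduplicate paths, split them into batches, and estimate the monthly charge from the free tier and per-path price quoted in this module. The function names and the batch size are illustrative assumptions.

```python
from typing import Iterator, List

FREE_PATHS_PER_MONTH = 1000
PRICE_PER_EXTRA_PATH_USD = 0.005


def batches(paths: List[str], size: int) -> Iterator[List[str]]:
    """Yield deduplicated paths in order, in chunks of at most `size`."""
    unique = list(dict.fromkeys(paths))  # preserves first-seen order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


def invalidation_cost_usd(paths_this_month: int) -> float:
    """Estimate the monthly charge after the free tier is used up."""
    billable = max(0, paths_this_month - FREE_PATHS_PER_MONTH)
    return billable * PRICE_PER_EXTRA_PATH_USD


chunks = list(batches(["/a.css", "/b.js", "/a.css", "/c.png"], size=2))
estimate = invalidation_cost_usd(3000)
```

Each chunk could then be passed to an invalidation call such as the `invalidate_paths` helper shown earlier in this module.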
🧪 Hands-On Lab: Deploy Static Site with CloudFront