In this hands-on project, you’ll build a complete serverless URL shortener service. This is a common system design interview question and demonstrates key serverless patterns. Think of it as building a mini Bit.ly: a deceptively simple product that touches nearly every serverless building block AWS offers, from API design to data modeling to edge caching.
What You’ll Build:
REST API with API Gateway (your front door — all traffic enters here)
Lambda functions for business logic (stateless compute that scales to zero)
DynamoDB for data persistence (single-digit millisecond lookups at any scale)
CloudFront for caching and performance (can reduce Lambda invocations by 80%+ for frequently requested redirects)
Complete observability with CloudWatch and X-Ray (because you cannot improve what you cannot measure)
Skills Demonstrated:
Serverless architecture design
DynamoDB data modeling (a simple key-value table keyed on the short code, applied to a real use case)
API design and implementation
Cost optimization (this architecture costs roughly the price of a coffee per month at moderate traffic)
# Table: url-shortener
# Partition Key: short_code (String)
# No Sort Key needed
table_schema = {
    "TableName": "url-shortener",
    "KeySchema": [
        # short_code is the partition key -- each short URL maps to exactly one item,
        # so a simple primary key (no sort key) is the right choice here.
        {"AttributeName": "short_code", "KeyType": "HASH"}
    ],
    "AttributeDefinitions": [
        {"AttributeName": "short_code", "AttributeType": "S"}
    ],
    # PAY_PER_REQUEST (on-demand) is ideal for a URL shortener because traffic
    # is unpredictable -- viral links can spike 1000x in minutes.
    # Cost tip: once traffic is steady and predictable, provisioned mode is often
    # considerably cheaper; compare both against your measured usage.
    "BillingMode": "PAY_PER_REQUEST",
    # Note: TTL is not a CreateTable parameter. Enable it after the table exists
    # with a separate UpdateTimeToLive call that uses this specification.
    "TimeToLiveSpecification": {
        # TTL lets DynamoDB automatically delete expired URLs at no cost.
        # Deletion runs in the background and is not instant: expired items are
        # typically removed within a few days of expiration.
        "AttributeName": "expires_at",
        "Enabled": True
    }
}

# Item structure
item = {
    "short_code": "abc123",  # Partition key -- the 6-char code in the URL
    "long_url": "https://example.com/very/long/url/path",
    "created_at": "2024-01-15T10:00:00Z",
    "expires_at": 1707904800,  # TTL in epoch seconds (30 days after created_at)
    "click_count": 0,  # Atomic counter updated on each redirect
    "user_id": "user_456"  # Optional -- enables per-user analytics later
}
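The schema dictionary above combines settings that belong to two separate DynamoDB API calls: CreateTable does not accept a TTL setting, so TTL is enabled afterwards with UpdateTimeToLive. The following is a minimal sketch of that split, assuming a boto3 DynamoDB client is passed in. The helper names `split_schema` and `provision_table` are hypothetical and not part of the original project.

```python
from typing import Any, Dict, Tuple

# Same shape as the schema dictionary shown above.
table_schema = {
    "TableName": "url-shortener",
    "KeySchema": [{"AttributeName": "short_code", "KeyType": "HASH"}],
    "AttributeDefinitions": [{"AttributeName": "short_code", "AttributeType": "S"}],
    "BillingMode": "PAY_PER_REQUEST",
    "TimeToLiveSpecification": {"AttributeName": "expires_at", "Enabled": True},
}


def split_schema(schema: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a combined schema into CreateTable kwargs and UpdateTimeToLive kwargs."""
    create_kwargs = {k: v for k, v in schema.items() if k != "TimeToLiveSpecification"}
    ttl_kwargs = {
        "TableName": schema["TableName"],
        "TimeToLiveSpecification": schema["TimeToLiveSpecification"],
    }
    return create_kwargs, ttl_kwargs


def provision_table(client: Any, schema: Dict[str, Any]) -> None:
    """Create the table, wait until it exists, then enable TTL (hypothetical helper)."""
    create_kwargs, ttl_kwargs = split_schema(schema)
    client.create_table(**create_kwargs)
    # TTL can only be configured on an existing table, so wait for creation first.
    client.get_waiter("table_exists").wait(TableName=schema["TableName"])
    client.update_time_to_live(**ttl_kwargs)
```

With AWS credentials configured, this would be called as `provision_table(boto3.client("dynamodb"), table_schema)`. Passing the client in as a parameter also makes the function easy to exercise with a stub in tests.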
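import hashlib
import json
import time

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('url-shortener')


def generate_short_code(url: str) -> str:
    """Generate 6-character short code from URL hash.

    Why SHA-256 + timestamp? The timestamp prevents collisions when the same
    URL is shortened multiple times (each submission gets its own short code).
    6 hex chars = 16^6 = ~16.7 million unique codes. For higher volume,
    increase to 7-8 chars or switch to base62 encoding for shorter URLs.

    Common mistake: using only the URL without a salt -- identical URLs
    would always produce the same code, breaking per-user analytics.
    """
    hash_object = hashlib.sha256(f"{url}{time.time()}".encode())
    return hash_object.hexdigest()[:6]


def lambda_handler(event, context):
    try:
        body = json.loads(event.get('body') or '{}')
        long_url = body.get('url') if isinstance(body, dict) else None

        # Validate URL (scheme check only)
        if not isinstance(long_url, str) or not long_url.startswith(('http://', 'https://')):
            return {
                'statusCode': 400,
                'body': json.dumps({'error': 'Invalid URL'})
            }

        # Generate short code
        short_code = generate_short_code(long_url)

        # Store in DynamoDB. With ~16.7 million possible codes, collisions become
        # likely after a few thousand URLs, so the condition makes the write fail
        # instead of silently overwriting an existing mapping.
        table.put_item(
            Item={
                'short_code': short_code,
                'long_url': long_url,
                # ISO 8601 UTC string, matching the item structure shown earlier
                'created_at': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
                'click_count': 0
            },
            ConditionExpression='attribute_not_exists(short_code)'
        )

        short_url = f"https://short.ly/{short_code}"

        return {
            'statusCode': 201,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({
                'short_url': short_url,
                'short_code': short_code
            })
        }
    except json.JSONDecodeError:
        return {
            'statusCode': 400,
            'body': json.dumps({'error': 'Request body must be valid JSON'})
        }
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            # Rare short-code collision: the caller can simply retry.
            return {
                'statusCode': 503,
                'body': json.dumps({'error': 'Short code collision, please retry'})
            }
        print(f"DynamoDB error: {e}")
        return {
            'statusCode': 500,
            'body': json.dumps({'error': 'Internal server error'})
        }
    except Exception as e:
        # Log details server-side; do not leak internals to the caller.
        print(f"Unexpected error: {e}")
        return {
            'statusCode': 500,
            'body': json.dumps({'error': 'Internal server error'})
        }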
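The docstring above suggests base62 encoding as a way to get more codes out of the same length. Below is a minimal sketch of what that could look like, assuming the same SHA-256 + timestamp input as `generate_short_code`. The names `base62_encode` and `generate_short_code_base62` are hypothetical, and the digit ordering of the alphabet is an arbitrary choice. With 62 symbols, 6 characters give 62^6 (about 56.8 billion) codes instead of about 16.7 million.

```python
import hashlib
import string
import time

# 62 URL-safe symbols: 0-9, a-z, A-Z (ordering is an arbitrary choice).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(number: int) -> str:
    """Encode a non-negative integer using the 62-symbol alphabet."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def generate_short_code_base62(url: str, length: int = 6) -> str:
    """Derive a fixed-length base62 code from SHA-256(url + timestamp)."""
    digest = hashlib.sha256(f"{url}{time.time()}".encode()).digest()
    # Reduce the 256-bit digest into the range of `length` base62 digits.
    value = int.from_bytes(digest, "big") % (62 ** length)
    # Left-pad with the zero symbol so every code has exactly `length` chars.
    return base62_encode(value).rjust(length, ALPHABET[0])
```

A larger code space reduces collisions but does not eliminate them, so a conditional write on `short_code` is still worth keeping.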
import json

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('url-shortener')


def lambda_handler(event, context):
    short_code = event['pathParameters']['code']

    # Get URL from DynamoDB
    response = table.get_item(Key={'short_code': short_code})

    if 'Item' not in response:
        return {
            'statusCode': 404,
            'body': json.dumps({'error': 'URL not found'})
        }

    long_url = response['Item']['long_url']

    # Increment click count with a single atomic UpdateItem call.
    # Common mistake: reading the item, adding 1 in application code, and writing
    # it back with put_item. That read-modify-write loses updates under concurrency.
    # A single UpdateItem (ADD, or SET click_count = click_count + :inc) is applied
    # atomically by DynamoDB; ADD also creates the attribute if it is missing.
    # Production tip: for high-traffic URLs, decouple analytics into a Kinesis stream
    # so the redirect path stays fast (don't let analytics slow down the user).
    table.update_item(
        Key={'short_code': short_code},
        UpdateExpression='ADD click_count :inc',
        ExpressionAttributeValues={':inc': 1}
    )

    # Return 301 (permanent redirect) so browsers and the CDN cache the mapping.
    # 301 vs 302: use 301 if the mapping is stable (better for SEO and caching).
    # Use 302 if you need to track every click (302s are not cached by default).
    # Any redirect served from a browser or CloudFront cache never reaches this
    # function, so click_count undercounts cached hits.
    # Cache-Control: max-age=300 lets CloudFront serve cached redirects for up to
    # 5 minutes, which can reduce Lambda invocations by 80%+ for popular links.
    return {
        'statusCode': 301,
        'headers': {
            'Location': long_url,
            'Cache-Control': 'max-age=300'
        }
    }
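The "80%+" figure in the comments above can be sanity-checked with simple arithmetic. The sketch below assumes an idealized model: one cache, steady traffic to a single short code, and exactly one origin request per `max-age` window. The function name `estimated_cache_hit_ratio` is hypothetical. Real CloudFront distributions have many edge locations with separate caches, so actual hit ratios will be lower for the same traffic.

```python
def estimated_cache_hit_ratio(requests_per_second: float, ttl_seconds: float = 300.0) -> float:
    """Estimate the fraction of requests served from cache for one short code.

    Idealized model: in each TTL window the first request goes to Lambda and
    every later request is a cache hit, so hits = 1 - 1 / requests_per_window.
    """
    requests_per_window = requests_per_second * ttl_seconds
    if requests_per_window <= 1:
        # At most one request per window: nothing is ever served from cache.
        return 0.0
    return 1.0 - 1.0 / requests_per_window


# Under this model, 5 requests per 5-minute window is enough for an 80% hit ratio.
five_per_window = estimated_cache_hit_ratio(5 / 300)
one_per_second = estimated_cache_hit_ratio(1.0)
```

Under this model a link needs only about one request per minute to cross 80%, which is why the saving applies to popular links and not to the long tail of rarely clicked ones.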
# Lambda scales automatically
# Default: 1000 concurrent executions per region (an account quota that can be raised)
# Reserved concurrency - Guarantee capacity
# Provisioned concurrency - Eliminate cold starts
concurrency_config = {
    "redirect_function": {
        # Reserved concurrency: guarantees this function can always use 500 slots.
        # This also CAPS it at 500 -- protecting other functions from being starved.
        "reserved_concurrency": 500,
        # Provisioned concurrency: keeps 100 execution environments warm at all times.
        # Eliminates cold starts for the first 100 concurrent requests.
        # Cost tip: at 128 MB memory, 100 provisioned instances keep 12.5 GB warm
        # around the clock, which is on the order of $100+/month at published
        # us-east-1 rates; check current pricing for your region.
        # Only use this for latency-critical paths (like the redirect function).
        "provisioned_concurrency": 100
    }
}
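The cost of keeping provisioned concurrency warm is easy to estimate, since it is billed per GB-second for as long as it is configured. The sketch below is a rough calculation only: the rate constant is an assumption based on a published us-east-1 x86 price and should be checked against the current AWS Lambda pricing page, and the function name is hypothetical. It excludes request and duration charges.

```python
# ASSUMPTION: provisioned-concurrency rate in USD per GB-second (us-east-1, x86).
# Verify against the current AWS Lambda pricing page before relying on it.
ASSUMED_PRICE_PER_GB_SECOND = 0.0000041667


def provisioned_concurrency_monthly_cost(
    instances: int,
    memory_mb: int,
    price_per_gb_second: float = ASSUMED_PRICE_PER_GB_SECOND,
    days: int = 30,
) -> float:
    """Estimate the monthly cost of keeping `instances` environments warm."""
    gb_kept_warm = instances * memory_mb / 1024
    seconds = days * 24 * 3600
    return gb_kept_warm * seconds * price_per_gb_second


# 100 instances at 128 MB = 12.5 GB warm for 2,592,000 seconds in a 30-day month.
redirect_cost = provisioned_concurrency_monthly_cost(instances=100, memory_mb=128)
```

Because the charge scales linearly with both instance count and memory, trimming either one (for example, provisioning only for the typical baseline and letting on-demand concurrency absorb spikes) reduces the bill proportionally.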
Key Takeaway: Serverless architectures eliminate server management, scale automatically, and are cost-effective for variable workloads. Start simple and add complexity (caching, analytics) as needed.