Module Overview
Estimated Time: 4-5 hours | Difficulty: Intermediate | Prerequisites: Core Concepts
This module covers all AWS compute services in depth. The core decision you will face on every project: how much operational overhead are you willing to trade for how much control? EC2 gives you the keys to the entire machine but you own every patch and scaling decision. Lambda takes all of that away but limits you to 15-minute executions and specific runtimes. ECS and EKS sit in the middle. You’ll learn when to use each service, how to optimize for cost and performance, and real-world architecture patterns.
What You’ll Learn:
EC2 instance types, AMIs, and advanced configurations
Lambda functions for serverless computing
Container orchestration with ECS and EKS
Auto Scaling strategies for elasticity
Cost optimization techniques for compute
Compute Service Selection Guide
Choose the right compute service for your workload:
┌──────────────────────────────────────────────────────────────────────┐
│ AWS Compute Decision Tree │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ What type of workload? │
│ │ │
│ ├─── Short-lived, event-driven ──────► Lambda (Serverless) │
│ │ (< 15 min, stateless) │
│ │ │
│ ├─── Containers needed ─────┬────────► ECS (AWS Native) │
│ │ │ │
│ │ └────────► EKS (Kubernetes) │
│ │ │
│ ├─── Full control needed ────────────► EC2 (Virtual Servers) │
│ │ (OS, networking, GPUs) │
│ │ │
│ └─── Simple web app ─────────────────► Elastic Beanstalk │
│ (PaaS) or App Runner │
│ │
│ Control vs Simplicity Spectrum: │
│ ────────────────────────────── │
│ More Control ◄────────────────────────────────────► Less Management │
│ EC2 │ ECS/EKS │ Fargate │ Lambda │ App Runner │
│ │
└──────────────────────────────────────────────────────────────────────┘
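The decision tree above can be sketched as a small function. This is a rough illustration only: the `Workload` fields and the `suggest_compute` name are hypothetical, and real decisions weigh cost, team skills, and compliance as well.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload; field names are illustrative."""
    max_runtime_minutes: float
    event_driven: bool = False
    stateless: bool = False
    containerized: bool = False
    needs_kubernetes: bool = False
    needs_os_or_gpu_control: bool = False


def suggest_compute(w: Workload) -> str:
    """Walk the decision tree top to bottom and return the first match."""
    # Short-lived, event-driven, stateless work fits Lambda's 15-minute limit
    if w.event_driven and w.stateless and w.max_runtime_minutes < 15:
        return "Lambda"
    # Containers: EKS when Kubernetes is required, otherwise ECS
    if w.containerized:
        return "EKS" if w.needs_kubernetes else "ECS"
    # Full control over OS, networking, or GPUs
    if w.needs_os_or_gpu_control:
        return "EC2"
    # Simple web app
    return "Elastic Beanstalk or App Runner"


print(suggest_compute(Workload(max_runtime_minutes=2, event_driven=True, stateless=True)))
```

The checks run in the same order as the branches in the diagram, so a workload that matches several branches gets the first one listed.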
EC2 (Elastic Compute Cloud)
Virtual servers in the cloud. The most fundamental and flexible AWS compute service. EC2 is the “you can build anything” option — it gives you a full virtual machine with your choice of OS, and you can install any software that runs on Linux or Windows. The trade-off is that you are responsible for patching, scaling, and availability. Despite the rise of serverless, EC2 still runs the majority of production workloads on AWS because many applications need persistent state, specific kernel configurations, or GPU hardware that higher-level services cannot provide.
Instance Type Deep Dive
AWS offers 500+ instance types optimized for different workloads:
┌──────────────────────────────────────────────────────────────────────┐
│ EC2 Instance Families │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ GENERAL PURPOSE (M, T) │
│ ────────────────────── │
│ M-series: Balanced compute, memory, networking │
│ • m5.large 2 vCPU, 8 GB → Web servers, small databases │
│ • m5.xlarge 4 vCPU, 16 GB → Application servers │
│ • m5.4xlarge 16 vCPU, 64 GB → Medium workloads │
│ • m7g.* Graviton3 (ARM) → up to 40% better price/performance │
│ │
│ T-series: Burstable performance (for variable workloads) │
│ • t3.micro 2 vCPU, 1 GB → Free tier, dev/test │
│ • t3.medium 2 vCPU, 4 GB → Light production │
│ • t3.xlarge 4 vCPU, 16 GB → Moderate workloads │
│ │
│ COMPUTE OPTIMIZED (C) │
│ ───────────────────── │
│ High CPU-to-memory ratio for compute-intensive tasks │
│ • c5.large 2 vCPU, 4 GB → Batch processing │
│ • c5.4xlarge 16 vCPU, 32 GB → Scientific computing │
│ • c7g.* Graviton3 → Best compute price/perf │
│ │
│ MEMORY OPTIMIZED (R, X) │
│ ─────────────────────── │
│ High memory-to-CPU ratio for in-memory workloads │
│ • r5.large 2 vCPU, 16 GB → Caching, in-memory DB │
│ • r5.4xlarge 16 vCPU,128 GB → SAP HANA, Redis │
│ • x1e.xlarge 4 vCPU,122 GB → Extreme memory │
│ │
│ STORAGE OPTIMIZED (I, D) │
│ ──────────────────────── │
│ High sequential read/write access to large datasets │
│ • i3.large 2 vCPU, 15 GB, 475 GB NVMe → Databases │
│ • d2.xlarge 4 vCPU, 31 GB, 6 TB HDD → Data warehousing │
│ │
│ ACCELERATED COMPUTING (P, G, Inf) │
│ ───────────────────────────────── │
│ GPU and custom hardware for ML/graphics │
│ • p4d.24xlarge 8x A100 GPUs → ML training │
│ • g4dn.xlarge 1x T4 GPU → ML inference, graphics │
│ • inf1.xlarge 1x Inferentia → Cost-effective inference │
│ │
└──────────────────────────────────────────────────────────────────────┘
Instance Naming Convention
# Decoding instance type names
import re

def decode_instance_type(instance_type: str) -> dict:
    """
    Example: m5dn.2xlarge
      m       = Instance family (General Purpose)
      5       = Generation (5th gen, higher = newer)
      d       = Additional capability (NVMe SSD)
      n       = Network optimized
      2xlarge = Size (vCPUs and memory)

    Size progression:
      nano → micro → small → medium → large → xlarge → 2xlarge → ... → metal
    """
    families = {
        'm': 'General Purpose',
        't': 'Burstable',
        'c': 'Compute Optimized',
        'r': 'Memory Optimized',
        'x': 'Memory Optimized (Extreme)',
        'i': 'Storage Optimized (NVMe)',
        'd': 'Storage Optimized (Dense)',
        'p': 'GPU (Training)',
        'g': 'GPU (Graphics/Inference)',
        'inf': 'ML Inference (Inferentia)',
    }
    modifiers = {
        'a': 'AMD processor',
        'i': 'Intel processor',
        'g': 'AWS Graviton (ARM)',
        'd': 'NVMe SSD storage',
        'n': 'Network optimized',
        'e': 'Extended memory',
        'z': 'High frequency',
    }
    # Split "m5dn.2xlarge" into family letters, generation digits,
    # modifier letters, and size
    match = re.fullmatch(r'([a-z]+)(\d+)([a-z]*)\.(.+)', instance_type)
    if match is None:
        return {'family': 'Unknown'}
    family, generation, extras, size = match.groups()
    return {
        'family': families.get(family, 'Unknown'),
        'generation': int(generation),
        'capabilities': [modifiers.get(ch, 'Unknown') for ch in extras],
        'size': size,
    }
Pro Tip: Graviton (ARM) instances (m7g, c7g, r7g) can deliver up to 40% better price/performance on compatible workloads. Most applications (Python, Node.js, Go, Java, containerized apps) work without modification. The main exceptions are software with x86 assembly optimizations or Windows-only binaries. If your application runs in a container, testing on Graviton is as simple as building a multi-arch image.
T-Series Burstable Instances
T-series instances use CPU credits for burstable performance:
┌────────────────────────────────────────────────────────────────────┐
│ T3 CPU Credit System │
├────────────────────────────────────────────────────────────────────┤
│ │
│ Baseline Performance: │
│ • t3.micro: 10% CPU baseline (earns 12 credits/hour) │
│ • t3.small: 20% CPU baseline (earns 24 credits/hour) │
│ • t3.medium: 20% CPU baseline (earns 24 credits/hour) │
│ • t3.large: 30% CPU baseline (earns 36 credits/hour) │
│ │
│ Credit Usage: │
│ • 1 credit = 1 vCPU at 100% for 1 minute │
│ • Below baseline: Earn credits │
│ • Above baseline: Spend credits │
│ • Earned credits do not expire on a running instance │
│ • Max credit balance: credits earned in 24 hours │
│ │
│ CPU Usage Graph: │
│ 100% ┤ ████ │
│ 80% ┤ █ █ Burst period │
│ 60% ┤ █ █ (spending credits) │
│ 40% ┤ █ █ │
│ 20% ┤──────█──────────█────── ← Baseline │
│ 0% ┤ │
│ └─────────────────────────► Time │
│ │
│ Modes: │
│ • Standard: Can burst only with credits │
│ • Unlimited: Can burst beyond credits (pay extra) │
│ │
└────────────────────────────────────────────────────────────────────┘
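The credit arithmetic above can be checked with a short calculation. This is a minimal sketch that assumes the baseline percentage applies per vCPU (T3 instances up to t3.large have 2 vCPUs); the function names are illustrative, not part of any AWS SDK.

```python
def credits_earned_per_hour(vcpus: int, baseline_pct: float) -> float:
    """1 credit = 1 vCPU at 100% for 1 minute, so earning scales with the baseline."""
    return vcpus * baseline_pct * 60 / 100


def burst_minutes(balance: float, vcpus: int, baseline_pct: float, usage_pct: float) -> float:
    """Minutes an instance can hold usage_pct before the credit balance hits zero."""
    net_drain_points = vcpus * (usage_pct - baseline_pct)
    if net_drain_points <= 0:
        return float("inf")  # At or below baseline: credits accrue, never drain
    return balance * 100 / net_drain_points


# t3.medium: 2 vCPUs at a 20% baseline
print(credits_earned_per_hour(2, 20))   # 24.0
# With 24 credits banked, both vCPUs at 100% last 15 minutes
print(burst_minutes(24, 2, 20, 100))    # 15.0
```

In Standard mode the instance is throttled back to baseline once the balance reaches zero; in Unlimited mode it keeps bursting and the surplus is billed.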
AMI (Amazon Machine Image)
AMIs are templates containing OS, application server, and applications.
# AMI Selection Best Practices
ami_best_practices = {
    "use_aws_provided": [
        "Amazon Linux 2023",    # AWS optimized, free
        "Ubuntu 22.04 LTS",     # Popular, well-supported
        "Windows Server 2022",  # For .NET workloads
    ],
    "create_custom_ami_when": [
        "Need pre-installed software",
        "Custom security hardening",
        "Faster instance boot time",
        "Consistent deployments",
    ],
    "golden_ami_pipeline": """
        Base AMI → Install packages → Configure → Test → Create AMI → Share
        Automate with:
        - EC2 Image Builder (AWS native)
        - Packer (HashiCorp)
    """,
}
# Launch EC2 with specific AMI
import boto3

ec2 = boto3.client('ec2')
response = ec2.run_instances(
    # Placeholder ID: AMI IDs are region-specific, so look up the current
    # Amazon Linux 2023 AMI for your region before running this.
    ImageId='ami-0c55b159cbfafe1f0',
    InstanceType='t3.medium',
    MinCount=1,
    MaxCount=1,
    KeyName='my-key-pair',
    SecurityGroupIds=['sg-0123456789abcdef0'],
    SubnetId='subnet-0123456789abcdef0',
    TagSpecifications=[
        {
            'ResourceType': 'instance',
            'Tags': [
                {'Key': 'Name', 'Value': 'WebServer'},
                {'Key': 'Environment', 'Value': 'Production'},
            ]
        }
    ],
    # User data runs once as root on first boot
    UserData='''#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello from $(hostname)</h1>" > /var/www/html/index.html
'''
)
Access instance info from within the instance:
# IMDSv2 (recommended - more secure)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
# Get instance metadata
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/public-ipv4
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/iam/security-credentials/MyRole
# Python - Using requests with IMDSv2
import requests

def get_instance_metadata(path: str) -> str:
    """Get EC2 instance metadata using IMDSv2."""
    # Get a session token (TTL of 21600 seconds = 6 hours)
    token_response = requests.put(
        "http://169.254.169.254/latest/api/token",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
        timeout=2,
    )
    token_response.raise_for_status()
    token = token_response.text
    # Get metadata, presenting the token in the request header
    response = requests.get(
        f"http://169.254.169.254/latest/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
        timeout=2,
    )
    response.raise_for_status()
    return response.text

# Usage (only works from inside an EC2 instance)
instance_id = get_instance_metadata("instance-id")
public_ip = get_instance_metadata("public-ipv4")
az = get_instance_metadata("placement/availability-zone")
Lambda (Serverless Functions)
Run code without provisioning servers. Pay only for compute time used. Lambda is the opposite end of the spectrum from EC2: you give up control of the operating system, runtime environment, and scaling decisions, and in return AWS handles all of that for you. The result is that your compute cost drops to zero when nobody is using your application, which is impossible with EC2 instances that bill for every second they run, whether they are serving traffic or sitting idle.
Lambda Architecture
┌────────────────────────────────────────────────────────────────────────┐
│ Lambda Execution Model │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ Request arrives │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Lambda Service │ │
│ │ │ │
│ │ Is there a warm container? │ │
│ │ │ │ │
│ │ ├── YES ──► Reuse container (warm start: ~1ms) │ │
│ │ │ │ │
│ │ └── NO ───► Create container (cold start: 100ms-10s) │ │
│ │ │ │ │
│ │ ├── Download code from S3 │ │
│ │ ├── Start runtime (Python, Node, etc.) │ │
│ │ ├── Run initialization code │ │
│ │ └── Execute handler │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ Cold Start Factors: │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Runtime │ Cold Start │ Notes │ │
│ │─────────────────────────────────────────────────────────────────│ │
│ │ Python │ 100-300ms │ Fast, great for most use cases │ │
│ │ Node.js │ 100-300ms │ Fast, good for APIs │ │
│ │ Go │ 50-100ms │ Fastest cold starts │ │
│ │ Java │ 3-10s │ Slow, use SnapStart or GraalVM │ │
│ │ .NET │ 2-5s │ Slower, consider ReadyToRun │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ Mitigation Strategies: │
│ • Provisioned Concurrency (pre-warm containers) │
│ • Smaller deployment packages │
│ • Move initialization outside handler │
│ • Use SnapStart for Java │
│ │
└────────────────────────────────────────────────────────────────────────┘
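One way to observe the warm/cold distinction from inside a function is to keep state at module scope, which is only initialized when a new execution environment is created. The sketch below is illustrative; the returned field names are assumptions, not a Lambda convention.

```python
import time

# Module scope runs once per execution environment, i.e. during the cold start.
_INIT_TIME = time.monotonic()
_invocations = 0


def lambda_handler(event, context):
    """Report whether this invocation was the first one in its environment."""
    global _invocations
    _invocations += 1
    return {
        "cold_start": _invocations == 1,  # True only for the first call
        "invocation": _invocations,
        "environment_age_seconds": round(time.monotonic() - _INIT_TIME, 3),
    }
```

Logging the `cold_start` flag alongside request latency makes it possible to see how often cold starts actually occur before paying for Provisioned Concurrency.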
Lambda Function Best Practices
import json
import boto3
import os
from datetime import datetime, timezone

# Initialize OUTSIDE the handler function -- this runs once per execution
# environment during the cold start, then is reused across warm invocations.
# This single pattern avoids one of the most common Lambda performance problems.
# Common mistake: creating boto3 clients inside the handler, which repeats
# client setup and connection establishment on every single invocation.
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])


def lambda_handler(event, context):
    """
    Best Practices:
    1. Keep handlers small and focused
    2. Initialize SDK clients outside handler
    3. Use environment variables for configuration
    4. Handle errors gracefully
    5. Log structured data for observability
    """
    # Log incoming event (structured logging)
    print(json.dumps({
        'level': 'INFO',
        'message': 'Processing request',
        'request_id': context.aws_request_id,
        'event_type': event.get('httpMethod', 'unknown'),
        'timestamp': datetime.now(timezone.utc).isoformat()
    }))
    try:
        # Parse input: API Gateway passes the body as a JSON string
        if 'body' in event and event['body']:
            body = json.loads(event['body'])
        else:
            body = event
        # Process request
        user_id = body.get('user_id')
        if not user_id:
            return response(400, {'error': 'user_id is required'})
        # Database operation
        result = table.get_item(Key={'user_id': user_id})
        if 'Item' not in result:
            return response(404, {'error': 'User not found'})
        return response(200, result['Item'])
    except json.JSONDecodeError:
        return response(400, {'error': 'Invalid JSON'})
    except Exception as e:
        # Log error for debugging
        print(json.dumps({
            'level': 'ERROR',
            'message': str(e),
            'request_id': context.aws_request_id
        }))
        return response(500, {'error': 'Internal server error'})


def response(status_code: int, body: dict) -> dict:
    """Create API Gateway compatible response."""
    return {
        'statusCode': status_code,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'
        },
        # default=str: DynamoDB returns numbers as Decimal, which json
        # cannot serialize on its own
        'body': json.dumps(body, default=str)
    }
Lambda Limits and Quotas
Resource              │ Limit                          │ Notes
──────────────────────┼────────────────────────────────┼────────────────────────────────────────
Timeout               │ 15 minutes                     │ Use Step Functions for longer workflows
Memory                │ 128 MB - 10 GB                 │ More memory = more CPU proportionally
Package Size          │ 50 MB (zip), 250 MB (unzipped) │ Use layers for dependencies
Concurrent Executions │ 1,000 (default)                │ Can request increase
Payload Size          │ 6 MB (sync), 256 KB (async)    │ Use S3 for larger payloads
Ephemeral Storage     │ 512 MB - 10 GB (/tmp)          │ For temporary files
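The payload limits are the ones most often hit in practice. A minimal sketch of a pre-flight check follows, assuming the 6 MB and 256 KB figures listed above; `payload_fits` is a hypothetical helper, not an SDK function.

```python
import json

SYNC_LIMIT_BYTES = 6 * 1024 * 1024   # 6 MB for synchronous invocations
ASYNC_LIMIT_BYTES = 256 * 1024       # 256 KB for asynchronous invocations


def payload_fits(payload: dict, asynchronous: bool = False) -> bool:
    """Return True if the JSON-encoded payload is within the invocation limit."""
    size = len(json.dumps(payload).encode("utf-8"))
    limit = ASYNC_LIMIT_BYTES if asynchronous else SYNC_LIMIT_BYTES
    return size <= limit


event = {"data": "x" * 300_000}
print(payload_fits(event))                     # True: well under 6 MB
print(payload_fits(event, asynchronous=True))  # False: over 256 KB
```

When the check fails, a common approach is to upload the payload to S3 and pass only the bucket and key in the event.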
Lambda with Container Images
# Dockerfile for Lambda container
FROM public.ecr.aws/lambda/python:3.11
# Install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
# Copy function code
COPY app.py ${LAMBDA_TASK_ROOT}
# Set the handler
CMD [ "app.lambda_handler" ]
# Build and push to ECR
aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_URI
docker build -t my-lambda .
docker tag my-lambda:latest $ECR_URI/my-lambda:latest
docker push $ECR_URI/my-lambda:latest
ECS (Elastic Container Service)
AWS-native container orchestration for Docker containers.
ECS Architecture Deep Dive
┌────────────────────────────────────────────────────────────────────────┐
│ ECS Architecture │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ ECS Cluster │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ Service │ │ │
│ │ │ Desired Count: 3 Running: 3 Pending: 0 │ │ │
│ │ │ │ │ │
│ │ │ Task Definition: my-app:5 │ │ │
│ │ │ • Container: nginx (256 CPU, 512 MB) │ │ │
│ │ │ • Container: app (512 CPU, 1024 MB) │ │ │
│ │ │ │ │ │
│ │ │ ┌──────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Load Balancer │ │ │ │
│ │ │ │ (ALB with Target Group) │ │ │ │
│ │ │ └─────────────────┬────────────────────────────────┘ │ │ │
│ │ │ ┌───────────┼───────────┐ │ │ │
│ │ │ │ │ │ │ │ │
│ │ │ ▼ ▼ ▼ │ │ │
│ │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ │
│ │ │ │ Task 1 │ │ Task 2 │ │ Task 3 │ │ │ │
│ │ │ │ (AZ-1a) │ │ (AZ-1b) │ │ (AZ-1c) │ │ │ │
│ │ │ └──────────┘ └──────────┘ └──────────┘ │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Launch Types: │ │
│ │ ┌────────────────────────┐ ┌────────────────────────┐ │ │
│ │ │ EC2 │ │ Fargate │ │ │
│ │ │ ──────────────────── │ │ ──────────────────── │ │ │
│ │ │ • You manage EC2 │ │ • Serverless │ │ │
│ │ │ • More control │ │ • No EC2 management │ │ │
│ │ │ • Use Reserved/Spot │ │ • Pay per task │ │ │
│ │ │ • GPU workloads │ │ • Faster scaling │ │ │
│ │ └────────────────────────┘ └────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
Task Definition Example
{
  "family": "my-web-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
      "essential": true,
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "environment": [
        { "name": "NODE_ENV", "value": "production" }
      ],
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123:secret:db-password"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/my-web-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}
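Fargate only accepts certain task-level `cpu` and `memory` pairings, so it can help to validate a task definition before registering it. The sketch below covers the combinations up to 4 vCPU as commonly documented (larger sizes exist and are omitted here); `is_valid_fargate_size` is a hypothetical helper name.

```python
# Allowed memory values (MiB) for each Fargate CPU value (CPU units), up to 4 vCPU
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(task_def: dict) -> bool:
    """Check the task-level cpu/memory strings against the allowed pairings."""
    cpu = int(task_def["cpu"])
    memory = int(task_def["memory"])
    return memory in FARGATE_MEMORY_MIB.get(cpu, [])


print(is_valid_fargate_size({"cpu": "512", "memory": "1024"}))  # True
print(is_valid_fargate_size({"cpu": "256", "memory": "4096"}))  # False
```

The task definition above uses 512 CPU units with 1024 MiB, which is the smallest memory allowed for that CPU value.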
Auto Scaling
Automatically adjust compute capacity to match demand.
Auto Scaling Strategies
┌────────────────────────────────────────────────────────────────────────┐
│ Auto Scaling Strategies │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. TARGET TRACKING (Recommended) │
│ ────────────────────────────── │
│ "Keep CPU at 50%" │
│ │
│ CPU Usage │
│ 80% ┤ ████ │
│ 60% ┤ █ █ │
│ 50% ┼─────────────────█──────█──────── Target │
│ 40% ┤ █ █ │
│ 20% ┤███████████████ ████ │
│ └────────────────────────────────► Time │
│ │
│ Automatically adds/removes instances to maintain target │
│ │
│ 2. STEP SCALING │
│ ─────────────── │
│ "If CPU > 90%, add 4. If CPU > 70%, add 2." │
│ │
│ Alarm Threshold │ Scaling Action │
│ ────────────────┼──────────────── │
│ CPU < 30% │ Remove 2 instances │
│ CPU 30-50% │ Do nothing │
│ CPU 50-70% │ Add 1 instance │
│ CPU 70-90% │ Add 2 instances │
│ CPU > 90% │ Add 4 instances │
│ │
│ 3. SCHEDULED SCALING │
│ ──────────────────── │
│ "Scale to 10 instances at 9 AM, scale to 3 at 6 PM" │
│ │
│ Instances │
│ 10 ┤ ██████████████████ │
│ 8 ┤ █ █ │
│ 6 ┤ █ █ │
│ 3 ┼███ ███████ │
│ └──────────────────────────────────► Time │
│ 6AM 9AM 5PM 8PM │
│ │
│ 4. PREDICTIVE SCALING │
│ ───────────────────── │
│ Uses ML to predict demand and scale proactively │
│ Best for cyclical patterns (daily, weekly) │
│ │
└────────────────────────────────────────────────────────────────────────┘
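The first two strategies can be expressed as simple arithmetic. The step function below follows the threshold table in the diagram; the target-tracking estimate is a simplified proportional model for intuition, not the exact algorithm AWS uses. Both function names are illustrative.

```python
import math


def step_adjustment(cpu_pct: float) -> int:
    """Map average CPU to an instance-count change, per the step table above."""
    if cpu_pct < 30:
        return -2
    if cpu_pct < 50:
        return 0
    if cpu_pct < 70:
        return 1
    if cpu_pct < 90:
        return 2
    return 4


def target_tracking_estimate(instances: int, cpu_pct: float, target_pct: float = 50.0) -> int:
    """Estimate the fleet size that would bring average CPU back to the target."""
    return max(1, math.ceil(instances * cpu_pct / target_pct))


print(step_adjustment(75))              # 2
print(target_tracking_estimate(4, 75))  # 6
```

The contrast is the point: step scaling reacts with fixed increments that you tune by hand, while target tracking sizes the response to how far the metric is from the target.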
# Auto Scaling Group
resource "aws_autoscaling_group" "web" {
  name                = "web-asg"
  vpc_zone_identifier = var.private_subnet_ids
  target_group_arns   = [aws_lb_target_group.web.arn]
  health_check_type   = "ELB"
  min_size            = 2
  max_size            = 10
  desired_capacity    = 3

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
  }

  tag {
    key                 = "Name"
    value               = "web-server"
    propagate_at_launch = true
  }
}

# Target Tracking Policy
resource "aws_autoscaling_policy" "cpu" {
  name                   = "cpu-target-tracking"
  autoscaling_group_name = aws_autoscaling_group.web.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 50.0
  }
}

# Scale on Request Count
resource "aws_autoscaling_policy" "requests" {
  name                   = "request-target-tracking"
  autoscaling_group_name = aws_autoscaling_group.web.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ALBRequestCountPerTarget"
      resource_label         = "${aws_lb.main.arn_suffix}/${aws_lb_target_group.web.arn_suffix}"
    }
    target_value = 1000.0 # 1000 requests per target
  }
}
Cost Optimization
Compute Cost Strategies
┌────────────────────────────────────────────────────────────────────────┐
│ Compute Cost Optimization │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. RIGHT-SIZING (15-30% savings) │
│ ───────────────────────────────── │
│ • Use AWS Compute Optimizer recommendations │
│ • Monitor CloudWatch metrics (CPU, memory) │
│ • Downsize underutilized instances │
│ │
│ Example: │
│ m5.xlarge (15% CPU avg) → m5.large = 50% cost reduction │
│ │
│ 2. RESERVED + SAVINGS PLANS (30-72% savings) │
│ ───────────────────────────────────────────── │
│ • Reserved Instances for steady-state │
│ • Savings Plans for flexibility │
│ • Cover 60-80% of baseline with commitments │
│ │
│ 3. SPOT INSTANCES (60-90% savings) │
│ ───────────────────────────────── │
│ Use for: │
│ ✅ CI/CD workers │
│ ✅ Batch processing │
│ ✅ Dev/test environments │
│ ✅ Stateless web servers (behind ASG) │
│ │
│ 4. GRAVITON INSTANCES (up to 40% better price/perf) │
│ ──────────────────────────────────────────── │
│ • m7g, c7g, r7g instance families │
│ • Most applications work without changes │
│ │
│ 5. LAMBDA OPTIMIZATION │
│ ────────────────────── │
│ • Right-size memory (affects CPU) │
│ • Use Graviton (arm64) for up to 34% better price/perf │
│ • Minimize cold starts │
│ │
│ COMBINED STRATEGY EXAMPLE (illustrative prices): │
│ ────────────────────────── │
│ Original: 10x m5.xlarge On-Demand = $1,382/month │
│ │
│ Optimized (right-sized to m7g.large): │
│ • 4x m7g.large Reserved (40% baseline) = $230/month │
│ • 4x m7g.large Spot (40% variable) = $69/month │
│ • 2x m7g.large On-Demand (20% buffer) = $130/month │
│ Total: $429/month (69% savings!) │
│ │
└────────────────────────────────────────────────────────────────────────┘
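The combined example can be verified with a few lines of arithmetic. The monthly figures are copied from the diagram above and are illustrative, not current AWS prices; the `summarize` helper is a hypothetical name.

```python
ORIGINAL_MONTHLY = 1382  # 10x m5.xlarge On-Demand, from the example above

# Instance counts and monthly cost per purchase option, from the example above
optimized = {
    "reserved": {"count": 4, "monthly": 230},
    "spot": {"count": 4, "monthly": 69},
    "on_demand": {"count": 2, "monthly": 130},
}


def summarize(original: float, fleet: dict) -> tuple:
    """Return total monthly cost, savings percentage, and cost per instance."""
    total = sum(line["monthly"] for line in fleet.values())
    savings_pct = round((1 - total / original) * 100)
    per_instance = {name: line["monthly"] / line["count"] for name, line in fleet.items()}
    return total, savings_pct, per_instance


total, savings_pct, per_instance = summarize(ORIGINAL_MONTHLY, optimized)
print(total, savings_pct)  # 429 69
print(per_instance)        # {'reserved': 57.5, 'spot': 17.25, 'on_demand': 65.0}
```

Breaking the total down per instance is a quick sanity check on any cost plan: it shows which purchase option contributes most of the savings.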
🎯 Interview Questions
Q1: When would you choose Lambda over EC2?
Lambda is better when:
Event-driven, short-running tasks (< 15 min)
Unpredictable or spiky traffic
You want zero server management
Cost matters more than consistent latency
EC2 is better when:
Long-running processes
Need specific OS/hardware (GPUs)
Consistent, predictable traffic
Cost optimization with Reserved Instances
Need persistent connections (WebSockets)
Q2: How do you reduce Lambda cold starts?
Strategies:
Provisioned Concurrency - Pre-warm containers
Smaller packages - Reduce initialization time
Initialize outside handler - SDK clients, DB connections
Use lighter runtimes - Go, Python, Node.js
SnapStart for Java - Checkpoint/restore
Keep functions warm - Scheduled pings (not ideal)
Code pattern:
# Initialize OUTSIDE handler
import os
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE'])

def handler(event, context):
    # Only process logic here
    return table.get_item(...)
Q3: ECS vs EKS - when to use each?
Choose ECS when:
AWS-native, simpler setup
Smaller team without K8s expertise
Tighter AWS integration needed
Lower operational overhead
Choose EKS when:
Multi-cloud strategy
Team has Kubernetes expertise
Need K8s ecosystem (Helm, operators)
Complex microservices architectures
Portability is important
Cost Note: EKS adds $0.10/hour per cluster (~$72/month)
Q4: Design an auto-scaling strategy for an e-commerce site
Multi-layered approach:
Predictive Scaling: Scale ahead of recurring daily and weekly peaks
Scheduled Scaling: Add capacity before known one-off events (Black Friday)
Target Tracking: Maintain 50% CPU average
Step Scaling: Add extra capacity for sudden spikes
Configuration:
Minimum: 4 instances (2 per AZ)
Desired: 6 instances (normal load)
Maximum: 50 instances (peak capacity)
Policies:
Target tracking: CPU at 50%
Step: +4 if CPU > 80% for 2 min
Scheduled: Scale to 20 at 8 AM, 10 at 10 PM
Predictive: Enable for daily patterns
Cool-downs:
Scale-out: 60 seconds
Scale-in: 300 seconds (avoid thrashing)
Q5: How do you optimize EC2 costs?
Framework (in order of impact):
1. Right-size (15-30% savings)
   • Use Compute Optimizer
   • Monitor actual utilization
2. Purchase options (30-72% savings)
   • Reserved for baseline (60-70% of capacity)
   • Spot for stateless/batch
   • Savings Plans for flexibility
3. Instance selection (20-40% savings)
   • Graviton (ARM) for compatible workloads
   • Latest generation (m7 vs m5)
4. Shutdown automation
   • Stop dev/test outside hours
   • Auto-scaling to zero when possible
5. Regular review
   • Monthly cost reviews
   • Tag-based cost allocation
🧪 Hands-On Lab: Deploy Scalable Web App
Objective: Deploy a Node.js application with Auto Scaling and Load Balancing
1. Create Launch Template: Configure EC2 instance with user data for automatic setup
2. Create Auto Scaling Group: Set min=2, max=6, desired=3 across 2 AZs
3. Create Application Load Balancer: Configure health checks and target group
4. Configure Scaling Policies: Add target tracking (CPU 50%) and step scaling
5. Test Scaling: Generate load with ab or hey and watch scaling in action
Next Module
Storage & Databases: Master S3, EBS, RDS, DynamoDB, and ElastiCache