Module Overview
Estimated Time: 4-5 hours | Difficulty: Intermediate | Prerequisites: Core Concepts
What you'll learn:
- EC2 instance types, AMIs, and advanced configurations
- Lambda functions for serverless computing
- Container orchestration with ECS and EKS
- Auto Scaling strategies for elasticity
- Cost optimization techniques for compute
Compute Service Selection Guide
Choose the right compute service for your workload:
┌──────────────────────────────────────────────────────────────────────┐
│ AWS Compute Decision Tree │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ What type of workload? │
│ │ │
│ ├─── Short-lived, event-driven ──────► Lambda (Serverless) │
│ │ (< 15 min, stateless) │
│ │ │
│ ├─── Containers needed ─────┬────────► ECS (AWS Native) │
│ │ │ │
│ │ └────────► EKS (Kubernetes) │
│ │ │
│ ├─── Full control needed ────────────► EC2 (Virtual Servers) │
│ │ (OS, networking, GPUs) │
│ │ │
│ └─── Simple web app ─────────────────► Elastic Beanstalk │
│ (PaaS) or App Runner │
│ │
│ Control vs Simplicity Spectrum: │
│ ────────────────────────────── │
│ More Control ◄────────────────────────────────────► Less Management │
│ EC2 │ ECS/EKS │ Fargate │ Lambda │ App Runner │
│ │
└──────────────────────────────────────────────────────────────────────┘
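The decision tree above can be sketched as a small helper. The flag names are invented for illustration, and the rules are deliberate simplifications of the diagram:

```python
def suggest_compute_service(short_lived: bool = False,
                            needs_containers: bool = False,
                            prefers_kubernetes: bool = False,
                            needs_full_control: bool = False) -> str:
    """Walk the decision tree above, top to bottom."""
    if short_lived:                  # < 15 min, stateless, event-driven
        return "Lambda"
    if needs_containers:
        return "EKS" if prefers_kubernetes else "ECS"
    if needs_full_control:           # OS access, custom networking, GPUs
        return "EC2"
    return "Elastic Beanstalk or App Runner"  # simple web app (PaaS)

print(suggest_compute_service(short_lived=True))       # Lambda
print(suggest_compute_service(needs_containers=True))  # ECS
```

Real workloads rarely reduce to four booleans, but the ordering matters: control requirements (EC2) should only win after the cheaper-to-operate options are ruled out.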
EC2 (Elastic Compute Cloud)
Virtual servers in the cloud. The most fundamental and flexible AWS compute service.
Instance Type Deep Dive
AWS offers 500+ instance types optimized for different workloads:
┌──────────────────────────────────────────────────────────────────────┐
│ EC2 Instance Families │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ GENERAL PURPOSE (M, T) │
│ ────────────────────── │
│ M-series: Balanced compute, memory, networking │
│ • m5.large 2 vCPU, 8 GB → Web servers, small databases │
│ • m5.xlarge 4 vCPU, 16 GB → Application servers │
│ • m5.4xlarge 16 vCPU, 64 GB → Medium workloads │
│ • m7g.* Graviton3 (ARM) → 40% better price/performance │
│ │
│ T-series: Burstable performance (for variable workloads) │
│ • t3.micro 2 vCPU, 1 GB → Free tier, dev/test │
│ • t3.medium 2 vCPU, 4 GB → Light production │
│ • t3.xlarge 4 vCPU, 16 GB → Moderate workloads │
│ │
│ COMPUTE OPTIMIZED (C) │
│ ───────────────────── │
│ High CPU-to-memory ratio for compute-intensive tasks │
│ • c5.large 2 vCPU, 4 GB → Batch processing │
│ • c5.4xlarge 16 vCPU, 32 GB → Scientific computing │
│ • c7g.* Graviton3 → Best compute price/perf │
│ │
│ MEMORY OPTIMIZED (R, X) │
│ ─────────────────────── │
│ High memory-to-CPU ratio for in-memory workloads │
│ • r5.large 2 vCPU, 16 GB → Caching, in-memory DB │
│ • r5.4xlarge 16 vCPU,128 GB → SAP HANA, Redis │
│ • x1e.xlarge 4 vCPU,122 GB → Extreme memory │
│ │
│ STORAGE OPTIMIZED (I, D) │
│ ──────────────────────── │
│ High sequential read/write access to large datasets │
│ • i3.large 2 vCPU, 15 GB, 475 GB NVMe → Databases │
│ • d2.xlarge 4 vCPU, 31 GB, 6 TB HDD → Data warehousing │
│ │
│ ACCELERATED COMPUTING (P, G, Inf) │
│ ───────────────────────────────── │
│ GPU and custom hardware for ML/graphics │
│ • p4d.24xlarge 8x A100 GPUs → ML training │
│ • g4dn.xlarge 1x T4 GPU → ML inference, graphics │
│ • inf1.xlarge 4x Inferentia→ Cost-effective inference │
│ │
└──────────────────────────────────────────────────────────────────────┘
Instance Naming Convention
# Decoding instance type names
def decode_instance_type(instance_type: str) -> dict:
    """
    Example: m5dn.2xlarge
      m       = Instance family (General Purpose)
      5       = Generation (5th gen; higher = newer)
      d       = Additional capability (NVMe SSD)
      n       = Network optimized
      2xlarge = Size (vCPUs and memory)
    Size progression:
      nano → micro → small → medium → large → xlarge → 2xlarge → ... → metal
    """
    families = {
        'm': 'General Purpose',
        't': 'Burstable',
        'c': 'Compute Optimized',
        'r': 'Memory Optimized',
        'x': 'Memory Optimized (Extreme)',
        'i': 'Storage Optimized (NVMe)',
        'd': 'Storage Optimized (Dense)',
        'p': 'GPU (Training)',
        'g': 'GPU (Graphics/Inference)',
    }
    modifiers = {
        'a': 'AMD processor',
        'g': 'AWS Graviton (ARM)',
        'd': 'NVMe SSD storage',
        'n': 'Network optimized',
        'e': 'Extended memory',
        'z': 'High frequency',
    }
    prefix, _, size = instance_type.partition('.')
    return {
        'family': families.get(prefix[0], 'Unknown'),
        'generation': prefix[1] if len(prefix) > 1 else '?',
        'modifiers': [modifiers[ch] for ch in prefix[2:] if ch in modifiers],
        'size': size,
    }

print(decode_instance_type('m5dn.2xlarge'))
# {'family': 'General Purpose', 'generation': '5',
#  'modifiers': ['NVMe SSD storage', 'Network optimized'], 'size': '2xlarge'}
Pro Tip: Use Graviton (ARM) instances (m7g, c7g, r7g) for 40% better price/performance on compatible workloads. Most applications work without modification.
T-Series Burstable Instances
T-series instances use CPU credits for burstable performance:
┌────────────────────────────────────────────────────────────────────┐
│ T3 CPU Credit System │
├────────────────────────────────────────────────────────────────────┤
│ │
│ Baseline Performance: │
│ • t3.micro: 10% CPU baseline (earns 12 credits/hour) │
│ • t3.small: 20% CPU baseline (earns 24 credits/hour) │
│ • t3.medium: 20% CPU baseline (earns 24 credits/hour) │
│ • t3.large: 30% CPU baseline (earns 36 credits/hour) │
│ │
│ Credit Usage: │
│ • 1 credit = 1 vCPU at 100% for 1 minute │
│ • Below baseline: Earn credits │
│ • Above baseline: Spend credits │
│ • Credits expire after 24 hours │
│ • Max credit balance: varies by instance size │
│ │
│ CPU Usage Graph: │
│ 100% ┤ ████ │
│ 80% ┤ █ █ Burst period │
│ 60% ┤ █ █ (spending credits) │
│ 40% ┤ █ █ │
│ 20% ┤──────█──────────█────── ← Baseline │
│ 0% ┤ │
│ └─────────────────────────► Time │
│ │
│ Modes: │
│ • Standard: Can burst only with credits │
│ • Unlimited: Can burst beyond credits (pay extra) │
│ │
└────────────────────────────────────────────────────────────────────┘
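The credit mechanics above can be modeled in a few lines. This is a rough sketch of standard mode: it ignores launch credits and the fact that an instance at zero credits is throttled to baseline rather than going negative:

```python
def simulate_t3_credits(hourly_cpu_pct, baseline_pct=20.0, vcpus=2,
                        start_credits=60.0, max_credits=576.0):
    """Rough model of a standard-mode T3 credit balance, hour by hour.

    1 credit = 1 vCPU at 100% for 1 minute, so per hour the instance earns
    baseline_pct% x vcpus x 60 credits and spends cpu% x vcpus x 60.
    """
    balance, history = start_credits, []
    for pct in hourly_cpu_pct:
        earned = baseline_pct / 100 * vcpus * 60
        spent = pct / 100 * vcpus * 60
        balance = min(max_credits, max(0.0, balance + earned - spent))
        history.append(round(balance, 1))
    return history

# Three idle hours bank credits; two burst hours at 80% CPU drain them
print(simulate_t3_credits([5, 5, 5, 80, 80, 5]))
# → [78.0, 96.0, 114.0, 42.0, 0.0, 18.0]
```

Note how the balance hits zero in the second burst hour: on a real standard-mode instance this is the point where CPU is clamped back to the 20% baseline.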
AMI (Amazon Machine Image)
AMIs are templates containing OS, application server, and applications.
# AMI Selection Best Practices
ami_best_practices = {
"use_aws_provided": [
"Amazon Linux 2023", # AWS optimized, free
"Ubuntu 22.04 LTS", # Popular, well-supported
"Windows Server 2022", # For .NET workloads
],
"create_custom_ami_when": [
"Need pre-installed software",
"Custom security hardening",
"Faster instance boot time",
"Consistent deployments",
],
"golden_ami_pipeline": """
Base AMI → Install packages → Configure → Test → Create AMI → Share
Automate with:
- EC2 Image Builder (AWS native)
- Packer (HashiCorp)
""",
}
# Launch EC2 with specific AMI
import boto3
ec2 = boto3.client('ec2')
response = ec2.run_instances(
ImageId='ami-0c55b159cbfafe1f0',  # Example AMI ID (AMI IDs are region-specific)
InstanceType='t3.medium',
MinCount=1,
MaxCount=1,
KeyName='my-key-pair',
SecurityGroupIds=['sg-0123456789abcdef0'],
SubnetId='subnet-0123456789abcdef0',
TagSpecifications=[
{
'ResourceType': 'instance',
'Tags': [
{'Key': 'Name', 'Value': 'WebServer'},
{'Key': 'Environment', 'Value': 'Production'},
]
}
],
UserData='''#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello from $(hostname)</h1>" > /var/www/html/index.html
'''
)
EC2 Instance Metadata Service (IMDS)
Access instance info from within the instance:
# IMDSv2 (recommended - more secure)
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" \
-H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
# Get instance metadata
curl -H "X-aws-ec2-metadata-token: $TOKEN" \
http://169.254.169.254/latest/meta-data/instance-id
curl -H "X-aws-ec2-metadata-token: $TOKEN" \
http://169.254.169.254/latest/meta-data/public-ipv4
curl -H "X-aws-ec2-metadata-token: $TOKEN" \
http://169.254.169.254/latest/meta-data/iam/security-credentials/MyRole
# Python - Using requests with IMDSv2
import requests
def get_instance_metadata(path: str) -> str:
"""Get EC2 instance metadata using IMDSv2."""
# Get token
token_response = requests.put(
"http://169.254.169.254/latest/api/token",
headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"}
)
token = token_response.text
# Get metadata
response = requests.get(
f"http://169.254.169.254/latest/meta-data/{path}",
headers={"X-aws-ec2-metadata-token": token}
)
return response.text
# Usage
instance_id = get_instance_metadata("instance-id")
public_ip = get_instance_metadata("public-ipv4")
az = get_instance_metadata("placement/availability-zone")
Lambda (Serverless Functions)
Run code without provisioning servers. Pay only for compute time used.
Lambda Architecture
┌────────────────────────────────────────────────────────────────────────┐
│ Lambda Execution Model │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ Request arrives │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Lambda Service │ │
│ │ │ │
│ │ Is there a warm container? │ │
│ │ │ │ │
│ │ ├── YES ──► Reuse container (warm start: ~1ms) │ │
│ │ │ │ │
│ │ └── NO ───► Create container (cold start: 100ms-10s) │ │
│ │ │ │ │
│ │ ├── Download code from S3 │ │
│ │ ├── Start runtime (Python, Node, etc.) │ │
│ │ ├── Run initialization code │ │
│ │ └── Execute handler │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ Cold Start Factors: │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Runtime │ Cold Start │ Notes │ │
│ │─────────────────────────────────────────────────────────────────│ │
│ │ Python │ 100-300ms │ Fast, great for most use cases │ │
│ │ Node.js │ 100-300ms │ Fast, good for APIs │ │
│ │ Go │ 50-100ms │ Fastest cold starts │ │
│ │ Java │ 3-10s │ Slow, use SnapStart or GraalVM │ │
│ │ .NET │ 2-5s │ Slower, consider ReadyToRun │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ Mitigation Strategies: │
│ • Provisioned Concurrency (pre-warm containers) │
│ • Smaller deployment packages │
│ • Move initialization outside handler │
│ • Use SnapStart for Java │
│ │
└────────────────────────────────────────────────────────────────────────┘
Lambda Function Best Practices
import json
import boto3
import os
from datetime import datetime
# ✅ Initialize outside handler (runs once per container)
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])
def lambda_handler(event, context):
"""
Best Practices:
1. Keep handlers small and focused
2. Initialize SDK clients outside handler
3. Use environment variables for configuration
4. Handle errors gracefully
5. Log structured data for observability
"""
# Log incoming event (structured logging)
print(json.dumps({
'level': 'INFO',
'message': 'Processing request',
'request_id': context.aws_request_id,
'event_type': event.get('httpMethod', 'unknown'),
'timestamp': datetime.utcnow().isoformat()
}))
try:
# Parse input
if 'body' in event and event['body']:
body = json.loads(event['body'])
else:
body = event
# Process request
user_id = body.get('user_id')
if not user_id:
return response(400, {'error': 'user_id is required'})
# Database operation
result = table.get_item(Key={'user_id': user_id})
if 'Item' not in result:
return response(404, {'error': 'User not found'})
return response(200, result['Item'])
except json.JSONDecodeError:
return response(400, {'error': 'Invalid JSON'})
except Exception as e:
# Log error for debugging
print(json.dumps({
'level': 'ERROR',
'message': str(e),
'request_id': context.aws_request_id
}))
return response(500, {'error': 'Internal server error'})
def response(status_code: int, body: dict) -> dict:
"""Create API Gateway compatible response."""
return {
'statusCode': status_code,
'headers': {
'Content-Type': 'application/json',
'Access-Control-Allow-Origin': '*'
},
'body': json.dumps(body)
}
Lambda Limits and Quotas
| Resource | Limit | Notes |
|---|---|---|
| Timeout | 15 minutes | Use Step Functions for longer workflows |
| Memory | 128 MB - 10 GB | More memory = more CPU proportionally |
| Package Size | 50 MB (zip), 250 MB (unzipped) | Use layers for dependencies |
| Concurrent Executions | 1,000 (default) | Can request increase |
| Payload Size | 6 MB (sync), 256 KB (async) | Use S3 for larger payloads |
| Ephemeral Storage | 512 MB - 10 GB (/tmp) | For temporary files |
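The 6 MB synchronous payload limit is commonly worked around by staging large responses in S3 and returning a presigned URL instead. A sketch, where the bucket name and key are placeholders:

```python
import json

PAYLOAD_LIMIT = 6 * 1024 * 1024  # Lambda's 6 MB synchronous response limit

def respond(body: dict, bucket: str = "my-results-bucket") -> dict:
    """Return small payloads inline; stage large ones in S3 behind a presigned URL.

    The bucket name is a placeholder; the S3 branch assumes the function's
    execution role can write to it.
    """
    data = json.dumps(body)
    if len(data.encode()) < PAYLOAD_LIMIT:
        return {"statusCode": 200, "body": data}
    import boto3  # imported lazily so the small-payload path has no AWS dependency
    s3 = boto3.client("s3")
    key = "results/response.json"
    s3.put_object(Bucket=bucket, Key=key, Body=data)
    url = s3.generate_presigned_url(
        "get_object", Params={"Bucket": bucket, "Key": key}, ExpiresIn=3600)
    return {"statusCode": 303, "headers": {"Location": url}, "body": ""}
```

The same pattern applies on the input side: callers upload large requests to S3 and pass the object key in the (small) event payload.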
Lambda with Container Images
# Dockerfile for Lambda container
FROM public.ecr.aws/lambda/python:3.11
# Install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
# Copy function code
COPY app.py ${LAMBDA_TASK_ROOT}
# Set the handler
CMD [ "app.lambda_handler" ]
# Build and push to ECR
aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_URI
docker build -t my-lambda .
docker tag my-lambda:latest $ECR_URI/my-lambda:latest
docker push $ECR_URI/my-lambda:latest
ECS (Elastic Container Service)
AWS-native container orchestration for Docker containers.
ECS Architecture Deep Dive
┌────────────────────────────────────────────────────────────────────────┐
│ ECS Architecture │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ ECS Cluster │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ Service │ │ │
│ │ │ Desired Count: 3 Running: 3 Pending: 0 │ │ │
│ │ │ │ │ │
│ │ │ Task Definition: my-app:5 │ │ │
│ │ │ • Container: nginx (256 CPU, 512 MB) │ │ │
│ │ │ • Container: app (512 CPU, 1024 MB) │ │ │
│ │ │ │ │ │
│ │ │ ┌──────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Load Balancer │ │ │ │
│ │ │ │ (ALB with Target Group) │ │ │ │
│ │ │ └─────────────────┬────────────────────────────────┘ │ │ │
│ │ │ ┌───────────┼───────────┐ │ │ │
│ │ │ │ │ │ │ │ │
│ │ │ ▼ ▼ ▼ │ │ │
│ │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ │
│ │ │ │ Task 1 │ │ Task 2 │ │ Task 3 │ │ │ │
│ │ │ │ (AZ-1a) │ │ (AZ-1b) │ │ (AZ-1c) │ │ │ │
│ │ │ └──────────┘ └──────────┘ └──────────┘ │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Launch Types: │ │
│ │ ┌────────────────────────┐ ┌────────────────────────┐ │ │
│ │ │ EC2 │ │ Fargate │ │ │
│ │ │ ──────────────────── │ │ ──────────────────── │ │ │
│ │ │ • You manage EC2 │ │ • Serverless │ │ │
│ │ │ • More control │ │ • No EC2 management │ │ │
│ │ │ • Use Reserved/Spot │ │ • Pay per task │ │ │
│ │ │ • GPU workloads │ │ • Faster scaling │ │ │
│ │ └────────────────────────┘ └────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
Task Definition Example
{
"family": "my-web-app",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
"containerDefinitions": [
{
"name": "web",
"image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
"essential": true,
"portMappings": [
{
"containerPort": 8080,
"protocol": "tcp"
}
],
"environment": [
{"name": "NODE_ENV", "value": "production"}
],
"secrets": [
{
"name": "DB_PASSWORD",
"valueFrom": "arn:aws:secretsmanager:us-east-1:123:secret:db-password"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/my-web-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3
}
}
]
}
Auto Scaling
Automatically adjust compute capacity to match demand.
Auto Scaling Strategies
┌────────────────────────────────────────────────────────────────────────┐
│ Auto Scaling Strategies │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. TARGET TRACKING (Recommended) │
│ ────────────────────────────── │
│ "Keep CPU at 50%" │
│ │
│ CPU Usage │
│ 80% ┤ ████ │
│ 60% ┤ █ █ │
│ 50% ┼─────────────────█──────█──────── Target │
│ 40% ┤ █ █ │
│ 20% ┤███████████████ ████ │
│ └────────────────────────────────► Time │
│ │
│ Automatically adds/removes instances to maintain target │
│ │
│ 2. STEP SCALING │
│ ─────────────── │
│ "If CPU > 80%, add 3. If CPU > 60%, add 1." │
│ │
│ Alarm Threshold │ Scaling Action │
│ ────────────────┼──────────────── │
│ CPU < 30% │ Remove 2 instances │
│ CPU 30-50% │ Do nothing │
│ CPU 50-70% │ Add 1 instance │
│ CPU 70-90% │ Add 2 instances │
│ CPU > 90% │ Add 4 instances │
│ │
│ 3. SCHEDULED SCALING │
│ ──────────────────── │
│ "Scale to 10 instances at 9 AM, scale to 3 at 6 PM" │
│ │
│ Instances │
│ 10 ┤ ██████████████████ │
│ 8 ┤ █ █ │
│ 6 ┤ █ █ │
│ 3 ┼███ ███████ │
│ └──────────────────────────────────► Time │
│ 6AM 9AM 5PM 8PM │
│ │
│ 4. PREDICTIVE SCALING │
│ ───────────────────── │
│ Uses ML to predict demand and scale proactively │
│ Best for cyclical patterns (daily, weekly) │
│ │
└────────────────────────────────────────────────────────────────────────┘
Auto Scaling Configuration (Terraform)
# Auto Scaling Group
resource "aws_autoscaling_group" "web" {
name = "web-asg"
vpc_zone_identifier = var.private_subnet_ids
target_group_arns = [aws_lb_target_group.web.arn]
health_check_type = "ELB"
min_size = 2
max_size = 10
desired_capacity = 3
launch_template {
id = aws_launch_template.web.id
version = "$Latest"
}
instance_refresh {
strategy = "Rolling"
preferences {
min_healthy_percentage = 50
}
}
tag {
key = "Name"
value = "web-server"
propagate_at_launch = true
}
}
# Target Tracking Policy
resource "aws_autoscaling_policy" "cpu" {
name = "cpu-target-tracking"
autoscaling_group_name = aws_autoscaling_group.web.name
policy_type = "TargetTrackingScaling"
target_tracking_configuration {
predefined_metric_specification {
predefined_metric_type = "ASGAverageCPUUtilization"
}
target_value = 50.0
}
}
# Scale on Request Count
resource "aws_autoscaling_policy" "requests" {
name = "request-target-tracking"
autoscaling_group_name = aws_autoscaling_group.web.name
policy_type = "TargetTrackingScaling"
target_tracking_configuration {
predefined_metric_specification {
predefined_metric_type = "ALBRequestCountPerTarget"
resource_label = "${aws_lb.main.arn_suffix}/${aws_lb_target_group.web.arn_suffix}"
}
target_value = 1000.0 # 1000 requests per target
}
}
Cost Optimization
Compute Cost Strategies
┌────────────────────────────────────────────────────────────────────────┐
│ Compute Cost Optimization │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. RIGHT-SIZING (15-30% savings) │
│ ───────────────────────────────── │
│ • Use AWS Compute Optimizer recommendations │
│ • Monitor CloudWatch metrics (CPU, memory) │
│ • Downsize underutilized instances │
│ │
│ Example: │
│ m5.xlarge (15% CPU avg) → m5.large = 50% cost reduction │
│ │
│ 2. RESERVED + SAVINGS PLANS (30-72% savings) │
│ ───────────────────────────────────────────── │
│ • Reserved Instances for steady-state │
│ • Savings Plans for flexibility │
│ • Cover 60-80% of baseline with commitments │
│ │
│ 3. SPOT INSTANCES (60-90% savings) │
│ ───────────────────────────────── │
│ Use for: │
│ ✅ CI/CD workers │
│ ✅ Batch processing │
│ ✅ Dev/test environments │
│ ✅ Stateless web servers (behind ASG) │
│ │
│ 4. GRAVITON INSTANCES (40% better price/perf) │
│ ──────────────────────────────────────────── │
│ • m7g, c7g, r7g instance families │
│ • Most applications work without changes │
│ │
│ 5. LAMBDA OPTIMIZATION │
│ ────────────────────── │
│ • Right-size memory (affects CPU) │
│ • Use Graviton (arm64) for 34% lower cost │
│ • Minimize cold starts │
│ │
│ COMBINED STRATEGY EXAMPLE: │
│ ────────────────────────── │
│ Original: 10x m5.xlarge On-Demand = $1,382/month │
│ │
│ Optimized: │
│ • 4x m7g.large Reserved (60% base) = $230/month │
│ • 4x m7g.large Spot (30% variable) = $69/month │
│ • 2x m7g.large On-Demand (10% buffer) = $130/month │
│ Total: $429/month (69% savings!) │
│ │
└────────────────────────────────────────────────────────────────────────┘
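The combined-strategy arithmetic generalizes to a small helper. The discount figures below are illustrative assumptions, not quoted AWS prices:

```python
def blended_monthly_cost(on_demand_monthly: float, mix: dict) -> float:
    """Blend a fleet's monthly cost across purchase options.

    mix maps option name -> (share_of_fleet, discount_vs_on_demand).
    """
    return sum(on_demand_monthly * share * (1 - discount)
               for share, discount in mix.values())

# Rework the combined-strategy example roughly: assume moving the fleet to
# Graviton first cuts the ~$1,382/month On-Demand bill by ~40%, then split
# the fleet across purchase options (discounts are illustrative).
graviton_on_demand = 1382 * 0.6
cost = blended_monthly_cost(graviton_on_demand, {
    "reserved":  (0.6, 0.55),   # 60% of fleet, ~55% off
    "spot":      (0.3, 0.72),   # 30% of fleet, ~72% off
    "on_demand": (0.1, 0.00),   # 10% buffer at full price
})
print(f"~${cost:,.0f}/month vs $1,382/month On-Demand "
      f"({1 - cost / 1382:.0%} savings)")
```

Plugging in your own On-Demand baseline and negotiated discounts gives a quick sanity check before committing to Reserved Instances or Savings Plans.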
🎯 Interview Questions
Q1: When would you choose Lambda over EC2?
Lambda is better when:
- Event-driven, short-running tasks (< 15 min)
- Unpredictable or spiky traffic
- You want zero server management
- Cost matters more than consistent latency

EC2 is better when:
- Long-running processes
- Need specific OS/hardware (GPUs)
- Consistent, predictable traffic
- Cost optimization with Reserved Instances
- Need persistent connections (WebSockets)
Q2: How do you reduce Lambda cold starts?
Strategies:
- Provisioned Concurrency - Pre-warm containers
- Smaller packages - Reduce initialization time
- Initialize outside handler - SDK clients, DB connections
- Use lighter runtimes - Go, Python, Node.js
- SnapStart for Java - Checkpoint/restore
- Keep functions warm - Scheduled pings (not ideal)
# Initialize OUTSIDE handler
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE'])
def handler(event, context):
# Only process logic here
return table.get_item(...)
Q3: ECS vs EKS - when to use each?
Choose ECS when:
- AWS-native, simpler setup
- Smaller team without K8s expertise
- Tighter AWS integration needed
- Lower operational overhead

Choose EKS when:
- Multi-cloud strategy
- Team has Kubernetes expertise
- Need K8s ecosystem (Helm, operators)
- Complex microservices architectures
- Portability is important
Q4: Design an auto-scaling strategy for an e-commerce site
Multi-layered approach:

Policies:
- Predictive Scaling: Scale up before known peaks (Black Friday); enable for daily patterns
- Target Tracking: Maintain 50% CPU average
- Step Scaling: Add 4 instances if CPU > 80% for 2 min (handles sudden spikes)
- Scheduled: Scale to 20 at 8 AM, 10 at 10 PM

Capacity:
- Minimum: 4 instances (2 per AZ)
- Desired: 6 instances (normal load)
- Maximum: 50 instances (peak capacity)

Cooldowns:
- Scale-out: 60 seconds
- Scale-in: 300 seconds (avoid thrashing)
Q5: How do you optimize EC2 costs?
Framework (in order of impact):

1. Right-size (15-30% savings)
   - Use Compute Optimizer
   - Monitor actual utilization
2. Purchase options (30-72% savings)
   - Reserved for baseline (60-70% of capacity)
   - Spot for stateless/batch
   - Savings Plans for flexibility
3. Instance selection (20-40% savings)
   - Graviton (ARM) for compatible workloads
   - Latest generation (m7 vs m5)
4. Shutdown automation
   - Stop dev/test outside hours
   - Auto-scaling to zero when possible
5. Regular review
   - Monthly cost reviews
   - Tag-based cost allocation
🧪 Hands-On Lab: Deploy Scalable Web App
Objective: Deploy a Node.js application with Auto Scaling and Load Balancing

1. Create Launch Template: Configure EC2 instance with user data for automatic setup
2. Create Auto Scaling Group: Set min=2, max=6, desired=3 across 2 AZs
3. Create Application Load Balancer: Configure health checks and target group
4. Configure Scaling Policies: Add target tracking (CPU 50%) and step scaling
5. Test Scaling: Generate load with ab or hey and watch scaling in action

Next Module
Storage & Databases
Master S3, EBS, RDS, DynamoDB, and ElastiCache