
Module Overview

Estimated Time: 5-6 hours | Difficulty: Intermediate | Prerequisites: Core Concepts, Compute
This module covers the core AWS storage and database services: data modeling, performance optimization, cost management, and when to use each service.

What You’ll Learn:
  • S3 storage classes, lifecycle policies, and security
  • EBS volumes and snapshots for EC2
  • EFS for shared file systems
  • RDS and Aurora for relational databases
  • DynamoDB for NoSQL at scale
  • ElastiCache for sub-millisecond response times
  • Data migration strategies

Storage Service Selection Guide

┌────────────────────────────────────────────────────────────────────────┐
│                  AWS Storage Decision Tree                              │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   What type of data access pattern?                                     │
│         │                                                               │
│         ├─── Random access to blocks ──────────► EBS (EC2 volumes)     │
│         │    (Database, boot volumes)               │                   │
│         │                                           └── gp3: General   │
│         │                                           └── io2: High IOPS │
│         │                                                               │
│         ├─── Object/file storage ──────────────► S3 (Object Storage)   │
│         │    (Any size, any format)                 │                   │
│         │                                           └── Standard: Hot  │
│         │                                           └── IA: Warm       │
│         │                                           └── Glacier: Cold  │
│         │                                                               │
│         ├─── Shared file system (Linux) ───────► EFS (NFS)             │
│         │    (Multiple EC2, containers)                                 │
│         │                                                               │
│         └─── Shared file system (Windows) ─────► FSx for Windows       │
│              (AD integration, SMB)                                      │
│                                                                         │
│   DATABASE Decision:                                                    │
│   ──────────────────                                                    │
│         │                                                               │
│         ├─── Relational, complex queries ──────► RDS / Aurora          │
│         │    (Joins, transactions, ACID)                                │
│         │                                                               │
│         ├─── Key-value, high scale ────────────► DynamoDB              │
│         │    (Single-digit ms, serverless)                              │
│         │                                                               │
│         ├─── Document store ───────────────────► DocumentDB (MongoDB)  │
│         │                                                               │
│         ├─── Graph relationships ──────────────► Neptune              │
│         │                                                               │
│         └─── Caching layer ────────────────────► ElastiCache           │
│              (Redis/Memcached)                                          │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
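The decision tree above can be read as a plain lookup from access pattern to service. A minimal sketch: the pattern labels and the `pick_service` name are hypothetical, chosen only for illustration, while the mapping itself comes from the tree.

```python
# Hypothetical labels for the branches of the decision tree above.
STORAGE_CHOICES = {
    "block": "EBS",
    "object": "S3",
    "shared-linux": "EFS",
    "shared-windows": "FSx for Windows",
}
DATABASE_CHOICES = {
    "relational": "RDS / Aurora",
    "key-value": "DynamoDB",
    "document": "DocumentDB",
    "graph": "Neptune",
    "cache": "ElastiCache",
}


def pick_service(pattern: str) -> str:
    """Return the service the decision tree suggests for a pattern label."""
    choices = {**STORAGE_CHOICES, **DATABASE_CHOICES}
    if pattern not in choices:
        raise ValueError(f"Unknown pattern {pattern!r}; expected one of {sorted(choices)}")
    return choices[pattern]


print(pick_service("shared-linux"))  # EFS
print(pick_service("key-value"))     # DynamoDB
```

Real workloads often combine several branches (for example S3 for assets plus DynamoDB for metadata), so treat the lookup as a starting point rather than a rule.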

Storage Types Comparison

┌───────────────────────────────────────────────────────────────────────┐
│                      AWS Storage Types                                 │
├───────────────────────────────────────────────────────────────────────┤
│                                                                        │
│   BLOCK STORAGE           OBJECT STORAGE          FILE STORAGE         │
│   ─────────────           ─────────────           ─────────────        │
│   ┌─────────────┐         ┌─────────────┐         ┌─────────────┐      │
│   │    EBS      │         │     S3      │         │    EFS      │      │
│   │  (Volumes)  │         │  (Buckets)  │         │   (NFS)     │      │
│   └─────────────┘         └─────────────┘         └─────────────┘      │
│         │                       │                       │              │
│   Characteristics:         Characteristics:       Characteristics:     │
│   • Like a hard disk       • Like Dropbox         • Like network share │
│   • Attach to 1 EC2        • HTTP access          • Multiple EC2s      │
│   • Single AZ              • Global access        • Multi-AZ           │
│   • Low latency            • Unlimited size       • POSIX compliant    │
│   • Boot volumes           • 11 9s durability     • Auto-scaling       │
│                                                                        │
│   Use Cases:               Use Cases:             Use Cases:           │
│   • Databases              • Static websites      • Content management │
│   • Enterprise apps        • Data lakes           • Web serving        │
│   • Boot/root volumes      • Backups/archives     • Container storage  │
│   • High-perf workloads    • Media hosting        • Dev environments   │
│                                                                        │
│   Durability & Performance:                                            │
│   ─────────────────────────                                            │
│   EBS: 99.8-99.9% (io2: 99.999%), single AZ, single-digit ms          │
│   S3:  99.999999999% (11 9s), ~100ms latency                          │
│   EFS: 99.999999999%, few ms latency                                  │
│                                                                        │
└───────────────────────────────────────────────────────────────────────┘
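To make the durability figures concrete, a durability percentage can be converted into an expected number of objects (or volumes) lost per year. This is a rough sketch that assumes independent losses and treats the published figure as an annual per-item probability; the function name is illustrative.

```python
def expected_annual_losses(item_count: int, nines: int) -> float:
    """Expected items lost per year at a durability of `nines` nines.

    nines=11 corresponds to 99.999999999% annual durability per item.
    """
    annual_loss_probability = 10.0 ** -nines
    return item_count * annual_loss_probability


# 10 million objects stored at 11 nines of durability
losses = expected_annual_losses(10_000_000, 11)
print(f"{losses:.6f} objects/year")                       # 0.000100 objects/year
print(f"about one object every {1 / losses:,.0f} years")  # about one object every 10,000 years
```

The same arithmetic at three nines (99.9%) gives roughly one loss per year across 1,000 items, which is why snapshots matter for block storage.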

S3 (Simple Storage Service)

Object storage with unlimited capacity. The backbone of AWS storage and data lakes.

S3 Architecture Deep Dive

┌──────────────────────────────────────────────────────────────────────┐
│                         S3 Architecture                               │
├──────────────────────────────────────────────────────────────────────┤
│                                                                       │
│   Bucket: my-company-data-prod (globally unique name)                │
│   Region: us-east-1                                                   │
│   ─────────────────────────────────────────────────────────────────  │
│                                                                       │
│   Object Structure:                                                   │
│   ┌─────────────────────────────────────────────────────────────┐    │
│   │  Key: images/2024/01/photo.jpg                               │    │
│   │  ┌────────────────────────────────────────────────────────┐ │    │
│   │  │  Value: [binary data up to 5 TB]                        │ │    │
│   │  │                                                         │ │    │
│   │  │  Metadata:                                              │ │    │
│   │  │  • System: Content-Type, Last-Modified, ETag            │ │    │
│   │  │  • User-defined: x-amz-meta-*                           │ │    │
│   │  │                                                         │ │    │
│   │  │  Version ID: 3sL4kqtJlcpXroDTDmJ+rmSpXd3dIbrHY+MTRCxf3  │ │    │
│   │  └────────────────────────────────────────────────────────┘ │    │
│   └─────────────────────────────────────────────────────────────┘    │
│                                                                       │
│   Limits & Capabilities:                                              │
│   • Max object size: 5 TB                                            │
│   • Max single PUT: 5 GB (use multipart for larger)                  │
│   • Max parts in multipart: 10,000                                   │
│   • Recommended multipart for > 100 MB                               │
│   • Unlimited objects per bucket                                      │
│   • Bucket names: 3-63 chars, globally unique                        │
│                                                                       │
│   Data Consistency (as of Dec 2020):                                 │
│   • Strong read-after-write consistency for all operations           │
│   • No eventual consistency delays anymore!                          │
│                                                                       │
└──────────────────────────────────────────────────────────────────────┘
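The multipart limits above interact: a part size that is too small would need more than 10,000 parts for a large object. A minimal sketch of choosing a part size, assuming the 5 MiB minimum part size (an S3 limit not listed in the diagram) and a hypothetical helper name `plan_multipart`:

```python
import math

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB          # S3 minimum for every part except the last (assumption, see lead-in)
MAX_PARTS = 10_000               # from the limits above
MAX_OBJECT_SIZE = 5 * 1024**4    # 5 TB, from the limits above


def plan_multipart(object_size: int, preferred_part_size: int = 8 * MIB) -> tuple[int, int]:
    """Return (part_size, part_count) that stays within the multipart limits."""
    if not 0 < object_size <= MAX_OBJECT_SIZE:
        raise ValueError("object_size must be between 1 byte and 5 TB")
    part_size = max(preferred_part_size, MIN_PART_SIZE)
    # Grow the part size (in whole MiB) until the object fits in 10,000 parts
    smallest_allowed = math.ceil(object_size / MAX_PARTS)
    if part_size < smallest_allowed:
        part_size = math.ceil(smallest_allowed / MIB) * MIB
    return part_size, math.ceil(object_size / part_size)


size, count = plan_multipart(100 * MIB)
print(size // MIB, count)  # 8 13
```

For a 100 MiB object the preferred 8 MiB parts are fine; for objects near the 5 TB limit the helper raises the part size so the count stays at or below 10,000.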

S3 Storage Classes Deep Dive

┌────────────────────────────────────────────────────────────────────────┐
│                     S3 Storage Classes                                  │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Storage Class     │ Access         │ Min Storage │ Retrieval │ Cost  │
│   ─────────────────────────────────────────────────────────────────────│
│   S3 Standard       │ Milliseconds   │ None        │ None      │ $$$$  │
│   ─────────────────────────────────────────────────────────────────────│
│   S3 Standard-IA    │ Milliseconds   │ 30 days     │ Per GB    │ $$    │
│   ─────────────────────────────────────────────────────────────────────│
│   S3 One Zone-IA    │ Milliseconds   │ 30 days     │ Per GB    │ $     │
│   ─────────────────────────────────────────────────────────────────────│
│   S3 Intelligent    │ Milliseconds   │ None        │ None      │ Auto  │
│   Tiering           │                │             │ monitoring│       │
│   ─────────────────────────────────────────────────────────────────────│
│   Glacier Instant   │ Milliseconds   │ 90 days     │ Per GB    │ ¢¢    │
│   Retrieval         │                │             │           │       │
│   ─────────────────────────────────────────────────────────────────────│
│   Glacier Flexible  │ 1-5 min to     │ 90 days     │ Per GB +  │ ¢     │
│   Retrieval         │ 12 hours       │             │ per req   │       │
│   ─────────────────────────────────────────────────────────────────────│
│   Glacier Deep      │ 12-48 hours    │ 180 days    │ Per GB +  │ ¢     │
│   Archive           │                │             │ per req   │       │
│   ─────────────────────────────────────────────────────────────────────│
│                                                                         │
│   Cost Comparison (per GB/month, us-east-1):                           │
│   Standard:       $0.023                                               │
│   Standard-IA:    $0.0125 (+ $0.01/GB retrieval)                       │
│   One Zone-IA:    $0.01   (+ $0.01/GB retrieval)                       │
│   Glacier IR:     $0.004  (+ $0.03/GB retrieval)                       │
│   Glacier FR:     $0.0036 (+ $0.01-$0.03/GB retrieval)                 │
│   Deep Archive:   $0.00099 (+ $0.02/GB retrieval)                      │
│                                                                         │
│   USE INTELLIGENT TIERING when access patterns are unknown!            │
│   It automatically moves objects between tiers based on access.        │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
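The per-GB prices above only tell half the story, because retrieval fees can outweigh the storage savings. A simplified sketch that compares classes using the table's numbers; it deliberately ignores request fees, minimum storage durations, and minimum billable object sizes, and the function names are illustrative.

```python
# (storage $/GB-month, retrieval $/GB), copied from the table above (us-east-1)
PRICES = {
    "STANDARD":     (0.023,   0.0),
    "STANDARD_IA":  (0.0125,  0.01),
    "ONEZONE_IA":   (0.01,    0.01),
    "GLACIER_IR":   (0.004,   0.03),
    "DEEP_ARCHIVE": (0.00099, 0.02),
}


def monthly_cost(storage_class: str, stored_gb: float, retrieved_gb: float = 0.0) -> float:
    """Storage plus retrieval cost for one month, in dollars."""
    storage_price, retrieval_price = PRICES[storage_class]
    return stored_gb * storage_price + retrieved_gb * retrieval_price


def cheapest_class(stored_gb: float, retrieved_gb: float) -> str:
    """Class with the lowest combined monthly cost for this usage."""
    return min(PRICES, key=lambda name: monthly_cost(name, stored_gb, retrieved_gb))


# 1 TB stored, 100 GB read back per month
for name in PRICES:
    print(f"{name:<13} ${monthly_cost(name, 1000, 100):8.2f}")
print("cheapest:", cheapest_class(1000, 100))
```

With these prices, Standard-IA only costs more than Standard once monthly retrievals exceed about 105% of the stored volume, since (0.023 - 0.0125) / 0.01 = 1.05.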

S3 Lifecycle Policies

# Example lifecycle configuration: tier logs down over time, abort stale
# multipart uploads, and expire old object versions
import boto3

s3 = boto3.client('s3')

lifecycle_policy = {
    "Rules": [
        {
            "ID": "MoveToIA30Days",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {
                    "Days": 30,
                    "StorageClass": "STANDARD_IA"
                },
                {
                    "Days": 90,
                    "StorageClass": "GLACIER"
                },
                {
                    "Days": 365,
                    "StorageClass": "DEEP_ARCHIVE"
                }
            ],
            "Expiration": {
                "Days": 2555  # Delete after 7 years
            }
        },
        {
            "ID": "DeleteIncompleteMultipartUploads",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 7
            }
        },
        {
            # Noncurrent-version actions only take effect on versioned buckets
            "ID": "ExpireOldVersions",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "NoncurrentVersionTransitions": [
                {
                    "NoncurrentDays": 30,
                    "StorageClass": "GLACIER"
                }
            ],
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 365
            }
        }
    ]
}

s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',
    LifecycleConfiguration=lifecycle_policy
)
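To check what a rule like `MoveToIA30Days` does before applying it, the transitions can be replayed locally. A minimal sketch, assuming objects are uploaded as STANDARD; `storage_class_at` is a hypothetical helper, and S3 applies lifecycle actions asynchronously, so real transitions can lag the configured day.

```python
from typing import Optional


def storage_class_at(age_days: int, rule: dict) -> Optional[str]:
    """Return the storage class of a current object version at a given age.

    Returns None once the rule's Expiration has deleted the object.
    """
    expiration = rule.get("Expiration", {}).get("Days")
    if expiration is not None and age_days >= expiration:
        return None
    current = "STANDARD"  # assumption: objects start in S3 Standard
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = {
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
    ],
    "Expiration": {"Days": 2555},
}
for age in (10, 45, 100, 400, 3000):
    print(age, storage_class_at(age, rule))
# 10 STANDARD
# 45 STANDARD_IA
# 100 GLACIER
# 400 DEEP_ARCHIVE
# 3000 None
```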

S3 Security Best Practices

┌────────────────────────────────────────────────────────────────────────┐
│                     S3 Security Layers                                  │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Layer 1: BLOCK PUBLIC ACCESS (Account & Bucket Level)               │
│   ─────────────────────────────────────────────────────                │
│   Enable all four settings by default:                                 │
│   ✅ BlockPublicAcls                                                   │
│   ✅ IgnorePublicAcls                                                  │
│   ✅ BlockPublicPolicy                                                 │
│   ✅ RestrictPublicBuckets                                             │
│                                                                         │
│   Layer 2: BUCKET POLICY (Resource-Based)                              │
│   ────────────────────────────────────────                             │
│   {                                                                     │
│     "Version": "2012-10-17",                                           │
│     "Statement": [{                                                     │
│       "Sid": "EnforceTLS",                                             │
│       "Effect": "Deny",                                                │
│       "Principal": "*",                                                │
│       "Action": "s3:*",                                                │
│       "Resource": ["arn:aws:s3:::bucket",                              │
│                    "arn:aws:s3:::bucket/*"],                           │
│       "Condition": {                                                    │
│         "Bool": {"aws:SecureTransport": "false"}                       │
│       }                                                                 │
│     }]                                                                  │
│   }                                                                     │
│                                                                         │
│   Layer 3: IAM POLICIES (Identity-Based)                               │
│   ──────────────────────────────────────                               │
│   Attached to users, groups, or roles                                  │
│                                                                         │
│   Layer 4: ENCRYPTION                                                   │
│   ───────────────────                                                   │
│   • SSE-S3: AWS managed keys (default, free)                           │
│   • SSE-KMS: Customer managed keys (audit trail)                       │
│   • SSE-C: Customer provided keys                                      │
│   • Client-side: Encrypt before upload                                 │
│                                                                         │
│   Layer 5: ACCESS POINTS                                               │
│   ──────────────────────                                               │
│   Simplified access management for multi-tenant buckets               │
│                                                                         │
│   Layer 6: OBJECT LOCK                                                 │
│   ─────────────────────                                                │
│   WORM (Write Once Read Many) for compliance                          │
│   • Governance mode: Can be overridden with special permissions       │
│   • Compliance mode: Cannot be overridden by anyone                   │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
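The Layer 2 policy can be generated per bucket instead of hand-edited. A small sketch with a hypothetical helper `enforce_tls_policy`; it lists both the bucket ARN and the object ARN pattern so that bucket-level calls such as listing are denied over plain HTTP as well.

```python
import json


def enforce_tls_policy(bucket_name: str) -> dict:
    """Build a bucket policy that denies every S3 action sent without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "EnforceTLS",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            # Bucket-level and object-level resources
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }],
    }


policy_json = json.dumps(enforce_tls_policy("my-bucket"), indent=2)
print(policy_json)
```

The resulting string is what boto3's `put_bucket_policy(Bucket=..., Policy=policy_json)` expects; the bucket name `my-bucket` is a placeholder.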

S3 Performance Optimization

import boto3
from boto3.s3.transfer import TransferConfig
from botocore.config import Config

s3 = boto3.client('s3')

class S3Uploader:
    """S3 uploader that applies multipart and Transfer Acceleration settings."""

    def __init__(self, bucket: str):
        self.bucket = bucket
        # Files above the threshold are split into parts and uploaded in parallel
        self.config = TransferConfig(
            multipart_threshold=8 * 1024 * 1024,  # 8 MB
            max_concurrency=10,
            multipart_chunksize=8 * 1024 * 1024,  # 8 MB
            use_threads=True
        )

    def upload_file(self, local_path: str, s3_key: str):
        """Upload with automatic multipart handling."""
        s3.upload_file(
            local_path,
            self.bucket,
            s3_key,
            Config=self.config
        )

    def upload_large_file_optimized(self, local_path: str, s3_key: str):
        """
        Upload a large file over a long distance:
        1. Multipart upload (automatic above the 8 MB threshold)
        2. S3 Transfer Acceleration for long-distance transfers
        """
        # Enable Transfer Acceleration for the bucket first:
        # aws s3api put-bucket-accelerate-configuration \
        #   --bucket my-bucket --accelerate-configuration Status=Enabled

        # Route requests through the accelerate endpoint
        s3_accelerate = boto3.client(
            's3',
            config=Config(s3={'use_accelerate_endpoint': True})
        )

        s3_accelerate.upload_file(
            local_path,
            self.bucket,
            s3_key,
            Config=self.config
        )

# S3 Performance Tips:
# 1. Spread high request rates across prefixes: limits apply per prefix
#    (3,500+ PUT/COPY/POST/DELETE per second each). Random hash prefixes
#    have not been required since 2018, so readable keys work:
#    logs/app-a/2024/01/15/file1.log
#    logs/app-b/2024/01/15/file1.log
#
# 2. Enable S3 Transfer Acceleration for long-distance uploads
#    (the gain depends on distance and object size, so measure it)
#
# 3. Use byte-range fetches for parallel downloads
#
# 4. S3 can handle 5,500+ GET/HEAD requests/second per prefix
#
# 5. Use S3 Select to retrieve only needed data (up to 400% faster);
#    note that AWS closed S3 Select to new customers in 2024
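Tip 3 (byte-range fetches) comes down to splitting an object into `Range` header values that workers can request in parallel. A minimal sketch of that split; `byte_ranges` is an illustrative helper, and each returned string is in the form accepted by the `Range` parameter of `get_object`.

```python
def byte_ranges(object_size: int, chunk_size: int) -> list[str]:
    """Split an object of object_size bytes into HTTP Range header values."""
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1  # Range end is inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(byte_ranges(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Each range can then be fetched on its own thread and written to the matching offset of the output file.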

Presigned URLs for Secure Sharing

import boto3

s3 = boto3.client('s3')

def generate_presigned_download_url(bucket: str, key: str, 
                                     expiry_seconds: int = 3600) -> str:
    """Generate a presigned URL for downloading."""
    url = s3.generate_presigned_url(
        'get_object',
        Params={
            'Bucket': bucket,
            'Key': key
        },
        ExpiresIn=expiry_seconds
    )
    return url

def generate_presigned_upload_url(bucket: str, key: str,
                                   content_type: str = 'application/octet-stream',
                                   expiry_seconds: int = 3600) -> dict:
    """Generate a presigned POST (URL plus form fields) for uploading."""
    response = s3.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        Fields={
            'Content-Type': content_type
        },
        Conditions=[
            {'Content-Type': content_type},
            ['content-length-range', 1, 104857600]  # 1 byte to 100 MB
        ],
        ExpiresIn=expiry_seconds
    )
    return response  # Returns {'url': ..., 'fields': {...}}

# Usage in API response
download_url = generate_presigned_download_url('my-bucket', 'reports/q1.pdf')
upload_info = generate_presigned_upload_url('my-bucket', 'uploads/user123/file.jpg')

EBS (Elastic Block Store)

Persistent block storage for EC2 instances. Like a high-performance SSD/HDD attached to your server.

EBS Volume Types Deep Dive

┌────────────────────────────────────────────────────────────────────────┐
│                      EBS Volume Types                                   │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   SSD-BASED (Random I/O):                                              │
│   ─────────────────────────                                            │
│                                                                         │
│   gp3 (General Purpose SSD) - RECOMMENDED DEFAULT                      │
│   ├── Baseline: 3,000 IOPS, 125 MB/s                                   │
│   ├── Max: 16,000 IOPS, 1,000 MB/s                                     │
│   ├── Size: 1 GB - 16 TB                                               │
│   ├── Cost: $0.08/GB + $0.005/IOPS (above 3K)                          │
│   └── Use: Boot volumes, most workloads                                │
│                                                                         │
│   gp2 (Previous Gen) - LEGACY                                          │
│   ├── IOPS: 3 IOPS/GB (burst to 3,000)                                 │
│   ├── Max: 16,000 IOPS                                                 │
│   └── Note: gp3 is usually more cost-effective                         │
│                                                                         │
│   io2 Block Express (Provisioned IOPS) - HIGHEST PERFORMANCE           │
│   ├── Max: 256,000 IOPS, 4,000 MB/s                                    │
│   ├── Latency: Sub-millisecond                                         │
│   ├── Durability: 99.999% (5 9s vs 99.8-99.9% for others)              │
│   ├── Size: 4 GB - 64 TB                                               │
│   ├── Cost: $0.125/GB + $0.065/IOPS                                    │
│   └── Use: Databases (Oracle, SAP), critical I/O workloads            │
│                                                                         │
│   io1 (Provisioned IOPS) - LEGACY                                      │
│   └── Max: 64,000 IOPS, use io2 instead                                │
│                                                                         │
│   HDD-BASED (Sequential I/O):                                          │
│   ──────────────────────────                                           │
│                                                                         │
│   st1 (Throughput Optimized HDD)                                       │
│   ├── Max: 500 IOPS, 500 MB/s                                          │
│   ├── Size: 125 GB - 16 TB                                             │
│   ├── Cost: $0.045/GB                                                  │
│   └── Use: Big data, log processing, data warehouses                  │
│                                                                         │
│   sc1 (Cold HDD) - LOWEST COST                                         │
│   ├── Max: 250 IOPS, 250 MB/s                                          │
│   ├── Size: 125 GB - 16 TB                                             │
│   ├── Cost: $0.015/GB                                                  │
│   └── Use: Infrequent access, archival                                 │
│                                                                         │
│   BOOT VOLUME ELIGIBILITY:                                             │
│   • gp2, gp3, io1, io2 ✅ Can be boot volumes                          │
│   • st1, sc1 ❌ Cannot be boot volumes                                 │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
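The gp3 and gp2 pricing models above differ in how IOPS are obtained: gp3 bills extra IOPS separately, while gp2 ties IOPS to volume size. A rough sketch using the figures from the diagram; it ignores gp3 throughput charges, and the 100 IOPS floor for gp2 is an assumption not shown in the diagram.

```python
GP3_PRICE_PER_GB = 0.08          # $/GB-month, from the diagram above
GP3_PRICE_PER_EXTRA_IOPS = 0.005 # $/IOPS-month above the baseline
GP3_BASELINE_IOPS = 3_000
GP3_MAX_IOPS = 16_000


def gp3_monthly_cost(size_gb: int, provisioned_iops: int = GP3_BASELINE_IOPS) -> float:
    """Monthly gp3 cost for storage plus IOPS above the free baseline."""
    if not GP3_BASELINE_IOPS <= provisioned_iops <= GP3_MAX_IOPS:
        raise ValueError("gp3 IOPS must be between 3,000 and 16,000")
    extra_iops = provisioned_iops - GP3_BASELINE_IOPS
    return size_gb * GP3_PRICE_PER_GB + extra_iops * GP3_PRICE_PER_EXTRA_IOPS


def gp2_baseline_iops(size_gb: int) -> int:
    """gp2 baseline: 3 IOPS per GB, floored at 100 (assumed) and capped at 16,000."""
    return min(max(3 * size_gb, 100), 16_000)


print(gp3_monthly_cost(500, 6_000))  # 500 GB with 6,000 IOPS
print(gp2_baseline_iops(1_000))      # 3000
```

A gp2 volume needs 1,000 GB to reach a 3,000 IOPS baseline, whereas gp3 includes that baseline at any size, which is the main reason gp3 is usually the cheaper choice.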

EBS Snapshots and Backup

import boto3
from datetime import datetime
from typing import Optional

ec2 = boto3.client('ec2')

def create_ebs_snapshot(volume_id: str, description: Optional[str] = None):
    """Create EBS snapshot (incremental backup to S3)."""
    response = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=description or f"Backup-{datetime.now().isoformat()}",
        TagSpecifications=[
            {
                'ResourceType': 'snapshot',
                'Tags': [
                    {'Key': 'Name', 'Value': 'AutomatedBackup'},
                    {'Key': 'CreatedBy', 'Value': 'Python Script'},
                ]
            }
        ]
    )
    return response['SnapshotId']

def copy_snapshot_cross_region(snapshot_id: str, 
                                source_region: str,
                                destination_region: str):
    """Copy snapshot to another region for DR."""
    dest_ec2 = boto3.client('ec2', region_name=destination_region)
    
    response = dest_ec2.copy_snapshot(
        SourceSnapshotId=snapshot_id,
        SourceRegion=source_region,
        Description=f"DR copy from {source_region}",
        Encrypted=True  # Always encrypt DR copies
    )
    return response['SnapshotId']

def create_volume_from_snapshot(snapshot_id: str, 
                                 az: str,
                                 volume_type: str = 'gp3'):
    """Create new volume from snapshot (for restore or DR)."""
    response = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=az,
        VolumeType=volume_type,
        Encrypted=True,
        TagSpecifications=[
            {
                'ResourceType': 'volume',
                'Tags': [
                    {'Key': 'Name', 'Value': 'RestoredVolume'},
                ]
            }
        ]
    )
    return response['VolumeId']

# Key Points:
# 1. Snapshots are incremental (only changed blocks stored)
# 2. Snapshots stored in S3 (managed by AWS, not visible)
# 3. Can create volumes in any AZ of the snapshot's Region from snapshot
# 4. First snapshot takes longest, subsequent are fast
# 5. Deleting snapshots only removes data not needed by others
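Key point 1 (incremental snapshots) is easier to see with a toy model. The sketch below represents a volume as a dict of block index to contents; it is only an illustration of the idea, not how EBS stores data internally, and it does not model freed blocks.

```python
def take_snapshot(volume: dict[int, bytes], previous_state: dict[int, bytes]) -> dict[int, bytes]:
    """Return only the blocks that changed since the previous snapshot."""
    return {idx: data for idx, data in volume.items() if previous_state.get(idx) != data}


volume = {0: b"boot", 1: b"data-v1", 2: b"logs-v1"}
snap1 = take_snapshot(volume, previous_state={})      # first snapshot: every block
state_at_snap1 = dict(volume)

volume[1] = b"data-v2"                                # one block changes
snap2 = take_snapshot(volume, previous_state=state_at_snap1)

print(len(snap1), len(snap2))  # 3 1

# Restoring the latest state layers the snapshots in order
restored = {**snap1, **snap2}
print(restored == volume)      # True
```

The restore step also hints at key point 5: blocks from `snap1` that `snap2` still relies on must survive even if the first snapshot is deleted.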

EBS Encryption

┌────────────────────────────────────────────────────────────────────────┐
│                      EBS Encryption                                     │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   What's encrypted:                                                     │
│   ✅ Data at rest on volume                                            │
│   ✅ Data in transit between EC2 and EBS                               │
│   ✅ Snapshots of encrypted volumes                                     │
│   ✅ All volumes created from encrypted snapshots                       │
│                                                                         │
│   Encryption Keys:                                                      │
│   • Default: AWS managed key (aws/ebs)                                 │
│   • Custom: Your KMS CMK (better audit, control)                       │
│                                                                         │
│   Converting Unencrypted to Encrypted:                                 │
│   ┌──────────────────────────────────────────────────────────────┐    │
│   │  1. Create snapshot of unencrypted volume                     │    │
│   │  2. Copy snapshot with encryption enabled                     │    │
│   │  3. Create encrypted volume from encrypted snapshot           │    │
│   │  4. Attach new volume, migrate data, detach old               │    │
│   └──────────────────────────────────────────────────────────────┘    │
│                                                                         │
│   Best Practice: Enable "Encrypt new EBS volumes by default"          │
│   in EC2 → EBS → Settings                                              │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
EBS is AZ-specific! Volumes only exist in one AZ. To move across AZs or regions:
  1. Create snapshot
  2. Copy to target region (if cross-region)
  3. Create volume in target AZ from snapshot
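
The three steps above can be sketched as one function. It takes boto3 EC2 clients as arguments (pass the same client twice for a same-Region move) and uses the `create_snapshot`, `copy_snapshot`, `create_volume`, and `snapshot_completed` waiter calls; the function name, the `gp3` default, and the parameter names are assumptions for illustration, and error handling is omitted.

```python
def move_volume(src_ec2, dst_ec2, volume_id: str, source_region: str,
                target_az: str, cross_region: bool = False) -> str:
    """Recreate an EBS volume in another AZ (or Region) via a snapshot.

    src_ec2 / dst_ec2 are boto3 EC2 clients for the source and target Regions.
    Returns the ID of the new volume.
    """
    # 1. Create snapshot and wait until it is complete
    snapshot_id = src_ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Move {volume_id} to {target_az}",
    )["SnapshotId"]
    src_ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # 2. Copy to target region (only if cross-region)
    if cross_region:
        snapshot_id = dst_ec2.copy_snapshot(
            SourceSnapshotId=snapshot_id,
            SourceRegion=source_region,
        )["SnapshotId"]
        dst_ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # 3. Create volume in target AZ from the snapshot
    return dst_ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=target_az,
        VolumeType="gp3",  # assumed default for this sketch
    )["VolumeId"]
```

The new volume still has to be attached to an instance in the target AZ, and the old volume and intermediate snapshots cleaned up once the move is verified.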

EFS (Elastic File System)

Managed NFS file system that scales automatically. Shared storage for Linux workloads.
┌────────────────────────────────────────────────────────────────────────┐
│                         EFS Architecture                                │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ┌───────────────────────────────────────────────────────────────┐    │
│   │                    EFS File System                             │    │
│   │                   /efs-shared-data                             │    │
│   │                                                                │    │
│   │   Storage Classes:                                             │    │
│   │   ┌────────────────┐  ┌────────────────┐                      │    │
│   │   │ Standard       │  │ Standard-IA    │                      │    │
│   │   │ (Frequent)     │  │ (Infrequent)   │                      │    │
│   │   │ $0.30/GB       │  │ $0.016/GB +    │                      │    │
│   │   │                │  │ $0.01/GB read  │                      │    │
│   │   └────────────────┘  └────────────────┘                      │    │
│   │                                                                │    │
│   │   Lifecycle: Auto-move to IA after 7/14/30/60/90 days         │    │
│   │                                                                │    │
│   └───────────────────────────────────────────────────────────────┘    │
│                         │                                               │
│            ┌────────────┼────────────┬────────────┐                    │
│            │            │            │            │                    │
│            ▼            ▼            ▼            ▼                    │
│   ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐     │
│   │Mount Target │ │Mount Target │ │Mount Target │ │ Lambda      │     │
│   │  (AZ-1a)    │ │  (AZ-1b)    │ │  (AZ-1c)    │ │ (via VPC)   │     │
│   └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘     │
│         │               │               │               │              │
│      EC2-1           EC2-2           EC2-3          Lambda            │
│     (/mnt/efs)     (/mnt/efs)      (/mnt/efs)     Functions           │
│                                                                        │
│   ┌─────────────────────────────────────────────────────────────┐     │
│   │  Performance Modes:                                          │     │
│   │  • General Purpose: Low latency (default)                    │     │
│   │  • Max I/O: Higher latency, higher throughput (big data)    │     │
│   │                                                              │     │
│   │  Throughput Modes:                                           │     │
│   │  • Bursting: Scales with size (50 MB/s per TB)              │     │
│   │  • Provisioned: Set fixed throughput (1-3000+ MB/s)         │     │
│   │  • Elastic: Auto-scales (up to 10+ GB/s reads)              │     │
│   └─────────────────────────────────────────────────────────────┘     │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
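
The storage-class prices and the bursting rule in the diagram lend themselves to a quick estimate. A rough sketch using those figures; it assumes the IA access charge is billed per GB read, ignores throughput and request charges, and the function names are illustrative.

```python
EFS_STANDARD_PRICE = 0.30    # $/GB-month, from the diagram above
EFS_IA_PRICE = 0.016         # $/GB-month, from the diagram above
EFS_IA_ACCESS_PRICE = 0.01   # $/GB read from IA (assumption: billed per GB)
BURSTING_MB_S_PER_TB = 50.0  # baseline throughput in Bursting mode


def efs_monthly_cost(total_gb: float, ia_fraction: float, ia_read_gb: float = 0.0) -> float:
    """Monthly cost when a fraction of the data has been moved to Standard-IA."""
    if not 0.0 <= ia_fraction <= 1.0:
        raise ValueError("ia_fraction must be between 0 and 1")
    ia_gb = total_gb * ia_fraction
    standard_gb = total_gb - ia_gb
    return (standard_gb * EFS_STANDARD_PRICE
            + ia_gb * EFS_IA_PRICE
            + ia_read_gb * EFS_IA_ACCESS_PRICE)


def bursting_baseline_mb_s(size_tb: float) -> float:
    """Baseline throughput in Bursting mode for a file system of size_tb."""
    return BURSTING_MB_S_PER_TB * size_tb


print(f"${efs_monthly_cost(1000, 0.0):.2f}")  # all data in Standard
print(f"${efs_monthly_cost(1000, 0.8):.2f}")  # 80% moved to IA by lifecycle
print(bursting_baseline_mb_s(2))              # 100.0
```

With 80% of a 1 TB file system in IA the estimate drops from about $300 to about $73 per month, which is why a lifecycle policy is usually worth enabling.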

EFS vs EBS vs S3

| Feature | EFS | EBS | S3 |
| --- | --- | --- | --- |
| Type | File (NFS) | Block | Object |
| Access | Multi-AZ, Multi-EC2 | Single AZ, Single EC2 | Global, HTTP |
| Scaling | Automatic (petabytes) | Manual (16 TB max) | Unlimited |
| Cost (per GB-month) | $0.30 (Standard) | $0.08-0.125 | $0.023 |
| Latency | ~ms | sub-ms | ~100 ms |
| Use Case | Shared files | Databases | Static content |
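To make the cost row concrete, here is a small sketch that prices the same amount of data on each service. It uses only the per-GB-month figures from the comparison above (the low end of the EBS range) and ignores request, throughput, and transfer charges, so treat the output as a rough illustration rather than a quote.

```python
# Per-GB-month storage prices taken from the comparison above (illustrative only)
PRICE_PER_GB_MONTH = {
    "EFS Standard": 0.30,
    "EBS (low end)": 0.08,
    "S3 Standard": 0.023,
}


def monthly_storage_cost(size_gb):
    """Return the storage-only monthly cost in USD for each service."""
    return {
        name: round(size_gb * price, 2)
        for name, price in PRICE_PER_GB_MONTH.items()
    }


if __name__ == "__main__":
    for service, cost in monthly_storage_cost(500).items():
        print(f"{service}: ${cost:.2f}/month")
```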

RDS (Relational Database Service)

Managed relational databases: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, Aurora.

RDS Architecture and Features

┌────────────────────────────────────────────────────────────────────────┐
│                      RDS Architecture                                   │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ┌───────────────────────────────────────────────────────────────┐    │
│   │                    RDS Instance                                │    │
│   │                                                                │    │
│   │   What AWS Manages:             What You Manage:               │    │
│   │   ✅ Hardware provisioning      ✅ Schema design               │    │
│   │   ✅ OS patching                ✅ Query optimization          │    │
│   │   ✅ Database patching          ✅ Index creation              │    │
│   │   ✅ Automated backups          ✅ Application tuning          │    │
│   │   ✅ Multi-AZ failover          ✅ Security groups             │    │
│   │   ✅ Scaling                    ✅ Parameter groups            │    │
│   │                                                                │    │
│   └───────────────────────────────────────────────────────────────┘    │
│                                                                         │
│   Supported Engines:                                                   │
│   ┌─────────────┬────────────────────────────────────────────────┐    │
│   │ MySQL       │ 5.7, 8.0 - Most popular open source           │    │
│   │ PostgreSQL  │ 11-16 - Advanced features, extensions          │    │
│   │ MariaDB     │ 10.x - MySQL fork, community driven            │    │
│   │ Oracle      │ Enterprise, Standard - BYOL or License Included│    │
│   │ SQL Server  │ Express, Web, Standard, Enterprise             │    │
│   │ Aurora      │ MySQL/PostgreSQL compatible, up to 5x faster   │    │
│   └─────────────┴────────────────────────────────────────────────┘    │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘

Multi-AZ vs Read Replicas

┌────────────────────────────────────────────────────────────────────────┐
│                    Multi-AZ Deployment                                  │
│                   (High Availability)                                   │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Purpose: AUTOMATIC FAILOVER for disaster recovery                    │
│                                                                         │
│   ┌─────────────────────┐  Synchronous  ┌─────────────────────┐       │
│   │    Primary DB       │◄─────────────►│    Standby DB       │       │
│   │    (AZ-1a)          │  Replication  │    (AZ-1b)          │       │
│   │                     │               │                     │       │
│   │  ✅ All reads       │               │  ❌ No reads        │       │
│   │  ✅ All writes      │               │  ❌ No writes       │       │
│   └─────────────────────┘               └─────────────────────┘       │
│            │                                    │                      │
│            │    Automatic DNS failover          │                      │
│            │    (60-120 seconds)                │                      │
│            │                                    │                      │
│            ▼                                    ▼                      │
│   ┌─────────────────────────────────────────────────────────────┐     │
│   │  Application connects to: mydb.abc123.us-east-1.rds.aws    │     │
│   │  (Single endpoint, AWS handles failover automatically)      │     │
│   └─────────────────────────────────────────────────────────────┘     │
│                                                                         │
│   When failover happens:                                               │
│   • AZ outage or instance failure                                      │
│   • Instance type change                                               │
│   • Manual failover (for testing)                                      │
│   • OS patching                                                        │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
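Because failover takes on the order of a minute or two, applications usually need to reconnect with backoff rather than fail on the first dropped connection. The sketch below assumes a hypothetical `connect` callable that raises `ConnectionError` while the endpoint is unavailable; real database drivers raise their own exception types, so adapt the `except` clause accordingly.

```python
import time


def connect_with_retry(connect, max_wait_s=180.0, base_delay_s=1.0,
                       max_delay_s=15.0, sleep=time.sleep):
    """Call connect() until it succeeds or max_wait_s of backoff has been spent."""
    waited = 0.0
    delay = base_delay_s
    while True:
        try:
            return connect()
        except ConnectionError:
            if waited >= max_wait_s:
                raise  # give up: the outage outlasted the failover budget
            sleep(delay)
            waited += delay
            delay = min(delay * 2, max_delay_s)  # exponential backoff with a cap
```

The default budget of 180 seconds is an assumption chosen to cover the 60-120 second window in the diagram with some margin. Clients that cache DNS results for a long time may also need a shorter cache lifetime so that they pick up the new primary.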

┌────────────────────────────────────────────────────────────────────────┐
│                    Read Replicas                                        │
│                   (Read Scaling)                                        │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Purpose: SCALE READS by offloading queries                           │
│                                                                         │
│   ┌─────────────────────┐ Asynchronous ┌─────────────────────┐        │
│   │    Primary DB       │─────────────►│   Read Replica 1    │        │
│   │  (writes + reads)   │              │   (reads only)      │        │
│   └─────────────────────┘              └─────────────────────┘        │
│            │                                                           │
│            │              Async        ┌─────────────────────┐        │
│            └──────────────────────────►│   Read Replica 2    │        │
│                                        │   (reads only)      │        │
│                                        └─────────────────────┘        │
│                                                                         │
│   Key Points:                                                          │
│   • Up to 15 replicas (5 for Oracle and SQL Server)                    │
│   • Can be cross-region (for DR or local reads)                       │
│   • Each replica has own endpoint                                      │
│   • Replication is ASYNC (eventual consistency)                        │
│   • Can be promoted to standalone DB                                   │
│                                                                         │
│   Application Pattern:                                                 │
│   ┌─────────────────────────────────────────────────────────────┐     │
│   │  def get_connection(is_read_only=False):                     │     │
│   │      if is_read_only:                                        │     │
│   │          return connect("replica.abc123.us-east-1.rds.aws") │     │
│   │      return connect("primary.abc123.us-east-1.rds.aws")     │     │
│   └─────────────────────────────────────────────────────────────┘     │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
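Asynchronous replication means a replica can briefly lag behind the primary, so a user who has just written data may not see it when reading from a replica. One common mitigation is to pin a session's reads to the primary for a short window after it writes. The sketch below is one way to do that; the class name, the 5-second window, and the endpoint strings are all assumptions for illustration.

```python
import itertools
import time


class ReplicaAwareRouter:
    """Route reads to replicas, except shortly after a session has written."""

    def __init__(self, primary, replicas, pin_seconds=5.0, clock=time.monotonic):
        self.primary = primary
        self.pin_seconds = pin_seconds
        self._clock = clock
        # Round-robin over replicas; None means "no replicas configured"
        self._replicas = itertools.cycle(replicas) if replicas else None
        self._last_write = {}  # session id -> time of that session's last write

    def endpoint_for_write(self, session_id):
        """Writes always go to the primary; remember when this session wrote."""
        self._last_write[session_id] = self._clock()
        return self.primary

    def endpoint_for_read(self, session_id):
        """Use the primary right after a write, otherwise the next replica."""
        last = self._last_write.get(session_id)
        recently_wrote = last is not None and self._clock() - last < self.pin_seconds
        if recently_wrote or self._replicas is None:
            return self.primary
        return next(self._replicas)
```

The pin window should be longer than the replica lag you typically observe; a fixed window is a heuristic, not a consistency guarantee.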

RDS Backup and Recovery

import boto3
from datetime import datetime

rds = boto3.client('rds')

# Automated Backups (AWS Managed)
# - Daily during backup window
# - Transaction logs every 5 minutes
# - Retention: 0-35 days (0 disables)
# - Point-in-time recovery to any second

def restore_to_point_in_time(source_db: str, 
                              target_db: str,
                              restore_time: datetime):
    """Restore RDS to a specific point in time."""
    response = rds.restore_db_instance_to_point_in_time(
        SourceDBInstanceIdentifier=source_db,
        TargetDBInstanceIdentifier=target_db,
        RestoreTime=restore_time,
        UseLatestRestorableTime=False,
        DBInstanceClass='db.t3.medium',
        PubliclyAccessible=False,
        MultiAZ=True,
    )
    return response

# Manual Snapshots
# - User-initiated
# - Kept until you delete them
# - Can share across accounts/regions

def create_manual_snapshot(db_identifier: str):
    """Create manual DB snapshot."""
    snapshot_id = f"{db_identifier}-{datetime.now().strftime('%Y%m%d-%H%M%S')}"
    response = rds.create_db_snapshot(
        DBInstanceIdentifier=db_identifier,
        DBSnapshotIdentifier=snapshot_id,
        Tags=[
            {'Key': 'Purpose', 'Value': 'ManualBackup'},
        ]
    )
    return snapshot_id

Aurora

AWS’s cloud-native relational database, compatible with MySQL and PostgreSQL. AWS cites up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL.

Aurora Architecture Deep Dive

┌────────────────────────────────────────────────────────────────────────┐
│                      Aurora Architecture                                │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐  │
│   │                      Aurora Cluster                              │  │
│   │                                                                  │  │
│   │   Writer Endpoint: mydb.cluster-abc123.us-east-1.rds.aws        │  │
│   │   Reader Endpoint: mydb.cluster-ro-abc123.us-east-1.rds.aws     │  │
│   │                                                                  │  │
│   │   ┌───────────────────┐   ┌───────────────────┐                 │  │
│   │   │  Writer Instance  │   │ Reader Instance 1 │                 │  │
│   │   │    (Primary)      │   │    (Replica)      │                 │  │
│   │   │     (AZ-1a)       │   │     (AZ-1b)       │                 │  │
│   │   └─────────┬─────────┘   └─────────┬─────────┘                 │  │
│   │             │                       │                            │  │
│   │             └───────────┬───────────┘                            │  │
│   │                         │                                        │  │
│   │                         ▼                                        │  │
│   │   ┌─────────────────────────────────────────────────────────┐   │  │
│   │   │              Shared Cluster Storage                      │   │  │
│   │   │           (Aurora Storage Engine)                        │   │  │
│   │   │                                                          │   │  │
│   │   │   ┌─────┐  ┌─────┐  ┌─────┐  ┌─────┐  ┌─────┐  ┌─────┐  │   │  │
│   │   │   │10GB │  │10GB │  │10GB │  │10GB │  │10GB │  │10GB │  │   │  │
│   │   │   │AZ-1a│  │AZ-1b│  │AZ-1c│  │AZ-1a│  │AZ-1b│  │AZ-1c│  │   │  │
│   │   │   └─────┘  └─────┘  └─────┘  └─────┘  └─────┘  └─────┘  │   │  │
│   │   │                                                          │   │  │
│   │   │   • 6 copies of data across 3 AZs                       │   │  │
│   │   │   • Can lose 2 copies and still write                   │   │  │
│   │   │   • Can lose 3 copies and still read                    │   │  │
│   │   │   • Auto-heals damaged segments                         │   │  │
│   │   │   • Storage auto-scales 10 GB → 128 TB                  │   │  │
│   │   └─────────────────────────────────────────────────────────┘   │  │
│   │                                                                  │  │
│   └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│   AURORA ADVANTAGES:                                                   │
│   • Up to 5x throughput of MySQL, 3x of PostgreSQL                    │
│   • Up to 15 read replicas on shared storage (low replica lag)        │
│   • Failover in < 30 seconds                                          │
│   • Continuous backup to S3 (no performance impact)                   │
│   • Point-in-time recovery to any second                              │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
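The "lose 2 and still write, lose 3 and still read" rule follows from quorum arithmetic: with 6 copies, Aurora storage needs 4 acknowledgements to write and 3 to read. A tiny sketch of that arithmetic (the function names are illustrative):

```python
COPIES = 6        # two copies in each of three AZs
WRITE_QUORUM = 4  # copies that must acknowledge a write
READ_QUORUM = 3   # copies needed to serve a read


def can_write(lost_copies):
    """True if enough copies survive to reach the write quorum."""
    return COPIES - lost_copies >= WRITE_QUORUM


def can_read(lost_copies):
    """True if enough copies survive to reach the read quorum."""
    return COPIES - lost_copies >= READ_QUORUM


if __name__ == "__main__":
    for lost in range(5):
        print(f"lost={lost} write={can_write(lost)} read={can_read(lost)}")
```

Losing a whole AZ removes 2 copies, so the cluster stays writable; losing an AZ plus one more copy leaves it readable but not writable until the damaged segments are repaired.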

Aurora Serverless v2

┌────────────────────────────────────────────────────────────────────────┐
│                    Aurora Serverless v2                                 │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Auto-scaling capacity based on demand                                │
│                                                                         │
│   Capacity (ACUs)                                                       │
│   128 ┤                                                                │
│   100 ┤              ████████                                          │
│    80 ┤            ██        ██                                        │
│    60 ┤          ██            ██                                      │
│    40 ┤        ██                ██                                    │
│    20 ┤──────██────────────────────██────────────                      │
│       └─────────────────────────────────────────► Time                 │
│                Peak usage period                                        │
│                                                                         │
│   Configuration:                                                        │
│   • Min ACU: 0.5 (0 with auto-pause on supported versions)             │
│   • Max ACU: 128 (choose based on peak needs)                          │
│   • Scales in seconds (not minutes like v1)                            │
│                                                                         │
│   Pricing (us-east-1):                                                 │
│   • $0.12/ACU-hour (Aurora MySQL)                                      │
│   • Storage: $0.10/GB-month                                            │
│   • I/O: $0.20 per million requests                                    │
│                                                                         │
│   Best For:                                                             │
│   ✅ Variable/unpredictable workloads                                  │
│   ✅ Development/test databases                                        │
│   ✅ Multi-tenant SaaS applications                                    │
│   ✅ Infrequently used applications                                    │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
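The pricing above can be turned into a rough daily estimate by summing ACU-hours over a usage profile. This sketch uses the $0.12/ACU-hour figure from the diagram and a hypothetical profile shaped like the chart (20 ACUs most of the day, 100 ACUs for a 4-hour peak); storage and I/O charges are left out.

```python
ACU_HOUR_USD = 0.12  # Aurora MySQL, us-east-1, as quoted in the diagram above


def daily_compute_cost(acus_by_hour, price=ACU_HOUR_USD):
    """Compute cost for one day given the average ACUs used in each hour."""
    if len(acus_by_hour) != 24:
        raise ValueError("expected one value per hour of the day")
    return sum(acus_by_hour) * price


if __name__ == "__main__":
    profile = [20] * 20 + [100] * 4  # hypothetical: baseline plus a 4-hour peak
    flat_peak = [100] * 24           # sizing for the peak all day, for comparison
    print(f"scaled: ${daily_compute_cost(profile):.2f}/day")
    print(f"fixed at peak: ${daily_compute_cost(flat_peak):.2f}/day")
```

Under these assumptions the scaled profile uses 800 ACU-hours per day against 2,400 for capacity fixed at the peak, which is where the savings for variable workloads come from.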

DynamoDB

Fully managed NoSQL database with single-digit millisecond latency at any scale.

DynamoDB Data Model

┌────────────────────────────────────────────────────────────────────────┐
│                    DynamoDB Data Model                                  │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   TABLE: Orders                                                         │
│   ──────────────                                                        │
│                                                                         │
│   Primary Key:                                                          │
│   ┌─────────────────────────────────────────────────────────────────┐  │
│   │  PARTITION KEY (HASH)    │  SORT KEY (RANGE)                    │  │
│   │  Required, determines    │  Optional, enables                   │  │
│   │  data distribution       │  range queries                       │  │
│   │                          │                                       │  │
│   │  customer_id             │  order_date                          │  │
│   │  "CUST#12345"           │  "2024-01-15T10:30:00Z"               │  │
│   └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│   Item (Document):                                                      │
│   ┌─────────────────────────────────────────────────────────────────┐  │
│   │ {                                                                │  │
│   │   "customer_id": "CUST#12345",      // Partition key            │  │
│   │   "order_date": "2024-01-15T10:30:00Z", // Sort key             │  │
│   │   "order_id": "ORD#98765",          // Attribute                │  │
│   │   "total": 299.99,                  // Attribute                │  │
│   │   "items": [                        // Nested list              │  │
│   │     {"sku": "ABC123", "qty": 2},                                │  │
│   │     {"sku": "XYZ789", "qty": 1}                                 │  │
│   │   ],                                                             │  │
│   │   "status": "SHIPPED"                                           │  │
│   │ }                                                                │  │
│   └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│   KEY DESIGN PATTERNS:                                                 │
│   ────────────────────                                                 │
│   1. One-to-Many: Use composite sort key                              │
│      PK: USER#123, SK: ORDER#001, ORDER#002, PROFILE                  │
│                                                                         │
│   2. Many-to-Many: Use GSI with inverted index                        │
│      PK: EMPLOYEE#1, SK: PROJECT#A                                    │
│      GSI: PK: PROJECT#A, SK: EMPLOYEE#1                               │
│                                                                         │
│   3. Hierarchical: Use sort key prefixes                              │
│      SK: COUNTRY#USA#STATE#CA#CITY#LA                                 │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
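The three key patterns can be expressed as small key-building helpers. This is a sketch only: the attribute names `PK`, `SK`, `GSI_PK`, and `GSI_SK` and all function names are assumptions, not part of any DynamoDB API.

```python
def make_key(*parts):
    """Join key parts with '#', e.g. make_key('USER', 123) gives 'USER#123'."""
    return "#".join(str(p) for p in parts)


def order_keys(user_id, order_id):
    """One-to-many: all of a user's orders share one partition key."""
    return {"PK": make_key("USER", user_id), "SK": make_key("ORDER", order_id)}


def inverted(keys):
    """Many-to-many: the GSI swaps partition and sort key."""
    return {"GSI_PK": keys["SK"], "GSI_SK": keys["PK"]}


def location_sk(country, state=None, city=None):
    """Hierarchical: build a sort key from most general to most specific."""
    parts = ["COUNTRY", country]
    if state is not None:
        parts += ["STATE", state]
        if city is not None:
            parts += ["CITY", city]
    return make_key(*parts)


def location_prefix(country, state=None):
    """Prefix for a begins_with query; the trailing '#' stops 'CA' matching 'CAL'."""
    return location_sk(country, state) + "#"
```

With boto3, such a prefix would typically be used in a key condition such as `Key('SK').begins_with(location_prefix('USA', 'CA'))` to fetch every city in one state.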

DynamoDB Operations (Best Practices)

import boto3
from boto3.dynamodb.conditions import Key, Attr
from decimal import Decimal
import json

# Initialize
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Orders')

# =====================================
# SINGLE ITEM OPERATIONS
# =====================================

def put_item_example():
    """Write an item (creates or replaces)."""
    table.put_item(
        Item={
            'customer_id': 'CUST#12345',
            'order_date': '2024-01-15T10:30:00Z',
            'order_id': 'ORD#98765',
            'total': Decimal('299.99'),  # Use Decimal for numbers
            'status': 'PENDING',
            'items': [
                {'sku': 'ABC123', 'qty': 2, 'price': Decimal('99.99')},
            ]
        },
        # Conditional write - only if doesn't exist
        ConditionExpression='attribute_not_exists(order_id)'
    )

def get_item_example():
    """Read a single item by primary key."""
    response = table.get_item(
        Key={
            'customer_id': 'CUST#12345',
            'order_date': '2024-01-15T10:30:00Z'
        },
        # Strongly consistent read (vs eventually consistent)
        ConsistentRead=True,
        # Only return specific attributes
        ProjectionExpression='order_id, #s, #t',
        # 'status' and 'total' are DynamoDB reserved words, so alias both
        ExpressionAttributeNames={'#s': 'status', '#t': 'total'}
    )
    return response.get('Item')

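def update_item_example(expected_version: int):
    """Update specific attributes atomically, guarded by optimistic locking."""
    response = table.update_item(
        Key={
            'customer_id': 'CUST#12345',
            'order_date': '2024-01-15T10:30:00Z'
        },
        UpdateExpression='SET #s = :status, updated_at = :now ADD version :inc',
        ExpressionAttributeNames={'#s': 'status'},
        ExpressionAttributeValues={
            ':status': 'SHIPPED',
            ':now': '2024-01-16T14:00:00Z',
            ':inc': 1,
            # Version the caller read earlier; must be bound for the condition below
            ':expected_version': expected_version
        },
        # Optimistic locking: fails with ConditionalCheckFailedException
        # if another writer has already changed the version
        ConditionExpression='version = :expected_version',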
        ReturnValues='ALL_NEW'
    )
    return response['Attributes']

# =====================================
# QUERY (Efficient - uses partition key)
# =====================================

def query_customer_orders(customer_id: str, start_date: str = None):
    """Get all orders for a customer (uses partition key)."""
    key_condition = Key('customer_id').eq(customer_id)
    
    if start_date:
        key_condition = key_condition & Key('order_date').gte(start_date)
    
    response = table.query(
        KeyConditionExpression=key_condition,
        ScanIndexForward=False,  # Descending order (newest first)
        Limit=20,  # Applied before the filter, so fewer than 20 items may come back
        # Filter runs after the read: filtered-out items still consume RCUs
        FilterExpression=Attr('status').ne('CANCELLED')
    )
    return response['Items']

# =====================================
# BATCH OPERATIONS
# =====================================

def batch_write_items(items: list):
    """Write any number of items; batch_writer sends them in batches of up to 25."""
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
    # boto3 handles batching, retries, and unprocessed items

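def batch_get_items(keys: list):
    """Read up to 100 items in one request."""
    response = dynamodb.batch_get_item(
        RequestItems={
            'Orders': {
                'Keys': keys,
                # 'total' and 'status' are reserved words, so alias both
                'ProjectionExpression': 'order_id, #t, #s',
                'ExpressionAttributeNames': {'#t': 'total', '#s': 'status'}
            }
        }
    )
    # Throttled keys come back in response['UnprocessedKeys']; retry them if present
    return response['Responses']['Orders']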

# =====================================
# TRANSACTIONS (ACID)
# =====================================

def transfer_with_transaction():
    """Atomic multi-item transaction."""
    dynamodb_client = boto3.client('dynamodb')
    
    dynamodb_client.transact_write_items(
        TransactItems=[
            {
                'Update': {
                    'TableName': 'Accounts',
                    'Key': {'account_id': {'S': 'ACC#001'}},
                    'UpdateExpression': 'SET balance = balance - :amount',
                    'ConditionExpression': 'balance >= :amount',
                    'ExpressionAttributeValues': {':amount': {'N': '100'}}
                }
            },
            {
                'Update': {
                    'TableName': 'Accounts',
                    'Key': {'account_id': {'S': 'ACC#002'}},
                    'UpdateExpression': 'SET balance = balance + :amount',
                    'ExpressionAttributeValues': {':amount': {'N': '100'}}
                }
            }
        ]
    )
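A single `query` call returns at most 1 MB of data (or `Limit` items) and signals that more remain through `LastEvaluatedKey`. The sketch below follows that key until the result set is exhausted; `query_all` is a hypothetical helper that accepts any callable with the same keyword interface as `table.query`.

```python
def query_all(query_fn, **kwargs):
    """Collect every page of a DynamoDB query by following LastEvaluatedKey."""
    items = []
    while True:
        page = query_fn(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items  # no more pages
        # Resume the next request where this page stopped
        kwargs["ExclusiveStartKey"] = last_key
```

With the table defined earlier, a call might look like `query_all(table.query, KeyConditionExpression=Key('customer_id').eq('CUST#12345'))`. For very large partitions, a generator that yields page by page would avoid holding everything in memory.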

Global Secondary Indexes (GSI)

┌────────────────────────────────────────────────────────────────────────┐
│                    DynamoDB Indexes                                     │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   BASE TABLE: Orders                                                    │
│   PK: customer_id, SK: order_date                                       │
│                                                                         │
│   Access Pattern: "Get orders by customer" ✅ Query on PK               │
│                                                                         │
│   NEW Access Pattern: "Get orders by status"                            │
│   ❌ Can't query - status is not a key!                                 │
│   ✅ Solution: Create GSI                                               │
│                                                                         │
│   GLOBAL SECONDARY INDEX (GSI):                                         │
│   ─────────────────────────────                                         │
│   ┌─────────────────────────────────────────────────────────────────┐  │
│   │  GSI: orders-by-status                                          │  │
│   │  PK: status      SK: order_date                                 │  │
│   │                                                                  │  │
│   │  Projected Attributes: order_id, customer_id, total             │  │
│   │  (Can project ALL, KEYS_ONLY, or specific attributes)           │  │
│   └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│   # Query GSI                                                           │
│   table.query(                                                          │
│       IndexName='orders-by-status',                                     │
│       KeyConditionExpression=Key('status').eq('PENDING')                │
│   )                                                                     │
│                                                                         │
│   GSI vs LSI:                                                           │
│   ┌──────────────────────────────────────────────────────────────────┐ │
│   │  GSI (Global Secondary)      │  LSI (Local Secondary)           │ │
│   │  ────────────────────────────┼───────────────────────────────── │ │
│   │  Different partition key     │  Same partition key              │ │
│   │  Create anytime              │  Create at table creation only   │ │
│   │  Own RCU/WCU                 │  Shares table RCU/WCU            │ │
│   │  Eventually consistent only  │  Strongly consistent available   │ │
│   │  Up to 20 per table          │  Up to 5 per table              │ │
│   └──────────────────────────────────────────────────────────────────┘ │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
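Since a GSI can be added to an existing table, the index in the diagram could be created with an `UpdateTable` request. The sketch below only builds the request parameters; `build_gsi_create_params` is a hypothetical helper, both key attributes are assumed to be strings, and the table is assumed to use on-demand capacity (a provisioned table would also need a `ProvisionedThroughput` entry inside `Create`).

```python
def build_gsi_create_params(table_name, index_name, partition_key, sort_key, projected):
    """Build update_table keyword arguments that add a GSI to an existing table."""
    return {
        "TableName": table_name,
        # Every attribute used in the new index's key schema must be declared
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
            {"AttributeName": sort_key, "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": partition_key, "KeyType": "HASH"},
                        {"AttributeName": sort_key, "KeyType": "RANGE"},
                    ],
                    # The base table's key attributes are always projected,
                    # so only extra non-key attributes are listed here
                    "Projection": {
                        "ProjectionType": "INCLUDE",
                        "NonKeyAttributes": list(projected),
                    },
                }
            }
        ],
    }


def create_gsi(params):
    """Send the request; boto3 is imported lazily so the builder works offline."""
    import boto3
    return boto3.client("dynamodb").update_table(**params)
```

A partition key with few distinct values, such as an order status, concentrates traffic on a handful of partitions, so it is worth checking the expected distribution before relying on an index like this at scale.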

DynamoDB Capacity and Pricing

┌────────────────────────────────────────────────────────────────────────┐
│                    DynamoDB Capacity Modes                              │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ON-DEMAND MODE:                                                       │
│   ───────────────                                                       │
│   • Pay per request ($1.25 per million writes, $0.25 per million reads)│
│   • No capacity planning                                               │
│   • Scales instantly up to 2x the previous traffic peak                │
│   • Best for: Unpredictable traffic, new applications                  │
│                                                                         │
│   PROVISIONED MODE:                                                     │
│   ─────────────────                                                     │
│   • Pre-allocate RCU (Read Capacity Units) and WCU (Write Capacity)    │
│   • Cheaper at scale (~5-7x cheaper at steady state)                   │
│   • Auto Scaling available                                              │
│   • Reserved Capacity for 1-3 years (up to 70% discount)               │
│                                                                         │
│   CAPACITY UNITS:                                                       │
│   ───────────────                                                       │
│   1 RCU = 1 strongly consistent read/sec (up to 4 KB)                  │
│         = 2 eventually consistent reads/sec (up to 4 KB)               │
│   1 WCU = 1 write/sec (up to 1 KB)                                     │
│                                                                         │
│   Example Calculation:                                                  │
│   ────────────────────                                                  │
│   100 reads/sec × 8 KB items × strongly consistent                     │
│   = 100 × (8 KB / 4 KB) × 1 = 200 RCU                                  │
│                                                                         │
│   50 writes/sec × 3 KB items                                           │
│   = 50 × ceil(3 KB / 1 KB) = 150 WCU                                   │
│                                                                         │
│   COST COMPARISON (us-east-1):                                         │
│   ────────────────────────────                                         │
│   On-Demand: $0.25/million reads, $1.25/million writes                 │
│   Provisioned: $0.00013/RCU-hr, $0.00065/WCU-hr                        │
│                                                                         │
│   Break-even: ~520 requests/hr per provisioned unit (~14% utilization) │
│   Below this → On-Demand cheaper                                       │
│   Above this → Provisioned cheaper                                     │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
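The capacity arithmetic can be checked with a few lines of code. This sketch reproduces the two worked examples from the diagram and derives a break-even point from the listed prices; the function names are illustrative and the prices are the us-east-1 figures quoted above, which may change.

```python
import math


def required_rcu(reads_per_sec, item_kb, strongly_consistent=True):
    """RCUs needed: each read costs one unit per 4 KB, halved if eventually consistent."""
    units = reads_per_sec * math.ceil(item_kb / 4)
    return units if strongly_consistent else math.ceil(units / 2)


def required_wcu(writes_per_sec, item_kb):
    """WCUs needed: each write costs one unit per 1 KB."""
    return writes_per_sec * math.ceil(item_kb)


def break_even_requests_per_hour(on_demand_per_million, provisioned_per_unit_hour):
    """Requests per hour at which one provisioned unit costs the same as on-demand."""
    return provisioned_per_unit_hour / (on_demand_per_million / 1_000_000)


if __name__ == "__main__":
    print(required_rcu(100, 8))  # the read example from the diagram
    print(required_wcu(50, 3))   # the write example from the diagram
    print(round(break_even_requests_per_hour(0.25, 0.00013)))
```

One provisioned unit can serve 3,600 requests per hour, so with these prices provisioned capacity becomes cheaper once a unit is busy for roughly 520 of them, or about 14% average utilization.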

ElastiCache

Managed in-memory caching for sub-millisecond response times: Redis or Memcached.

ElastiCache Architecture

┌────────────────────────────────────────────────────────────────────────┐
│                    ElastiCache Architecture                             │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   APPLICATION CACHING PATTERN:                                         │
│   ────────────────────────────                                         │
│                                                                         │
│   ┌────────────┐                                                       │
│   │   Client   │                                                       │
│   └──────┬─────┘                                                       │
│          │ 1. Request                                                   │
│          ▼                                                              │
│   ┌────────────┐      2. Check cache     ┌─────────────────────┐       │
│   │ Application│ ────────────────────► │   ElastiCache       │       │
│   │   Server   │                        │   (Redis/Memcached) │       │
│   └──────┬─────┘ ◄──────────────────── └─────────────────────┘       │
│          │        3a. Cache HIT                     │                  │
│          │           (return)            3b. Cache MISS                │
│          │                                          │                  │
│          │ 4. Query if miss              ┌──────────┘                  │
│          ▼                               ▼                              │
│   ┌─────────────────────────────────────────────────┐                  │
│   │              Database (RDS/DynamoDB)             │                  │
│   └─────────────────────────────────────────────────┘                  │
│                               │                                         │
│                               │ 5. Return data                          │
│                               ▼                                         │
│   ┌────────────────────────────────────────────────────────────────┐   │
│   │  Application: Store in cache with TTL, return to client        │   │
│   └────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘

Redis vs Memcached Comparison

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data Structures | Strings, Lists, Sets, Sorted Sets, Hashes, Streams | Simple key-value only |
| Persistence | Yes (RDB, AOF) | No |
| Replication | Yes (Multi-AZ) | No |
| Pub/Sub | Yes | No |
| Lua Scripting | Yes | No |
| Cluster Mode | Yes (sharding) | Yes (sharding) |
| Multi-threading | Single-threaded (per shard) | Multi-threaded |
| Use Case | Sessions, leaderboards, queues | Simple caching, high throughput |

Caching Strategies

import redis
import json
from datetime import timedelta

# Connect to ElastiCache Redis
redis_client = redis.Redis(
    host='my-cluster.abc123.cache.amazonaws.com',
    port=6379,
    ssl=True,
    decode_responses=True
)

# ======================
# CACHE-ASIDE (Lazy Loading)
# ======================
def get_user_with_cache_aside(user_id: str) -> dict:
    """
    Cache-Aside: Application manages cache
    
    Pros: Only requested data cached, cache failures don't break app
    Cons: Cache miss = slow, data can be stale
    """
    cache_key = f"user:{user_id}"
    
    # 1. Try cache first
    cached = redis_client.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    
    # 2. Cache miss - query database
    user = db.query_user(user_id)  # Your DB call
    
    # 3. Store in cache with TTL
    redis_client.setex(
        cache_key,
        timedelta(hours=1),  # TTL
        json.dumps(user)
    )
    
    return user

# ======================
# WRITE-THROUGH
# ======================
def update_user_write_through(user_id: str, data: dict) -> dict:
    """
    Write-Through: Write to cache AND database together
    
    Pros: Cache is refreshed on every write, so reads rarely see stale data
    Cons: Write latency (two writes), cache churn for unused data
    """
    cache_key = f"user:{user_id}"
    
    # 1. Write to database
    user = db.update_user(user_id, data)
    
    # 2. Write to cache
    redis_client.setex(
        cache_key,
        timedelta(hours=1),
        json.dumps(user)
    )
    
    return user

# ======================
# WRITE-BEHIND (Write-Back)
# ======================
def update_user_write_behind(user_id: str, data: dict) -> dict:
    """
    Write-Behind: Write to cache, async write to database
    
    Pros: Fast writes, reduced database load
    Cons: Data loss risk, complex implementation
    """
    cache_key = f"user:{user_id}"
    
    # 1. Write to cache immediately
    # Merge the update into the current record (read through the cache-aside helper)
    user = {**get_user_with_cache_aside(user_id), **data}
    redis_client.setex(cache_key, timedelta(hours=1), json.dumps(user))
    
    # 2. Queue database write (async)
    write_queue.send({'user_id': user_id, 'data': data})
    
    return user

# ======================
# CACHE INVALIDATION
# ======================
def invalidate_user_cache(user_id: str):
    """Delete cache entry when data changes."""
    redis_client.delete(f"user:{user_id}")

def invalidate_pattern(pattern: str):
    """Delete all keys matching pattern (use with caution)."""
    cursor = 0
    while True:
        cursor, keys = redis_client.scan(cursor, match=pattern, count=100)
        if keys:
            redis_client.delete(*keys)
        if cursor == 0:
            break
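
The cache-aside docstring lists "cache failures don't break app" as a benefit, but that only holds when cache errors are caught and treated as misses. The sketch below shows one way to do that. The names `get_with_fallback`, `load`, and `FlakyCache` are hypothetical and exist only for illustration, and real code would likely catch `redis.exceptions.RedisError` rather than a bare `Exception`.

```python
import json
from typing import Any, Callable


def get_with_fallback(cache: Any, key: str, load: Callable[[], dict],
                      ttl_seconds: int = 3600) -> dict:
    """Cache-aside read that falls back to the loader if the cache is down."""
    try:
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)
    except Exception:  # assumption: narrow this to redis.exceptions.RedisError
        pass  # treat an unreachable cache as a miss

    value = load()  # hypothetical loader, e.g. a database query

    try:
        cache.setex(key, ttl_seconds, json.dumps(value))
    except Exception:
        pass  # failing to populate the cache must not fail the request
    return value


class FlakyCache:
    """Stand-in for a Redis client whose calls always fail (illustration only)."""

    def get(self, key):
        raise ConnectionError("cache unreachable")

    def setex(self, key, ttl, value):
        raise ConnectionError("cache unreachable")


# Even with the cache down, the caller still gets the record from the loader.
user = get_with_fallback(FlakyCache(), "user:42", lambda: {"id": "42", "name": "Ada"})
```

The trade-off is that a cache outage sends every read to the database, so this pattern is usually paired with alarms on cache errors.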

🎯 Interview Questions

S3 (Object Storage):
  • Static files, backups, data lakes
  • HTTP access from anywhere
  • Unlimited storage, 11 9s durability
EBS (Block Storage):
  • EC2 boot volumes, databases
  • Attached to a single EC2 instance (io1/io2 Multi-Attach excepted), single AZ
  • Low latency (single-digit ms; sub-ms on io2 Block Express), resizable
EFS (File Storage):
  • Shared file systems across EC2
  • Multi-AZ, POSIX compliant
  • Auto-scaling, Linux only
Decision Matrix:
  • Need HTTP access globally → S3
  • Need database storage → EBS
  • Need shared NFS mount → EFS
Access Patterns:
  1. Get user profile
  2. Get user’s posts (sorted by date)
  3. Get post’s comments
  4. Get user’s followers
  5. Get posts by hashtag
Single Table Design:
PK: USER#123, SK: PROFILE       → User profile
PK: USER#123, SK: POST#2024...  → User's posts
PK: POST#456, SK: COMMENT#...   → Post comments
PK: USER#123, SK: FOLLOWER#789  → User's followers

GSI1: hashtag, created_at       → Posts by hashtag
GSI2: post_id, created_at       → Comments on post
Key Points:
  • Denormalize for read performance
  • Use composite sort keys
  • Create GSIs for access patterns
  • Use sparse indexes where appropriate
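
To make the composite sort key idea concrete, here is a small sketch that builds items for the single-table layout above and mimics a Query with a `begins_with` condition. An in-memory list stands in for the table, and the helper names (`profile_item`, `post_item`, `query`) are hypothetical. Against a real table the same lookup would be a DynamoDB Query on `PK` with a `begins_with` condition on `SK`.

```python
from typing import Dict, List


def profile_item(user_id: str, name: str) -> Dict[str, str]:
    """Item holding the user profile: fixed sort key."""
    return {"PK": f"USER#{user_id}", "SK": "PROFILE", "name": name}


def post_item(user_id: str, created_at: str, post_id: str) -> Dict[str, str]:
    """Item for one post. ISO-8601 timestamps sort lexicographically,
    so sort-key order equals chronological order."""
    return {"PK": f"USER#{user_id}", "SK": f"POST#{created_at}#{post_id}"}


def query(table: List[Dict[str, str]], pk: str, sk_prefix: str) -> List[Dict[str, str]]:
    """Mimic a Query: exact match on PK, prefix match on SK, sorted by SK."""
    matches = [i for i in table if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["SK"])


table = [
    profile_item("123", "Ada"),
    post_item("123", "2024-03-01T10:00:00Z", "457"),
    post_item("123", "2024-01-15T08:30:00Z", "456"),
    post_item("999", "2024-02-01T00:00:00Z", "458"),
]

# Access pattern 2: one partition, one prefix, results already ordered by date.
posts = query(table, "USER#123", "POST#")
```

Because profile and posts share a partition key, a single partition serves both access patterns; only the sort-key prefix changes.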
Multi-AZ (High Availability):
  • Purpose: High availability (automatic failover within a region)
  • Synchronous replication
  • Automatic failover (60-120s)
  • Standby NOT readable (Multi-AZ instance deployments)
  • Same region only
Read Replicas (Read Scaling):
  • Purpose: Performance/offload reads
  • Asynchronous replication
  • Manual promotion (becomes primary)
  • Replicas ARE readable
  • Can be cross-region
Best Practice: Use BOTH
  • Multi-AZ for production HA
  • Read replicas for read scaling
Strategies:
  1. TTL-Based (Simplest)
    • Set expiration on cache entries
    • Accept eventual consistency
  2. Event-Driven
    • Publish events on data changes
    • Consumers invalidate cache
  3. Write-Through
    • Update cache on every write
    • Cache refreshed on each write, but slower writes
  4. Cache-Aside with Versioning
    • Include version in cache key
    • Bump version on update
Code Pattern:
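import json

def update_product(product_id, data):
    # `db`, `cache`, `sns` (a boto3 SNS client) and `topic_arn` are placeholders
    # Update database first so the cache never outlives the source of truth
    db.update(product_id, data)
    # Invalidate cache; the next read repopulates it
    cache.delete(f"product:{product_id}")
    # Or publish an event so other consumers invalidate their own caches
    sns.publish(TopicArn=topic_arn, Message=json.dumps({"product_id": product_id}))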
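
Strategy 4 (cache-aside with versioning) is the least obvious of the four, so here is a minimal sketch of it. `VersionedCache` is a hypothetical in-memory stand-in, not a real client; with Redis the version counter would typically be bumped with `INCR`, and orphaned entries would age out through their TTL.

```python
from typing import Dict, Optional


class VersionedCache:
    """In-memory stand-in for a cache that namespaces keys by a version counter."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def _version(self, entity: str) -> int:
        return int(self._store.get(f"{entity}:version", "0"))

    def _key(self, entity: str) -> str:
        # The current version is part of the key, e.g. "product:7:v0"
        return f"{entity}:v{self._version(entity)}"

    def get(self, entity: str) -> Optional[str]:
        return self._store.get(self._key(entity))

    def set(self, entity: str, value: str) -> None:
        self._store[self._key(entity)] = value

    def invalidate(self, entity: str) -> None:
        # Bumping the version orphans the old entry instead of deleting it.
        self._store[f"{entity}:version"] = str(self._version(entity) + 1)


cache = VersionedCache()
cache.set("product:7", '{"price": 10}')
before = cache.get("product:7")   # served from the v0 entry
cache.invalidate("product:7")     # readers now look for the v1 entry
after = cache.get("product:7")    # miss, so the caller reloads from the database
```

The appeal of this design is that invalidation is a single counter update, which avoids scanning for and deleting many related keys.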
Strategy: Tiered Storage with Lifecycle
  1. Hot Data (Recent 30 days): S3 Standard
  2. Warm Data (30-90 days): S3 Standard-IA
  3. Cold Data (90-365 days): S3 Glacier Flexible Retrieval
  4. Archive (1+ year): S3 Glacier Deep Archive
Lifecycle Policy:
{
  "Rules": [{
    "Status": "Enabled",
    "Filter": {"Prefix": ""},
    "Transitions": [
      {"Days": 30, "StorageClass": "STANDARD_IA"},
      {"Days": 90, "StorageClass": "GLACIER"},
      {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
    ],
    "Expiration": {"Days": 2555}
  }]
}
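Cost Estimate (100 TB, us-east-1, storage charges only):
  • All Standard: ~$2,300/month (at roughly $0.023 per GB-month)
  • With lifecycle: ~$200/month (about 91% savings), assuming most of the data has already aged into Glacier and Deep Archive; retrieval and transition request fees are extra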
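
The estimate above depends heavily on how much data sits in each tier. The sketch below reproduces the arithmetic under stated assumptions: the per-GB prices are approximate us-east-1 list prices that should be checked against current pricing, and the `SHARE` split is a hypothetical steady-state distribution, not a figure from this module.

```python
# Assumed us-east-1 storage prices in USD per GB-month; verify before relying on them.
PRICE_PER_GB = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER": 0.0036,
    "DEEP_ARCHIVE": 0.00099,
}

TOTAL_GB = 100_000  # 100 TB, using 1 TB = 1,000 GB for simplicity

# Hypothetical share of data in each tier once the policy has run for years.
SHARE = {"STANDARD": 0.02, "STANDARD_IA": 0.03, "GLACIER": 0.15, "DEEP_ARCHIVE": 0.80}


def monthly_cost(total_gb: float, share: dict) -> float:
    """Sum storage cost across tiers; ignores request and retrieval fees."""
    return sum(total_gb * fraction * PRICE_PER_GB[tier]
               for tier, fraction in share.items())


all_standard = monthly_cost(TOTAL_GB, {"STANDARD": 1.0})
tiered = monthly_cost(TOTAL_GB, SHARE)
savings = 1 - tiered / all_standard

print(f"all Standard: ${all_standard:,.0f}/month")
print(f"tiered:       ${tiered:,.0f}/month ({savings:.0%} savings)")
```

With this split the tiered bill lands near $217 a month, in line with the rounded figure above. Shifting more data into the Standard or Standard-IA tiers raises it quickly.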

🧪 Hands-On Lab: Build a Caching Layer

Objective: Add Redis caching to reduce database load by 80%
1. Create ElastiCache Redis Cluster: use a cache.t3.micro node for testing and enable encryption in transit.
2. Modify Security Groups: allow port 6379 from the application servers.
3. Implement Cache-Aside Pattern: add a caching layer to your application.
4. Add TTL and Invalidation: set appropriate TTLs and invalidate on writes.
5. Monitor with CloudWatch: track cache hit ratio and memory usage.
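
For the monitoring step, the hit ratio can be derived from the ElastiCache `CacheHits` and `CacheMisses` CloudWatch metrics. The helper below is a minimal sketch with made-up sample numbers. `cache_hit_ratio` is a hypothetical name, and fetching the metric values from CloudWatch is left out.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from the cache; 0.0 when there is no traffic."""
    total = hits + misses
    if total == 0:
        return 0.0
    return hits / total


# Hypothetical sample: 8,000 hits and 2,000 misses over some period.
ratio = cache_hit_ratio(hits=8_000, misses=2_000)
print(f"hit ratio: {ratio:.0%}")
```

With cache-aside, each miss triggers one database read, so a read hit ratio of about 80% corresponds roughly to the lab objective of cutting database read load by 80%.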

Storage Comparison Summary

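| Service | Type | Durability | Latency | Best For |
|---|---|---|---|---|
| S3 | Object | 11 9s | ~100ms | Files, backups, static sites |
| EBS | Block | 99.8-99.9% (gp3), 99.999% (io2) | single-digit ms (sub-ms on io2 Block Express) | EC2 volumes, databases |
| EFS | File | 11 9s | ~ms | Shared file systems |
| DynamoDB | NoSQL | 11 9s | sub-10ms | Key-value, high scale |
| RDS/Aurora | SQL | 99.95% (availability SLA) | ~ms | Relational, complex queries |
| ElastiCache | In-memory | N/A | sub-ms | Caching, sessions |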

Next Module

Networking

Master VPC, subnets, security groups, and load balancers