Performance & Optimization

Overview

Performance optimization is about making systems faster and more efficient. The key is measuring first, then optimizing the right things.

Performance Metrics

Latency

Time to complete one request
Measure: p50, p95, p99
Target: < 200ms for web APIs
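
As a rough illustration of what p50, p95, and p99 mean, the sketch below computes nearest-rank percentiles from a list of latency samples. The `percentile` helper and the sample values are made up for this example; in practice a metrics system usually computes these for you.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds
latencies_ms = [12, 15, 18, 20, 22, 25, 30, 45, 120, 900]

summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)  # {'p50': 22, 'p95': 900, 'p99': 900}
```

Note how a single slow request dominates p95 and p99 while leaving the median untouched, which is why averages alone hide tail latency.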

Throughput

Requests per second (RPS)
Transactions per second (TPS)
Target: Depends on scale

Availability

Uptime percentage
99.9% = 8.76 hours downtime/year
99.99% = 52 minutes/year
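
The downtime figures above follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is illustrative only):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

def downtime_hours_per_year(availability_pct):
    """Allowed downtime per year, in hours, for a given availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR

for target in (99.9, 99.99):
    hours = downtime_hours_per_year(target)
    print(f"{target}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```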

Resource Usage

CPU, Memory, Disk, Network
Cost efficiency
Bottleneck identification

Caching Strategies

Cache Levels

┌─────────────────────────────────────────────────────────┐
│                    Client                               │
│  ┌─────────────────────────────────────────────────┐   │
│  │              Browser Cache                      │   │
│  │         (Static assets, API responses)          │   │
│  └─────────────────────────────────────────────────┘   │
└────────────────────────┬────────────────────────────────┘
                         │
┌────────────────────────┼────────────────────────────────┐
│                        ▼                                │
│  ┌─────────────────────────────────────────────────┐   │
│  │                   CDN                           │   │
│  │      (Edge caching, global distribution)        │   │
│  └─────────────────────────────────────────────────┘   │
│                        │                                │
│                        ▼                                │
│  ┌─────────────────────────────────────────────────┐   │
│  │            Application Cache                    │   │
│  │         (Redis, Memcached - in-memory)          │   │
│  └─────────────────────────────────────────────────┘   │
│                        │                                │
│                        ▼                                │
│  ┌─────────────────────────────────────────────────┐   │
│  │               Database                          │   │
│  │      (Query cache, buffer pool)                 │   │
│  └─────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────┘

Caching Patterns

import json

# Cache-Aside (Lazy Loading)
# `redis` and `db` are assumed to be pre-configured client objects.
def get_user(user_id):
    # 1. Check cache first
    cached = redis.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)
    
    # 2. Cache miss - fetch from DB
    user = db.query("SELECT * FROM users WHERE id = ?", user_id)
    
    # 3. Store in cache for next time (expires after 1 hour)
    redis.setex(f"user:{user_id}", 3600, json.dumps(user))
    
    return user

# Write-Through
def update_user(user_id, data):
    # 1. Update database
    db.update("UPDATE users SET ... WHERE id = ?", data, user_id)
    
    # 2. Update cache immediately; `data` must be the full user record,
    #    otherwise readers get a partial object from the cache
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))

# Cache Invalidation
def delete_user(user_id):
    db.delete("DELETE FROM users WHERE id = ?", user_id)
    redis.delete(f"user:{user_id}")

Cache Invalidation Strategies

Strategy	Description	Use When
TTL (Time-to-Live)	Auto-expire after time	Acceptable staleness
Event-based	Invalidate on update	Real-time consistency
Version tags	Change key on update	Immutable objects
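
The version-tag strategy can be sketched as follows. This is only an illustration: a plain dict stands in for Redis, and the names `cache_key`, `read_user`, and `bump_version` are hypothetical. The idea is that an update changes the key, so stale entries are never read again and can simply age out.

```python
cache = {}     # stand-in for a real cache such as Redis
versions = {}  # current version number per user

def cache_key(user_id):
    """Build a key that embeds the user's current version."""
    return f"user:{user_id}:v{versions.get(user_id, 0)}"

def read_user(user_id, load):
    """Return the cached user, calling `load` only on a cache miss."""
    key = cache_key(user_id)
    if key not in cache:
        cache[key] = load(user_id)
    return cache[key]

def bump_version(user_id):
    """Call after an update: later reads use a new key and skip the stale entry."""
    versions[user_id] = versions.get(user_id, 0) + 1
```

Old versions stay in the cache until they expire, so this approach is usually paired with a TTL or an eviction policy.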

Database Optimization

Query Optimization

-- ❌ Slow: Wrapping the column in a function prevents index use (full table scan)
SELECT * FROM orders WHERE YEAR(created_at) = 2024;

-- ✅ Fast: Use index-friendly query
SELECT * FROM orders 
WHERE created_at >= '2024-01-01' 
  AND created_at < '2025-01-01';

-- ❌ Wasteful: SELECT * fetches every column, including ones you never read
SELECT * FROM users WHERE id = 1;

-- ✅ Fast: Select only needed columns
SELECT id, name, email FROM users WHERE id = 1;

-- Explain query plan
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 123;

Indexing Strategy

-- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index (order matters!)
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- This index helps: WHERE user_id = 1 AND status = 'pending'
-- Also helps: WHERE user_id = 1
-- Does NOT help: WHERE status = 'pending'

-- Covering index (includes all needed columns; INCLUDE syntax needs PostgreSQL 11+ or SQL Server)
CREATE INDEX idx_users_covering ON users(email) INCLUDE (name, created_at);
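
The column-order rule for composite indexes can be checked with a small experiment. The sketch below uses SQLite from Python's standard library purely because it needs no server; the `orders` schema and the `plan` helper are assumptions for illustration, and other databases report plans in a different format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT, total REAL)"
)
conn.execute("CREATE INDEX idx_orders_user_status ON orders(user_id, status)")

def plan(where_clause):
    """Return SQLite's query plan text for a SELECT with the given WHERE clause."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM orders WHERE {where_clause}"
    ).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail

for clause in ("user_id = 1 AND status = 'pending'", "user_id = 1", "status = 'pending'"):
    print(f"{clause!r:45} -> {plan(clause)}")
```

The first two queries should report a search using `idx_orders_user_status`, while the query on `status` alone falls back to scanning the table because it skips the leading column.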

Connection Pooling

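# ❌ Bad: New connection per request
import psycopg2

def get_user(user_id):
    conn = psycopg2.connect(...)  # Expensive: new TCP and auth handshake every call
    try:
        with conn.cursor() as cur:  # psycopg2 runs queries through a cursor
            cur.execute(query, (user_id,))
            return cur.fetchone()
    finally:
        conn.close()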

# ✅ Good: Connection pool
from sqlalchemy import create_engine

engine = create_engine(
    DATABASE_URL,
    pool_size=20,
    max_overflow=10,
    pool_timeout=30
)

Scaling Strategies

Vertical vs Horizontal Scaling

Vertical Scaling (Scale Up)     Horizontal Scaling (Scale Out)
┌─────────────────────┐         ┌──────┐ ┌──────┐ ┌──────┐
│                     │         │Server│ │Server│ │Server│
│    Bigger Server    │         │  1   │ │  2   │ │  3   │
│                     │         └──────┘ └──────┘ └──────┘
│  More CPU, RAM      │              │       │       │
│                     │              └───────┼───────┘
└─────────────────────┘                      │
                                    ┌────────┴────────┐
                                    │  Load Balancer  │
                                    └─────────────────┘

Load Balancing Algorithms

Algorithm	Description	Best For
Round Robin	Rotate through servers	Equal server capacity
Least Connections	Send to server with fewest connections	Variable request duration
IP Hash	Same client → same server	Session affinity
Weighted	Distribute based on capacity	Mixed server sizes
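
To make the table concrete, here is a minimal in-process sketch of three of these algorithms. The class and function names are invented for this example; real load balancers (NGINX, HAProxy, cloud load balancers) implement these strategies for you.

```python
import hashlib
import itertools

class RoundRobin:
    """Rotate through servers in a fixed order."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Send each request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        server = min(self.active, key=self.active.get)  # ties go to the first server
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1

def ip_hash_pick(servers, client_ip):
    """Map the same client IP to the same server (session affinity)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

One caveat of the simple modulo approach in `ip_hash_pick`: adding or removing a server remaps most clients, which is why production systems often use consistent hashing instead.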

Database Scaling

                     Read Replicas
                          │
    ┌─────────────────────┼─────────────────────┐
    │                     │                     │
    ▼                     ▼                     ▼
┌────────┐          ┌────────┐           ┌────────┐
│Replica │          │Replica │           │Replica │
│   1    │          │   2    │           │   3    │
└────────┘          └────────┘           └────────┘
    ▲                     ▲                     ▲
    │                     │                     │
    └─────────────────────┴─────────────────────┘
                          │
                    Replication
                          │
                    ┌─────────┐
                    │ Primary │ ◄── All writes
                    │   DB    │
                    └─────────┘
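
In application code, the topology above usually shows up as a routing decision: writes go to the primary, reads are spread across replicas. The following is a simplified sketch with a hypothetical `ReplicaRouter` class; it classifies statements by their first keyword only and ignores replication lag, which a real router has to account for (for example, by reading from the primary right after a write).

```python
import itertools

class ReplicaRouter:
    """Route reads to replicas (round robin) and everything else to the primary."""
    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary  # all writes, and reads when no replica exists

router = ReplicaRouter("primary", ["replica-1", "replica-2", "replica-3"])
print(router.route("SELECT * FROM orders"))          # replica-1
print(router.route("UPDATE orders SET status = 1"))  # primary
```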

Profiling & Benchmarking

Application Profiling

# Python profiling
import cProfile
import pstats

def profile_function(func):
    profiler = cProfile.Profile()
    profiler.enable()
    
    result = func()
    
    profiler.disable()
    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative')
    stats.print_stats(10)  # Top 10 slowest
    
    return result

# Memory profiling
from memory_profiler import profile

@profile
def memory_intensive_function():
    large_list = [i for i in range(1000000)]
    return sum(large_list)

Load Testing

# Using locust for load testing
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 5)
    
    @task(3)
    def view_homepage(self):
        self.client.get("/")
    
    @task(1)
    def view_product(self):
        self.client.get("/products/1")
    
    @task(1)
    def create_order(self):
        self.client.post("/orders", json={
            "product_id": 1,
            "quantity": 2
        })

N+1 Query Problem

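The N+1 query problem is one of the most common performance issues in applications that use an ORM: one query loads a list, then each item in the list triggers another query.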

# ❌ N+1 Problem: 1 query for orders + N queries for users
orders = Order.objects.all()  # 1 query
for order in orders:
    print(order.user.name)    # N queries (one per order)

# ✅ Eager Loading: 1 query total (SQL JOIN)
orders = Order.objects.select_related('user').all()
for order in orders:
    print(order.user.name)    # No additional queries

# For many-to-many and reverse foreign-key relationships: 2 queries total
orders = Order.objects.prefetch_related('items').all()
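
The same effect can be reproduced without an ORM. The sketch below uses an in-memory SQLite database and a hypothetical `run` helper that counts executed queries, so the difference between the N+1 pattern and a single JOIN is visible as a number. The schema and data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
    INSERT INTO users VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
""")

counter = {"queries": 0}

def run(sql, params=()):
    """Execute a query and count it."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

def names_n_plus_one():
    # 1 query for the orders, then 1 query per order for its user
    names = []
    for (user_id,) in run("SELECT user_id FROM orders ORDER BY id"):
        names.append(run("SELECT name FROM users WHERE id = ?", (user_id,))[0][0])
    return names

def names_joined():
    # A single JOIN returns the same data in one round trip
    rows = run("SELECT u.name FROM orders o JOIN users u ON u.id = o.user_id ORDER BY o.id")
    return [name for (name,) in rows]
```

With three orders, `names_n_plus_one()` issues four queries and `names_joined()` issues one; the gap grows linearly with the number of rows.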

Async Processing

Background Jobs with Celery

from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379')

@app.task
def send_email(user_id, template):
    user = get_user(user_id)
    email_service.send(user.email, template)

@app.task
def generate_report(report_id):
    data = fetch_large_dataset()
    report = process_data(data)
    save_report(report_id, report)

# Usage - non-blocking
@api.post("/orders")
def create_order(order_data):
    order = save_order(order_data)
    
    # Queue async tasks instead of blocking
    send_email.delay(order.user_id, "order_confirmation")
    generate_report.delay(order.id)
    
    return {"order_id": order.id}  # Return immediately

Event-Driven with Message Queues

import json

import aio_pika

async def publish_event(event_type: str, data: dict):
    connection = await aio_pika.connect_robust("amqp://localhost/")
    async with connection:
        channel = await connection.channel()
        await channel.default_exchange.publish(
            aio_pika.Message(body=json.dumps(data).encode()),
            routing_key=event_type,
        )

async def consume_events(queue_name: str, handler):
    connection = await aio_pika.connect_robust("amqp://localhost/")
    async with connection:
        channel = await connection.channel()
        queue = await channel.declare_queue(queue_name)
        async for message in queue:
            async with message.process():
                await handler(json.loads(message.body))

Frontend Performance

Critical Rendering Path

HTML ──► DOM ──┐
              ├──► Render Tree ──► Layout ──► Paint
CSS ──► CSSOM ─┘

Optimization Techniques

<!-- Defer non-critical JavaScript -->
<script src="analytics.js" defer></script>

<!-- Preload critical resources -->
<link rel="preload" href="critical.css" as="style">
<link rel="preload" href="hero-image.webp" as="image">

<!-- Lazy load images (native lazy loading needs the real URL in src) -->
<img src="actual-image.jpg" alt="Description" loading="lazy">

<!-- Use modern image formats -->
<picture>
  <source srcset="image.avif" type="image/avif">
  <source srcset="image.webp" type="image/webp">
  <img src="image.jpg" alt="Description">
</picture>

Bundle Optimization

// Code splitting with dynamic imports (React)
import { lazy } from 'react';

const HeavyComponent = lazy(() => import('./HeavyComponent'));

// Tree shaking - import only what you need
import { debounce } from 'lodash-es';  // ✅ Tree-shakeable
import _ from 'lodash';                 // ❌ Imports everything

Database Performance

Query Optimization Checklist

Issue	Symptom	Solution
Missing index	Slow queries, full table scans	Add appropriate index
Too many indexes	Slow writes	Remove unused indexes
N+1 queries	Many similar queries	Use eager loading/JOINs
SELECT *	Fetching unused data	Select only needed columns
Large result sets	High memory usage	Pagination, streaming
Lock contention	Timeouts, deadlocks	Reduce transaction scope

Slow Query Analysis

-- PostgreSQL: Enable slow query log
ALTER SYSTEM SET log_min_duration_statement = 1000;  -- Log queries > 1s
SELECT pg_reload_conf();  -- Apply the setting without a restart

-- Analyze query execution plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders 
WHERE user_id = 123 
ORDER BY created_at DESC 
LIMIT 10;

-- Look for:
-- - Seq Scan on large tables (consider an index)
-- - High actual time
-- - Large "Rows Removed by Filter" counts
-- - Sort operations on large datasets

Connection Pool Sizing

# Rule of thumb: connections = (core_count * 2) + effective_spindle_count
# For SSD: connections ≈ (cores * 2) + 1

from sqlalchemy import create_engine

engine = create_engine(
    DATABASE_URL,
    pool_size=20,           # Base pool size
    max_overflow=10,        # Extra connections allowed
    pool_timeout=30,        # Wait time for connection
    pool_recycle=1800,      # Recycle connections after 30 min
    pool_pre_ping=True,     # Test connection before use
)
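
The rule of thumb in the comment above is easy to turn into a helper. This is a sketch only: `recommended_pool_size` is a hypothetical name, `core_count` refers to the database server's cores (not the application host), and the result is a starting point to validate under load rather than a fixed answer.

```python
def recommended_pool_size(core_count, effective_spindle_count=1):
    """Starting-point pool size: (cores * 2) + effective spindle count.

    For SSD-backed databases the spindle count is commonly taken as 1.
    """
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    return core_count * 2 + effective_spindle_count

print(recommended_pool_size(8))     # 17 for an 8-core, SSD-backed server
print(recommended_pool_size(4, 2))  # 10 for 4 cores and 2 spindles
```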

Application-Level Optimization

Async I/O

import asyncio

import aiohttp
import requests

# ❌ Sequential - slow
def fetch_all_sequential(urls):
    results = []
    for url in urls:
        response = requests.get(url)
        results.append(response.json())
    return results  # Takes N * avg_response_time

# ✅ Concurrent - fast
async def fetch_all_concurrent(urls):
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        return await asyncio.gather(*tasks)  # Takes max(response_times)

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.json()
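
The timing claim (total time close to the slowest request rather than the sum) can be checked without a network. In this sketch `asyncio.sleep` stands in for an HTTP call, and `fake_request` and the delay values are made up for the demonstration.

```python
import asyncio
import time

async def fake_request(delay):
    """Simulate an I/O-bound call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return delay

async def run_concurrently(delays):
    # gather() starts all coroutines and returns results in input order
    return await asyncio.gather(*(fake_request(d) for d in delays))

delays = [0.1, 0.2, 0.3]
start = time.perf_counter()
results = asyncio.run(run_concurrently(delays))
elapsed = time.perf_counter() - start

print(f"results={results}, elapsed={elapsed:.2f}s, sequential would be {sum(delays):.1f}s")
```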

Efficient Data Structures

# Use appropriate data structures
from collections import deque, defaultdict
from functools import lru_cache

# O(1) append/pop from both ends
queue = deque(maxlen=1000)  # Bounded queue: oldest items drop off when full

# Avoid repeated dictionary key checks
items = ["a", "b", "a"]  # Example input
counts = defaultdict(int)
for item in items:
    counts[item] += 1  # No KeyError

# Memoization for expensive computations
@lru_cache(maxsize=1000)
def fibonacci(n):
    # Each distinct n is computed once; repeat calls are served from the cache
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

Performance Testing Strategy

Types of Tests

Test Type	Purpose	Tools
Load Testing	Normal expected load	Locust, k6, JMeter
Stress Testing	Beyond normal capacity	Same tools, higher load
Spike Testing	Sudden traffic spikes	Simulate flash sales
Soak Testing	Extended period (memory leaks)	Run for hours/days
Breakpoint Testing	Find system limits	Gradually increase load

Key Metrics to Track

# Response time percentiles
p50 = 100ms   # Median - half of requests are faster than this
p95 = 500ms   # 95% of requests faster than this
p99 = 1000ms  # Tail latency - the slowest 1% of requests exceed this

# Throughput
rps = 10000   # Requests per second

# Error rate
error_rate = errors / total_requests  # Should be < 1%

# Saturation
cpu_usage = 70%      # Alert at 80%
memory_usage = 60%   # Alert at 75%

Optimization Checklist

Measure First

Never optimize without data. Profile your application and identify bottlenecks with APM and tracing tools (New Relic, Datadog, Jaeger).

Cache Aggressively

Add caching at every level - browser, CDN, application, database. Use cache-aside pattern with appropriate TTLs.

Optimize Queries

Use EXPLAIN ANALYZE, add proper indexes, avoid N+1 queries, use connection pooling.

Use Async

Don’t block on I/O. Use async/await, message queues, background jobs for long-running tasks.

Optimize Frontend

Minimize bundle size, lazy load, use CDN, optimize images, implement proper caching headers.

Scale Appropriately

Start with vertical scaling (simpler), move to horizontal when needed. Use auto-scaling.

Quick Reference

Latency Targets

Operation	Good	Acceptable	Poor
Page load	< 1s	< 3s	> 5s
API response	< 100ms	< 500ms	> 1s
Database query	< 10ms	< 100ms	> 500ms
Cache lookup	< 1ms	< 10ms	> 50ms

Capacity Planning Formula

Required Capacity = (Peak RPS × Avg Response Time) / Utilization Target

Example:
- Peak: 10,000 RPS
- Avg response: 100ms
- Target utilization: 70%

Capacity = (10000 × 0.1) / 0.7 = 1,429 concurrent connections needed
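
The formula is an application of Little's Law (concurrency = arrival rate × time in system) with headroom added by the utilization target. A small sketch that reproduces the example, with a hypothetical function name:

```python
import math

def required_capacity(peak_rps, avg_response_seconds, utilization_target):
    """Concurrent requests the system must support at the target utilization."""
    if not 0 < utilization_target <= 1:
        raise ValueError("utilization_target must be in (0, 1]")
    in_flight = peak_rps * avg_response_seconds  # Little's Law
    return in_flight / utilization_target

capacity = required_capacity(peak_rps=10_000, avg_response_seconds=0.1, utilization_target=0.7)
print(math.ceil(capacity))  # 1429
```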

Remember: “Premature optimization is the root of all evil” - Donald Knuth. Focus on clean code first, then optimize the actual bottlenecks. Always measure before and after optimization.

Common Mistake: Optimizing based on assumptions. Always profile first, optimize the actual hot paths, and verify improvements with benchmarks.
