Module 13: Load Balancing & Proxies

As applications scale, single servers can’t handle the load. Load balancers distribute traffic across multiple servers, while proxies act as intermediaries for various purposes. This module covers both in depth.

Think of a load balancer like a host at a busy restaurant. When customers (requests) arrive, the host does not send everyone to the same waiter (server). They distribute customers across all available waiters based on who is least busy, who can handle the largest parties, or simply in rotation. Without the host, all customers would crowd around one waiter while the others stood idle.
Estimated Time: 3-4 hours
Difficulty: Intermediate
Prerequisites: Modules 4-6 (Network fundamentals)

13.1 Why Load Balancing?

The Single Server Problem

                    100,000 users


                  ┌──────────────┐
                  │    Server    │  ← Overwhelmed!
                  │  (1 CPU,     │     - High latency
                  │   8GB RAM)   │     - Requests timeout
                  └──────────────┘     - Single point of failure

The Solution: Distribute Load

                    100,000 users


                  ┌──────────────┐
                  │    Load      │
                  │   Balancer   │
                  └──────┬───────┘
            ┌────────────┼────────────┐
            │            │            │
            ▼            ▼            ▼
      ┌──────────┐ ┌──────────┐ ┌──────────┐
      │ Server 1 │ │ Server 2 │ │ Server 3 │
      │ ~33,333  │ │ ~33,333  │ │ ~33,333  │
      │  users   │ │  users   │ │  users   │
      └──────────┘ └──────────┘ └──────────┘

13.2 Load Balancing Algorithms

1. Round Robin

Requests are distributed sequentially.
Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A  ← Cycle repeats
Request 5 → Server B
...
Pros: Simple, fair distribution
Cons: Doesn’t consider server load or capacity
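The rotation above can be sketched in a few lines of Python. This is only an illustration: the class name RoundRobinBalancer and the method next_server are made up for this example and do not come from any particular load balancer.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring their current load."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the list forever: A, B, C, A, B, C, ...
        self._rotation = cycle(list(servers))

    def next_server(self):
        return next(self._rotation)


balancer = RoundRobinBalancer(["Server A", "Server B", "Server C"])
picks = [balancer.next_server() for _ in range(5)]
print(picks)
# ['Server A', 'Server B', 'Server C', 'Server A', 'Server B']
```

Note that the balancer never looks at how busy a server is, which is exactly the drawback listed above.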

2. Weighted Round Robin

Servers with higher capacity get more requests.
Server A (weight 3): Gets 3 out of 6 requests
Server B (weight 2): Gets 2 out of 6 requests
Server C (weight 1): Gets 1 out of 6 requests

Sequence: A, A, A, B, B, C, A, A, A, B, B, C...
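A minimal sketch of how the sequence above could be produced, assuming the simplest possible approach of repeating each server according to its weight. The function name weighted_rotation is hypothetical. Real balancers often interleave the picks more evenly to avoid sending bursts to the heaviest server, which this sketch does not attempt.

```python
from itertools import cycle


def weighted_rotation(weights):
    """Expand {server: weight} into one rotation, e.g. A, A, A, B, B, C."""
    rotation = []
    for server, weight in weights.items():
        if weight < 1:
            raise ValueError(f"weight for {server} must be at least 1")
        rotation.extend([server] * weight)
    return rotation


rotation = weighted_rotation({"A": 3, "B": 2, "C": 1})
schedule = cycle(rotation)
first_eight = [next(schedule) for _ in range(8)]
print(first_eight)
# ['A', 'A', 'A', 'B', 'B', 'C', 'A', 'A']
```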

3. Least Connections

Send to server with fewest active connections.
Server A: 150 connections
Server B: 50 connections  ← Next request goes here
Server C: 200 connections
Best for: Long-lived connections (WebSockets, database connections)
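As a rough sketch, the selection step is just a minimum over the connection counts. The function name pick_least_connections is an assumption for this example. A real balancer also has to decrement the count when a connection closes, which is the tracking overhead this algorithm carries.

```python
def pick_least_connections(active):
    """Return the server with the fewest active connections."""
    if not active:
        raise ValueError("no servers available")
    return min(active, key=active.get)


active = {"Server A": 150, "Server B": 50, "Server C": 200}
target = pick_least_connections(active)
active[target] += 1  # the balancer counts the connection it just assigned
print(target, active[target])
# Server B 51
```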

4. Least Response Time

Send to server with fastest response time + fewest connections.
Server A: 50ms avg, 100 connections
Server B: 30ms avg, 120 connections  ← Faster, preferred
Server C: 80ms avg, 80 connections
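How response time and connection count are combined varies between products. The sketch below assumes one simple scoring rule, average response time multiplied by (active connections + 1), purely for illustration. The names ServerStats, load_score, and pick_least_response_time are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerStats:
    name: str
    avg_response_ms: float
    active_connections: int


def load_score(stats):
    """Lower is better. The formula is an assumption, not a standard."""
    return stats.avg_response_ms * (stats.active_connections + 1)


def pick_least_response_time(servers):
    return min(servers, key=load_score)


servers = [
    ServerStats("Server A", 50, 100),  # score 50 * 101 = 5050
    ServerStats("Server B", 30, 120),  # score 30 * 121 = 3630
    ServerStats("Server C", 80, 80),   # score 80 * 81  = 6480
]
chosen = pick_least_response_time(servers)
print(chosen.name)
# Server B
```

With this rule, Server B wins despite having the most connections, because its lower latency outweighs the extra load.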

5. IP Hash

Same client IP always goes to same server.
hash(client_ip) % num_servers = target_server

IP 192.168.1.10 → hash → always Server B
IP 192.168.1.11 → hash → always Server A
Use Case: Session persistence without cookies (a simple form of sticky sessions)
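The formula above can be sketched as follows. This example assumes SHA-256 as the hash so that the mapping stays the same across processes and restarts (Python's built-in hash() is randomized per process for strings). The function name pick_by_ip is hypothetical.

```python
import hashlib


def pick_by_ip(client_ip, servers):
    """Map a client IP to a server: hash(client_ip) % num_servers."""
    if not servers:
        raise ValueError("no servers available")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["Server A", "Server B", "Server C"]
first = pick_by_ip("192.168.1.10", servers)
again = pick_by_ip("192.168.1.10", servers)
print(first == again)
# True
```

One consequence of the modulo step: adding or removing a server changes num_servers, so many clients get remapped to a different server.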

6. Least Bandwidth

Route to server currently serving least traffic (Mbps).

Algorithm Comparison

| Algorithm | Best For | Drawback |
|---|---|---|
| Round Robin | Simple, equal servers | Ignores load |
| Weighted RR | Mixed capacity | Static weights |
| Least Connections | Long connections | Overhead tracking |
| Least Response Time | Performance-critical | Needs health checks |
| IP Hash | Session stickiness | Uneven distribution |

13.3 Layer 4 vs Layer 7 Load Balancing

Layer 4 (Transport Layer)

Operates on TCP/UDP level. Sees IP addresses and ports only. It does not terminate the TCP connection or inspect the payload — it simply forwards packets based on IP and port information.
Client ──TCP──→ Load Balancer ──TCP──→ Server

         Sees: Source IP, Dest IP
               Source Port, Dest Port
         Cannot see: HTTP headers, URLs, cookies
Pros: Very fast (no payload inspection), simple, handles any TCP/UDP protocol (not just HTTP)
Cons: Limited routing options, no content-based decisions, cannot do SSL termination
Analogy: A Layer 4 LB is like a highway toll booth that reads license plates and directs cars to different lanes, but never looks inside the car.

Layer 7 (Application Layer)

Operates on HTTP/HTTPS level. The load balancer terminates the client’s TCP connection, reads the full HTTP request, makes a routing decision, and then opens a new TCP connection to the backend server.
Client ──HTTP──→ Load Balancer ──HTTP──→ Server

          Sees: Everything!
                - URL path (/api/users)
                - Headers (Host, Cookie, Auth)
                - HTTP method (GET, POST)
                - Query parameters
                - Request body
Pros: Smart routing (route /api/* to API servers, /static/* to CDN), SSL termination, caching, compression, header injection (X-Forwarded-For), A/B testing
Cons: More processing overhead, higher latency (must parse HTTP), only works for HTTP/HTTPS
Analogy: A Layer 7 LB is like a hotel concierge who reads your request, understands what you need, and directs you to the right department. They actually open and read the letter.
Which should you use? Default to Layer 7 for web applications — the routing flexibility and SSL termination are worth the small overhead. Use Layer 4 for non-HTTP protocols (databases, game servers, custom TCP services) or when you need maximum throughput and do not need content-based routing.

When to Use Which

| Scenario | Layer |
|---|---|
| Simple TCP load distribution | L4 |
| Route by URL path | L7 |
| SSL termination | L7 |
| Gaming servers, databases | L4 |
| Microservices routing | L7 |
| A/B testing | L7 |

13.4 Proxy Types

Forward Proxy

Sits between clients and the internet. Clients know about it.
┌─────────────────────────────────────────────────────┐
│                    Corporate Network                │
│                                                     │
│  ┌────────┐                                        │
│  │Client A│──┐                                     │
│  └────────┘  │     ┌──────────┐                   │
│              ├────►│  Forward │                   │
│  ┌────────┐  │     │  Proxy   │────► Internet    │
│  │Client B│──┘     └──────────┘                   │
│  └────────┘                                        │
└─────────────────────────────────────────────────────┘
Use Cases:
  • Content filtering (block certain sites)
  • Caching (reduce bandwidth)
  • Anonymity (hide client IPs)
  • Access control (authentication)

Reverse Proxy

Sits between internet and servers. Clients don’t know about it.
                        ┌─────────────────────────────┐
                        │       Data Center           │
                        │                             │
Internet ────►  ┌───────────────┐   ┌────────────┐   │
                │    Reverse    │──►│  Server A  │   │
                │     Proxy     │   └────────────┘   │
                │               │   ┌────────────┐   │
                │  (nginx, etc) │──►│  Server B  │   │
                └───────────────┘   └────────────┘   │
                        │                             │
                        └─────────────────────────────┘
Use Cases:
  • Load balancing
  • SSL termination
  • Caching static content
  • DDoS protection
  • Compression
  • URL rewriting

13.5 SSL/TLS Termination

The Problem

Encryption is CPU-intensive. Without termination at the load balancer, every backend server has to handle SSL itself:
Client ──HTTPS──► LB ──HTTPS──► Server 1 (decrypts)
                     ──HTTPS──► Server 2 (decrypts)
                     ──HTTPS──► Server 3 (decrypts)

Each server needs:
- SSL certificates
- CPU for encryption/decryption

The Solution: SSL Termination

Load balancer handles all SSL:
Client ──HTTPS──► Load Balancer ──HTTP──► Server 1
         (encrypted)  (decrypts)   (plain)   Server 2
                                            Server 3
Benefits:
  • Servers freed from SSL overhead
  • Single place to manage certificates
  • Easier SSL certificate renewal

SSL Passthrough vs Termination vs Re-encryption

PASSTHROUGH (L4):
Client ──HTTPS──► LB ──HTTPS──► Server
                  (just forwards, doesn't decrypt)

TERMINATION:
Client ──HTTPS──► LB ──HTTP──► Server
                  (decrypts, internal is plain)

RE-ENCRYPTION:
Client ──HTTPS──► LB ──HTTPS──► Server
                  (decrypts, re-encrypts for backend)
When to use which: Passthrough is for when compliance requires end-to-end encryption and the LB must not see plaintext — but you lose L7 routing. Termination is the most common choice: it lets the LB do smart routing and offloads CPU from backends, and internal traffic is usually on a trusted private network. Re-encryption is the compromise: the LB can still inspect and route traffic, but internal traffic is also encrypted — required by some compliance regimes (PCI DSS, HIPAA) even within private networks.
Troubleshooting load balancer issues: If your LB returns 502 (Bad Gateway), the LB itself is up but could not get a valid response from the backend: check backend health, security groups, and whether the backend is listening on the expected port. If it returns 503 (Service Unavailable), it usually means no backend is currently passing health checks. If you see uneven traffic distribution, verify your health check configuration, because a backend that is slow but technically “healthy” will keep receiving traffic. Use curl -H "Host: your-domain.com" http://LB-IP to test directly against the LB, bypassing DNS.

13.6 Session Persistence (Sticky Sessions)

Some applications need the same user to always reach the same server.

Why Needed?

Request 1: Login to Server A (session created in memory)
Request 2: Goes to Server B (no session! User logged out!)

Sticky Session Methods

1. Cookie-based:
   Load balancer sets cookie: SERVERID=server-a
   All requests with this cookie → Server A
2. IP Hash:
   Same IP always routes to same server
   (breaks with shared IPs, NAT)
3. Application Cookie:
   App sets session cookie
   LB reads it and routes accordingly
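The cookie-based method can be sketched like this. It is a simplified model, not how any specific product implements it: the class StickyBalancer and its route method are hypothetical, cookies are passed in as a plain dict, and new clients are assigned by round robin.

```python
from itertools import cycle


class StickyBalancer:
    COOKIE = "SERVERID"

    def __init__(self, servers):
        self._servers = set(servers)
        self._rotation = cycle(sorted(self._servers))

    def route(self, cookies):
        """Return (server, cookies_to_set) for one request."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self._servers:
            return pinned, {}  # known client: honor the existing cookie
        # New client, or cookie points at a server that no longer exists
        server = next(self._rotation)
        return server, {self.COOKIE: server}


lb = StickyBalancer(["server-a", "server-b", "server-c"])
first = lb.route({})                        # no cookie yet
repeat = lb.route({"SERVERID": "server-a"})  # browser sends the cookie back
stale = lb.route({"SERVERID": "server-z"})   # server-z is not in the pool
print(first, repeat, stale)
```

The stale-cookie branch matters in practice: when a pinned server is removed, the client has to be reassigned, and any session held only in that server's memory is gone.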

Better Alternative: Shared Session Store

┌───────────────────────────────────────────────────┐
│                  Load Balancer                    │
│                (no sticky sessions)               │
└───────────────────────┬───────────────────────────┘
          ┌─────────────┼─────────────┐
          │             │             │
          ▼             ▼             ▼
     ┌─────────┐   ┌─────────┐   ┌─────────┐
     │Server A │   │Server B │   │Server C │
     └────┬────┘   └────┬────┘   └────┬────┘
          │             │             │
          └─────────────┼─────────────┘

                 ┌─────────────┐
                 │    Redis    │
                 │  (sessions) │
                 └─────────────┘
All servers share sessions — any server can handle any request. This is the preferred pattern because it makes your backend truly stateless and horizontally scalable. If a server dies, no sessions are lost.
Best practice: Design your applications to be stateless from the start. Store sessions in Redis, DynamoDB, or a similar external store. This eliminates the need for sticky sessions entirely and makes scaling, deployment, and failover dramatically simpler. Sticky sessions are a crutch for stateful applications — they work, but they limit your ability to scale and create uneven load distribution.
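To make the shared-store idea concrete, here is a small sketch in which two application servers use the same session store. The SessionStore class is an in-memory stand-in for an external store such as Redis, and all names here (SessionStore, AppServer, login, whoami) are hypothetical.

```python
import secrets


class SessionStore:
    """In-memory stand-in for an external store such as Redis."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, session):
        self._data[session_id] = dict(session)

    def load(self, session_id):
        return self._data.get(session_id)


class AppServer:
    """Keeps no session state of its own; everything lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def login(self, user):
        session_id = secrets.token_hex(16)
        self._store.save(session_id, {"user": user})
        return session_id

    def whoami(self, session_id):
        session = self._store.load(session_id)
        return session["user"] if session else None


store = SessionStore()
server_a = AppServer("Server A", store)
server_b = AppServer("Server B", store)

session_id = server_a.login("alice")   # Request 1 lands on Server A
print(server_b.whoami(session_id))     # Request 2 lands on Server B
# alice
```

Because neither server holds the session itself, the load balancer is free to send each request anywhere.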

13.7 Health Checks

Load balancers must know which servers are healthy.

Passive Health Checks

Monitor real traffic for failures:
Server A: 5 consecutive 500 errors → mark unhealthy
Server A: 10 successful responses → mark healthy again
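The thresholds above amount to a small state machine that counts consecutive outcomes. A minimal sketch, assuming any 5xx status counts as a failure. The class name PassiveHealth and its record method are hypothetical.

```python
class PassiveHealth:
    """Tracks one server's health from the responses it returns."""

    def __init__(self, fail_threshold=5, pass_threshold=10):
        self.fail_threshold = fail_threshold
        self.pass_threshold = pass_threshold
        self.healthy = True
        self._failures = 0   # consecutive failures
        self._successes = 0  # consecutive successes

    def record(self, status_code):
        """Record one response and return the current health state."""
        if status_code >= 500:
            self._failures += 1
            self._successes = 0
            if self._failures >= self.fail_threshold:
                self.healthy = False
        else:
            self._successes += 1
            self._failures = 0
            if not self.healthy and self._successes >= self.pass_threshold:
                self.healthy = True
        return self.healthy


server_a = PassiveHealth()
after_four = [server_a.record(500) for _ in range(4)][-1]
after_five = server_a.record(500)
after_nine_ok = [server_a.record(200) for _ in range(9)][-1]
after_ten_ok = server_a.record(200)
print(after_four, after_five, after_nine_ok, after_ten_ok)
# True False False True
```

A server marked unhealthy stops receiving normal traffic, so the successful responses needed for recovery typically have to come from a trickle of test requests or from active probes.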

Active Health Checks

Periodically probe servers:
Every 5 seconds:
  GET /health → 200 OK? Server is healthy
  GET /health → timeout/500? Server is unhealthy

Health Check Configuration

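# Nginx example (the health_check directive requires NGINX Plus)
upstream backend {
    zone backend 64k;          # shared memory zone, required for active health checks
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        # Probe /health every 5s; 3 failures mark a server down, 2 passes bring it back
        health_check uri=/health interval=5s fails=3 passes=2;
    }
}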

Health Check Endpoints

Your application should expose health endpoints. Think of these like a doctor’s checkup — each endpoint tests something different:
/health        -> Simple "I'm alive" (heartbeat -- am I running at all?)
/health/ready  -> "I'm ready to serve traffic" (can I handle requests? DB connected? Cache warm?)
/health/live   -> "I'm running (not deadlocked)" (is my process responsive, even if dependencies are down?)
The distinction between liveness and readiness matters in Kubernetes especially. A pod that fails the liveness check gets restarted (the process is stuck). A pod that fails the readiness check stops receiving traffic but is not restarted (it might be warming up or waiting for a dependency). Confusing these two causes either unnecessary restarts or traffic being sent to pods that cannot handle it.
// /health response
{
  "status": "healthy",
  "checks": {
    "database": "connected",
    "cache": "connected",
    "disk": "ok"
  },
  "version": "1.2.3"
}
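One way an application might assemble a response like the one above is to run a set of named checks and report the aggregate. This is a sketch under assumptions: the function health_report is hypothetical, each check is a callable that returns True or raises, and the choice of 503 for a failing report is a common convention rather than a requirement.

```python
import json


def health_report(checks, version):
    """Run each check and return (http_status, response_body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception:
            # A check that raises counts as failed; it must not crash the endpoint
            results[name] = "failed"
    healthy = all(state == "ok" for state in results.values())
    body = {
        "status": "healthy" if healthy else "unhealthy",
        "checks": results,
        "version": version,
    }
    return (200 if healthy else 503), body


def failing_cache_check():
    raise ConnectionError("cache unreachable")


status_code, body = health_report(
    {"database": lambda: True, "cache": failing_cache_check},
    version="1.2.3",
)
print(status_code, json.dumps(body))
```

Returning a non-200 status matters because most load balancers decide health from the status code alone and never parse the body.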

13.8 Common Load Balancer Products

Cloud Load Balancers

| Provider | L4 | L7 | Global |
|---|---|---|---|
| AWS | NLB | ALB | Global Accelerator |
| GCP | Network LB | HTTP(S) LB | Cloud CDN |
| Azure | Load Balancer | Application Gateway | Front Door |

Software Load Balancers

| Product | Type | Use Case |
|---|---|---|
| Nginx | L7 | Web apps, reverse proxy |
| HAProxy | L4/L7 | High-performance, TCP/HTTP |
| Traefik | L7 | Kubernetes, microservices |
| Envoy | L7 | Service mesh, cloud-native |

Hardware Load Balancers

| Vendor | Products |
|---|---|
| F5 | BIG-IP |
| Citrix | ADC/NetScaler |
| A10 | Thunder |

13.9 Load Balancing Patterns

Blue-Green Deployment

Active (Blue):  100% traffic → v1.0
Standby (Green): 0% traffic → v2.0 (testing)

Switch:
Blue:   0% traffic
Green: 100% traffic → v2.0 (now live)

Canary Deployment

Main:   95% traffic → v1.0
Canary:  5% traffic → v2.0 (testing with real users)

Gradually increase canary:
Main:   90% → 50% → 10% → 0%
Canary: 10% → 50% → 90% → 100%
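A percentage split like this is often done by hashing a stable identifier into a bucket, so that a given user keeps seeing the same version as the canary share grows. The sketch below assumes that approach. The functions bucket and pick_version, and the use of a user ID as the key, are illustrative choices.

```python
import hashlib


def bucket(user_id):
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def pick_version(user_id, canary_percent):
    """Users whose bucket falls below canary_percent get the canary."""
    return "v2.0" if bucket(user_id) < canary_percent else "v1.0"


users = [f"user-{n}" for n in range(1000)]
for percent in (5, 10, 50, 100):
    on_canary = sum(pick_version(u, percent) == "v2.0" for u in users)
    print(f"{percent}% target -> {on_canary} of {len(users)} users on v2.0")
```

Because the bucket depends only on the user ID, raising canary_percent only ever moves users from v1.0 to v2.0 and never back.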

A/B Testing

Route based on user attributes:
User in test group A → Server with feature X
User in test group B → Server without feature X

Analyze which performs better

13.10 Reverse Proxy Use Cases

URL Rewriting

External:                    Internal:
/api/users      ──────►      /v2/users-service/list
/api/products   ──────►      /product-catalog/all

Path-Based Routing

# Nginx config
location /api/ {
    proxy_pass http://api-servers;
}

location /static/ {
    proxy_pass http://cdn-servers;
}

location / {
    proxy_pass http://web-servers;
}

Header-Based Routing

Host: api.example.com    → API servers
Host: www.example.com    → Web servers
Host: admin.example.com  → Admin servers
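The routing decisions in the last two subsections can be combined into one lookup: match on the Host header first, then fall back to the URL path. A sketch under assumptions: the function choose_pool and the pool name admin-servers are hypothetical, while the other pool names mirror the Nginx config shown earlier.

```python
HOST_POOLS = {
    "api.example.com": "api-servers",
    "admin.example.com": "admin-servers",
}
PATH_POOLS = [
    ("/api/", "api-servers"),
    ("/static/", "cdn-servers"),
]
DEFAULT_POOL = "web-servers"


def choose_pool(host, path):
    """Pick an upstream pool from the Host header, then the URL path."""
    host = host.lower().split(":")[0]  # ignore case and any :port suffix
    if host in HOST_POOLS:
        return HOST_POOLS[host]
    for prefix, pool in PATH_POOLS:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(choose_pool("api.example.com", "/users"))           # api-servers
print(choose_pool("www.example.com", "/static/app.css"))  # cdn-servers
print(choose_pool("www.example.com", "/"))                # web-servers
```

Only a Layer 7 load balancer can make this decision, since both the Host header and the path live inside the HTTP request.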

Caching

┌────────┐      ┌───────────────┐      ┌────────┐
│ Client │ ──►  │ Reverse Proxy │ ──►  │ Server │
└────────┘      │   (cached)    │      └────────┘
                └───────────────┘
                
First request: Miss, fetch from server, cache
Second request: Hit, serve from cache (no server hit)
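The miss-then-hit behavior can be sketched with a small time-based cache in front of an origin function. This is a simplified model: the class CachingProxy, the fixed TTL, and the origin function are assumptions for illustration. Real proxies also honor response headers such as Cache-Control, which this sketch ignores.

```python
import time


class CachingProxy:
    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # function that asks the origin server
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}       # path -> (expires_at, body)

    def get(self, path):
        """Return (body, 'HIT' or 'MISS')."""
        now = self._clock()
        entry = self._entries.get(path)
        if entry and entry[0] > now:
            return entry[1], "HIT"
        body = self._fetch(path)  # miss or expired: go to the origin
        self._entries[path] = (now + self._ttl, body)
        return body, "MISS"


origin_calls = []


def origin(path):
    origin_calls.append(path)
    return f"body of {path}"


proxy = CachingProxy(origin, ttl_seconds=60.0)
first = proxy.get("/static/logo.png")
second = proxy.get("/static/logo.png")
print(first[1], second[1], len(origin_calls))
# MISS HIT 1
```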

13.11 Real-World Architecture

Typical Web Application

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  User → CDN → WAF → Load Balancer → Web Servers            │
│                           │                                 │
│                           ├──────────► API Servers         │
│                           │                │                │
│                           │                ▼                │
│                           │         ┌──────────────┐       │
│                           │         │   Database   │       │
│                           │         │    (Redis)   │       │
│                           │         └──────────────┘       │
│                           │                                 │
└─────────────────────────────────────────────────────────────┘

Microservices Architecture

                    ┌────────────────┐
                    │  API Gateway   │
                    │  (L7 routing)  │
                    └───────┬────────┘
       ┌────────────────────┼────────────────────┐
       │                    │                    │
       ▼                    ▼                    ▼
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│ User Service│      │Order Service│      │Product Svc  │
│     LB      │      │     LB      │      │     LB      │
└─────────────┘      └─────────────┘      └─────────────┘
       │                    │                    │
    ┌──┴──┐              ┌──┴──┐              ┌──┴──┐
    │ │ │ │              │ │ │ │              │ │ │ │
  (instances)          (instances)          (instances)

13.12 Key Takeaways

L4 vs L7

L4 is fast but simple. L7 is smart but adds more overhead.

Algorithm Matters

Choose based on your traffic pattern and server capabilities.

Health Checks

Always configure proper health checks to avoid routing to dead servers.

SSL Termination

Offload SSL to load balancers to simplify certificate management.

Next Module

Module 14: Network Troubleshooting

Master the tools and techniques for diagnosing network issues.