# Load Balancing & Proxies

> Master load balancers, reverse proxies, and how traffic is distributed across servers

# Module 13: Load Balancing & Proxies

As applications scale, single servers can't handle the load. Load balancers distribute traffic across multiple servers, while proxies act as intermediaries for various purposes. This module covers both in depth.

Think of a load balancer like a host at a busy restaurant. When customers (requests) arrive, the host does not send everyone to the same waiter (server). They distribute customers across all available waiters based on who is least busy, who can handle the largest parties, or simply in rotation. Without the host, all customers would crowd around one waiter while the others stood idle.

<Frame>
  <img src="https://mintcdn.com/devweeekends/X0Fp4X8lMl-ZftoO/images/courses/networking-mastery/load-balancer.svg?fit=max&auto=format&n=X0Fp4X8lMl-ZftoO&q=85&s=0391236e3998c67339577b03f15dae4e" alt="Load Balancer Architecture" width="1080" height="1080" data-path="images/courses/networking-mastery/load-balancer.svg" />
</Frame>

<Info>
  **Estimated Time**: 3-4 hours\
  **Difficulty**: Intermediate\
  **Prerequisites**: Modules 4-6 (Network fundamentals)
</Info>

***

## 13.1 Why Load Balancing?

### The Single Server Problem

```
                    100,000 users
                         │
                         ▼
                  ┌──────────────┐
                  │    Server    │  ← Overwhelmed!
                  │  (1 CPU,     │     - High latency
                  │   8GB RAM)   │     - Requests timeout
                  └──────────────┘     - Single point of failure
```

### The Solution: Distribute Load

```
                    100,000 users
                         │
                         ▼
                  ┌──────────────┐
                  │    Load      │
                  │   Balancer   │
                  └──────┬───────┘
            ┌────────────┼────────────┐
            │            │            │
            ▼            ▼            ▼
      ┌──────────┐ ┌──────────┐ ┌──────────┐
      │ Server 1 │ │ Server 2 │ │ Server 3 │
      │ ~33,333  │ │ ~33,333  │ │ ~33,333  │
      │  users   │ │  users   │ │  users   │
      └──────────┘ └──────────┘ └──────────┘
```

***

## 13.2 Load Balancing Algorithms

### 1. Round Robin

Requests are distributed sequentially.

```
Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A  ← Cycle repeats
Request 5 → Server B
...
```

**Pros**: Simple, fair distribution\
**Cons**: Doesn't consider server load or capacity
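
The rotation above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any real load balancer; the class name `RoundRobinBalancer` and the server labels are made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation (hypothetical helper)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def next_server(self):
        # Each call advances the rotation by one position.
        return next(self._rotation)


balancer = RoundRobinBalancer(["Server A", "Server B", "Server C"])
first_five = [balancer.next_server() for _ in range(5)]
```

Here `first_five` follows the same order as the listing: A, B, C, then back to A and B.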

### 2. Weighted Round Robin

Servers with higher capacity get more requests.

```
Server A (weight 3): Gets 3 out of 6 requests
Server B (weight 2): Gets 2 out of 6 requests
Server C (weight 1): Gets 1 out of 6 requests

Sequence: A, A, A, B, B, C, A, A, A, B, B, C...
```
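
One simple way to produce the sequence above is to repeat each server as many times as its weight. The sketch below assumes integer weights and uses a hypothetical helper name, `weighted_sequence`; real balancers may interleave the picks differently to avoid sending bursts to one server.

```python
def weighted_sequence(weights):
    """Expand {server: weight} into one full cycle, e.g. A, A, A, B, B, C."""
    one_pass = []
    for server, weight in weights.items():
        if weight < 0:
            raise ValueError("weights must be non-negative")
        # A server with weight 3 appears three times per cycle.
        one_pass.extend([server] * weight)
    return one_pass


one_cycle = weighted_sequence({"A": 3, "B": 2, "C": 1})
```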

### 3. Least Connections

Send to server with fewest active connections.

```
Server A: 150 connections
Server B: 50 connections  ← Next request goes here
Server C: 200 connections
```

**Best for**: Long-lived connections (WebSockets, database connections)
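
As a rough sketch, the selection step is just a minimum over the current connection counts. The function name `pick_least_connections` and the dictionary layout are assumptions for illustration.

```python
def pick_least_connections(active):
    """Return the server with the fewest active connections.

    Ties go to whichever server appears first in the mapping.
    """
    if not active:
        raise ValueError("no servers available")
    return min(active, key=active.get)


active = {"Server A": 150, "Server B": 50, "Server C": 200}
chosen = pick_least_connections(active)
active[chosen] += 1  # the balancer counts the new connection
```

The counter update is the "overhead tracking" cost: the balancer has to adjust the count whenever a connection opens or closes.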

### 4. Least Response Time

Send to the server with the best combination of fast average response time and few active connections.

```
Server A: 50ms avg, 100 connections
Server B: 30ms avg, 120 connections  ← Faster, preferred
Server C: 80ms avg, 80 connections
```
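
How the two signals are combined varies by product. The sketch below assumes one possible scoring rule (average latency multiplied by the number of connections plus one, lower is better) purely for illustration; the formula and the name `pick_least_response_time` are not taken from any specific load balancer.

```python
def pick_least_response_time(stats):
    """stats maps server -> (avg_response_ms, active_connections)."""
    if not stats:
        raise ValueError("no servers available")

    def score(server):
        avg_ms, connections = stats[server]
        # Assumed formula: latency scaled by queue depth; lower is better.
        return avg_ms * (connections + 1)

    return min(stats, key=score)


stats = {
    "Server A": (50, 100),
    "Server B": (30, 120),
    "Server C": (80, 80),
}
preferred = pick_least_response_time(stats)
```

With these numbers, Server B scores lowest despite having the most connections, which matches the listing.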

### 5. IP Hash

Same client IP always goes to same server.

```
hash(client_ip) % num_servers = target_server

IP 192.168.1.10 → hash → always Server B
IP 192.168.1.11 → hash → always Server A
```

**Use Case**: Session persistence without cookies. Note that with `% num_servers`, adding or removing a server changes the result for most clients.
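
A minimal sketch of the formula, assuming SHA-256 as the hash function (any stable hash would do). The helper name `pick_by_ip` is hypothetical. Python's built-in `hash()` is avoided on purpose because its output for strings can differ between processes.

```python
import hashlib


def pick_by_ip(client_ip, servers):
    """Map a client IP to a server with hash(client_ip) % num_servers."""
    if not servers:
        raise ValueError("no servers available")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["Server A", "Server B", "Server C"]
first = pick_by_ip("192.168.1.10", servers)
again = pick_by_ip("192.168.1.10", servers)  # same IP, same server
```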

### 6. Least Bandwidth

Route to server currently serving least traffic (Mbps).

### Algorithm Comparison

| Algorithm           | Best For              | Drawback            |
| ------------------- | --------------------- | ------------------- |
| Round Robin         | Simple, equal servers | Ignores load        |
| Weighted RR         | Mixed capacity        | Static weights      |
| Least Connections   | Long connections      | Overhead tracking   |
| Least Response Time | Performance-critical  | Needs latency data  |
| IP Hash             | Session stickiness    | Uneven distribution |

***

## 13.3 Layer 4 vs Layer 7 Load Balancing

### Layer 4 (Transport Layer)

Operates at the TCP/UDP level and sees only IP addresses and ports. A pure Layer 4 balancer typically does not terminate the TCP connection or inspect the payload -- it simply forwards packets based on IP and port information.

```
Client ──TCP──→ Load Balancer ──TCP──→ Server
                     │
         Sees: Source IP, Dest IP
               Source Port, Dest Port
         Cannot see: HTTP headers, URLs, cookies
```

**Pros**: Very fast (no payload inspection), simple, handles any TCP/UDP protocol (not just HTTP)\
**Cons**: Limited routing options, no content-based decisions, cannot do SSL termination while acting as a pure packet forwarder

**Analogy**: A Layer 4 LB is like a highway toll booth that reads license plates and directs cars to different lanes, but never looks inside the car.

### Layer 7 (Application Layer)

Operates at the HTTP/HTTPS level. The load balancer terminates the client's TCP connection, reads the full HTTP request, makes a routing decision, and then opens a **new** TCP connection to the backend server.

```
Client ──HTTP──→ Load Balancer ──HTTP──→ Server
                       │
          Sees: Everything!
                - URL path (/api/users)
                - Headers (Host, Cookie, Auth)
                - HTTP method (GET, POST)
                - Query parameters
                - Request body
```

**Pros**: Smart routing (route `/api/*` to API servers, `/static/*` to CDN), SSL termination, caching, compression, header injection (X-Forwarded-For), A/B testing\
**Cons**: More processing overhead, higher latency (must parse HTTP), only works for protocols the balancer can parse (typically HTTP/HTTPS)

**Analogy**: A Layer 7 LB is like a hotel concierge who reads your request, understands what you need, and directs you to the right department -- they actually open and read the letter.

<Note>
  **Which should you use?** Default to Layer 7 for web applications -- the routing flexibility and SSL termination are worth the small overhead. Use Layer 4 for non-HTTP protocols (databases, game servers, custom TCP services) or when you need maximum throughput and do not need content-based routing.
</Note>

### When to Use Which

| Scenario                     | Layer |
| ---------------------------- | ----- |
| Simple TCP load distribution | L4    |
| Route by URL path            | L7    |
| SSL termination              | L7    |
| Gaming servers, databases    | L4    |
| Microservices routing        | L7    |
| A/B testing                  | L7    |

***

## 13.4 Proxy Types

### Forward Proxy

Sits between **clients** and the internet. Clients know about it.

```
┌─────────────────────────────────────────────────────┐
│                    Corporate Network                │
│                                                     │
│  ┌────────┐                                        │
│  │Client A│──┐                                     │
│  └────────┘  │     ┌──────────┐                   │
│              ├────►│  Forward │                   │
│  ┌────────┐  │     │  Proxy   │────► Internet    │
│  │Client B│──┘     └──────────┘                   │
│  └────────┘                                        │
└─────────────────────────────────────────────────────┘
```

**Use Cases:**

* Content filtering (block certain sites)
* Caching (reduce bandwidth)
* Anonymity (hide client IPs)
* Access control (authentication)

### Reverse Proxy

Sits between internet and **servers**. Clients don't know about it.

```
                        ┌─────────────────────────────┐
                        │       Data Center           │
                        │                             │
Internet ────►  ┌───────────────┐   ┌────────────┐   │
                │    Reverse    │──►│  Server A  │   │
                │     Proxy     │   └────────────┘   │
                │               │   ┌────────────┐   │
                │  (nginx, etc) │──►│  Server B  │   │
                └───────────────┘   └────────────┘   │
                        │                             │
                        └─────────────────────────────┘
```

**Use Cases:**

* Load balancing
* SSL termination
* Caching static content
* DDoS protection
* Compression
* URL rewriting

***

## 13.5 SSL/TLS Termination

### The Problem

Encryption is CPU-intensive. Without termination, every server has to handle SSL itself:

```
Client ──HTTPS──► LB ──HTTPS──► Server 1 (decrypts)
                     ──HTTPS──► Server 2 (decrypts)
                     ──HTTPS──► Server 3 (decrypts)

Each server needs:
- SSL certificates
- CPU for encryption/decryption
```

### The Solution: SSL Termination

The load balancer handles all SSL, and the backends receive plain HTTP:

```
Client ──HTTPS──► Load Balancer ──HTTP──► Server 1
         (encrypted)  (decrypts)   (plain)   Server 2
                                            Server 3
```

**Benefits:**

* Servers freed from SSL overhead
* Single place to manage certificates
* Easier SSL certificate renewal

### SSL Passthrough vs Termination vs Re-encryption

```
PASSTHROUGH (L4):
Client ──HTTPS──► LB ──HTTPS──► Server
                  (just forwards, doesn't decrypt)

TERMINATION:
Client ──HTTPS──► LB ──HTTP──► Server
                  (decrypts, internal is plain)

RE-ENCRYPTION:
Client ──HTTPS──► LB ──HTTPS──► Server
                  (decrypts, re-encrypts for backend)
```

**When to use which**: Passthrough is for when compliance requires end-to-end encryption and the LB must not see plaintext -- but you lose L7 routing. Termination is the most common choice: it lets the LB do smart routing and offloads CPU from backends, and internal traffic is usually on a trusted private network. Re-encryption is the compromise: the LB can still inspect and route traffic, but internal traffic is also encrypted -- required by some compliance regimes (PCI DSS, HIPAA) even within private networks.

<Tip>
  **Troubleshooting load balancer issues**: If your LB returns 502 (Bad Gateway), the LB itself is up but got no valid response from the backend -- check backend health, security groups, and whether the backend is listening on the expected port. If it returns 503 (Service Unavailable), the LB usually has no healthy backend to send the request to, for example because all backends have failed health checks. If you see uneven traffic distribution, verify your health check configuration -- a backend that is slow but technically "healthy" will keep receiving traffic. Use `curl -H "Host: your-domain.com" http://LB-IP` to test directly against the LB, bypassing DNS.
</Tip>

***

## 13.6 Session Persistence (Sticky Sessions)

Some applications need the same user to always reach the same server.

### Why Needed?

```
Request 1: Login to Server A (session created in memory)
Request 2: Goes to Server B (no session! User logged out!)
```

### Sticky Session Methods

**1. Cookie-based:**

```
Load balancer sets cookie: SERVERID=server-a
All requests with this cookie → Server A
```

**2. IP Hash:**

```
Same IP always routes to same server
(breaks with shared IPs, NAT)
```

**3. Application Cookie:**

```
App sets session cookie
LB reads it and routes accordingly
```
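
The cookie-based method can be sketched as follows. This is a simplified illustration, assuming the `SERVERID` cookie name from the example above and a plain rotation for clients without a cookie; the function name `route` and the server names are hypothetical.

```python
import itertools

SERVERS = ["server-a", "server-b", "server-c"]
_rotation = itertools.cycle(SERVERS)


def route(cookies):
    """Return (server, cookies_to_set) for one request.

    cookies is the dict of cookies the client sent.
    """
    pinned = cookies.get("SERVERID")
    if pinned in SERVERS:
        return pinned, {}  # cookie present and still valid: stay put
    server = next(_rotation)  # no usable cookie: pick a server normally
    return server, {"SERVERID": server}


server, set_cookies = route({})        # first visit: LB sets the cookie
repeat, _ = route(set_cookies)         # later visits carry it back
```

If the pinned server is no longer in the pool, the client is simply assigned a new one, which is also the moment an in-memory session would be lost.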

### Better Alternative: Shared Session Store

```
┌───────────────────────────────────────────────────┐
│                  Load Balancer                    │
│                (no sticky sessions)               │
└───────────────────────┬───────────────────────────┘
          ┌─────────────┼─────────────┐
          │             │             │
          ▼             ▼             ▼
     ┌─────────┐   ┌─────────┐   ┌─────────┐
     │Server A │   │Server B │   │Server C │
     └────┬────┘   └────┬────┘   └────┬────┘
          │             │             │
          └─────────────┼─────────────┘
                        ▼
                 ┌─────────────┐
                 │    Redis    │
                 │  (sessions) │
                 └─────────────┘
```

All servers share sessions -- any server can handle any request. This is the preferred pattern because it makes your backend truly stateless and horizontally scalable. If a server dies, no sessions are lost.

<Tip>
  **Best practice**: Design your applications to be stateless from the start. Store sessions in Redis, DynamoDB, or a similar external store. This eliminates the need for sticky sessions entirely and makes scaling, deployment, and failover dramatically simpler. Sticky sessions are a crutch for stateful applications -- they work, but they limit your ability to scale and create uneven load distribution.
</Tip>

***

## 13.7 Health Checks

Load balancers must know which servers are healthy.

### Passive Health Checks

Monitor real traffic for failures:

```
Server A: 5 consecutive 500 errors → mark unhealthy
Server A: 10 successful responses → mark healthy again
```
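
The thresholds above amount to a small state machine per server. The sketch below is one way to express it, assuming that any status code of 500 or higher counts as a failure; the class name `PassiveHealth` and its parameters are illustrative.

```python
class PassiveHealth:
    """Tracks one server's health from the outcomes of real requests."""

    def __init__(self, fail_threshold=5, recover_threshold=10):
        self.fail_threshold = fail_threshold
        self.recover_threshold = recover_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, status_code):
        failed = status_code >= 500
        if failed == self.healthy:
            # A failure while healthy, or a success while unhealthy.
            self._streak += 1
        else:
            self._streak = 0
        limit = self.fail_threshold if self.healthy else self.recover_threshold
        if self._streak >= limit:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy


server_a = PassiveHealth()
for _ in range(5):
    server_a.record(500)
after_failures = server_a.healthy   # marked unhealthy
for _ in range(10):
    server_a.record(200)
after_recovery = server_a.healthy   # marked healthy again
```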

### Active Health Checks

Periodically probe servers:

```
Every 5 seconds:
  GET /health → 200 OK? Server is healthy
  GET /health → timeout/500? Server is unhealthy
```

### Health Check Configuration

```nginx theme={null}
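# Nginx example (active checks via health_check require NGINX Plus)
upstream backend {
    zone backend 64k;  # shared memory zone, required for active health checks
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}

server {
    location / {
        proxy_pass http://backend;
        # Health check parameters (health_check goes in a location, not in upstream)
        health_check interval=5s fails=3 passes=2;
    }
}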
```

### Health Check Endpoints

Your application should expose health endpoints. Think of these like a doctor's checkup -- each endpoint tests something different:

```
/health        -> Simple "I'm alive" (heartbeat -- am I running at all?)
/health/ready  -> "I'm ready to serve traffic" (can I handle requests? DB connected? Cache warm?)
/health/live   -> "I'm running (not deadlocked)" (is my process responsive, even if dependencies are down?)
```

The distinction between liveness and readiness matters in Kubernetes especially. A pod that fails the liveness check gets restarted (the process is stuck). A pod that fails the readiness check stops receiving traffic but is not restarted (it might be warming up or waiting for a dependency). Confusing these two causes either unnecessary restarts or traffic being sent to pods that cannot handle it.

Example `/health` response:

```json theme={null}
{
  "status": "healthy",
  "checks": {
    "database": "connected",
    "cache": "connected",
    "disk": "ok"
  },
  "version": "1.2.3"
}
```
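
On the application side, a handler for such an endpoint might assemble the body and status code roughly like this. It is a framework-independent sketch: the function name `health_report`, the `"ok"`/`"failing"` strings, and the choice of 503 for a failing check are assumptions, not a standard.

```python
def health_report(checks, version):
    """Build a /health body and pick an HTTP status code.

    checks maps a dependency name to True (working) or False (failing).
    """
    all_ok = all(checks.values())
    body = {
        "status": "healthy" if all_ok else "unhealthy",
        "checks": {name: "ok" if ok else "failing" for name, ok in checks.items()},
        "version": version,
    }
    # 200 keeps the server in rotation; 503 signals the balancer to stop sending traffic.
    return (200 if all_ok else 503), body


status, body = health_report(
    {"database": True, "cache": True, "disk": True}, version="1.2.3"
)
```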

***

## 13.8 Common Load Balancer Products

### Cloud Load Balancers

| Provider | L4            | L7                  | Global             |
| -------- | ------------- | ------------------- | ------------------ |
| AWS      | NLB           | ALB                 | Global Accelerator |
| GCP      | Network LB    | HTTP(S) LB          | Global HTTP(S) LB  |
| Azure    | Load Balancer | Application Gateway | Front Door         |

### Software Load Balancers

| Product     | Type  | Use Case                   |
| ----------- | ----- | -------------------------- |
| **Nginx**   | L4/L7 | Web apps, reverse proxy    |
| **HAProxy** | L4/L7 | High-performance, TCP/HTTP |
| **Traefik** | L7    | Kubernetes, microservices  |
| **Envoy**   | L4/L7 | Service mesh, cloud-native |

### Hardware Load Balancers

| Vendor | Products      |
| ------ | ------------- |
| F5     | BIG-IP        |
| Citrix | ADC/NetScaler |
| A10    | Thunder       |

***

## 13.9 Load Balancing Patterns

### Blue-Green Deployment

```
Active (Blue):  100% traffic → v1.0
Standby (Green): 0% traffic → v2.0 (testing)

Switch:
Blue:   0% traffic
Green: 100% traffic → v2.0 (now live)
```

### Canary Deployment

```
Main:   95% traffic → v1.0
Canary:  5% traffic → v2.0 (testing with real users)

Gradually increase canary:
Main:   90% → 50% → 10% → 0%
Canary: 10% → 50% → 90% → 100%
```
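
A traffic split like this is often made sticky per user, so the same person does not flip between versions. The sketch below assumes a hash of the user ID mapped to a bucket from 0 to 99; the function name `pick_version` and the `user-N` IDs are hypothetical.

```python
import hashlib


def pick_version(user_id, canary_percent):
    """Send roughly canary_percent of users to v2.0, the rest to v1.0."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket 0..99
    return "v2.0" if bucket < canary_percent else "v1.0"


users = [f"user-{n}" for n in range(10_000)]
canary_share = sum(pick_version(u, 5) == "v2.0" for u in users) / len(users)
```

Because each user's bucket never changes, raising `canary_percent` from 5 to 10 keeps everyone already on v2.0 there and only moves additional users over.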

### A/B Testing

```
Route based on user attributes:
User in test group A → Server with feature X
User in test group B → Server without feature X

Analyze which performs better
```

***

## 13.10 Reverse Proxy Use Cases

### URL Rewriting

```
External:                    Internal:
/api/users      ──────►      /v2/users-service/list
/api/products   ──────►      /product-catalog/all
```

### Path-Based Routing

```nginx theme={null}
# Nginx config
location /api/ {
    proxy_pass http://api-servers;
}

location /static/ {
    proxy_pass http://cdn-servers;
}

location / {
    proxy_pass http://web-servers;
}
```
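
The matching logic behind this config can be sketched in Python as a longest-prefix lookup, which is how Nginx chooses among plain prefix locations. The `ROUTES` table mirrors the upstream names from the config; the function name `route_by_path` is made up for the example.

```python
ROUTES = [
    ("/api/", "api-servers"),
    ("/static/", "cdn-servers"),
    ("/", "web-servers"),
]


def route_by_path(path):
    """Return the upstream whose prefix is the longest match for path."""
    matches = [(prefix, upstream) for prefix, upstream in ROUTES
               if path.startswith(prefix)]
    if not matches:
        raise LookupError(f"no route for {path!r}")
    # The longest matching prefix wins, so "/" acts as the catch-all.
    return max(matches, key=lambda match: len(match[0]))[1]


api_target = route_by_path("/api/users")
static_target = route_by_path("/static/app.css")
default_target = route_by_path("/index.html")
```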

### Header-Based Routing

```
Host: api.example.com    → API servers
Host: www.example.com    → Web servers
Host: admin.example.com  → Admin servers
```

### Caching

```
┌────────┐      ┌───────────────┐      ┌────────┐
│ Client │ ──►  │ Reverse Proxy │ ──►  │ Server │
└────────┘      │   (cached)    │      └────────┘
                └───────────────┘
                
First request: Miss, fetch from server, cache
Second request: Hit, serve from cache (no server hit)
```
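
The miss-then-hit behavior can be illustrated with a tiny in-memory cache. This sketch deliberately ignores expiry and `Cache-Control` handling, which any real proxy cache needs; the class name `CachingProxy` and the fake origin function are assumptions for the example.

```python
class CachingProxy:
    """Caches responses by URL and counts how often the origin is contacted."""

    def __init__(self, fetch_from_server):
        self._fetch = fetch_from_server
        self._cache = {}
        self.origin_hits = 0

    def get(self, url):
        if url in self._cache:
            return self._cache[url]  # hit: served without touching the server
        self.origin_hits += 1        # miss: fetch from the server, then cache
        body = self._fetch(url)
        self._cache[url] = body
        return body


def fake_origin(url):
    # Stand-in for the backend server.
    return f"content of {url}"


proxy = CachingProxy(fake_origin)
first = proxy.get("/logo.png")
second = proxy.get("/logo.png")
```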

***

## 13.11 Real-World Architecture

### Typical Web Application

```
┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  User → CDN → WAF → Load Balancer → Web Servers            │
│                           │                                 │
│                           ├──────────► API Servers         │
│                           │                │                │
│                           │                ▼                │
│                           │         ┌──────────────┐       │
│                           │         │   Database   │       │
│                           │         │    (Redis)   │       │
│                           │         └──────────────┘       │
│                           │                                 │
└─────────────────────────────────────────────────────────────┘
```

### Microservices Architecture

```
                    ┌────────────────┐
                    │  API Gateway   │
                    │  (L7 routing)  │
                    └───────┬────────┘
       ┌────────────────────┼────────────────────┐
       │                    │                    │
       ▼                    ▼                    ▼
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│ User Service│      │Order Service│      │Product Svc  │
│     LB      │      │     LB      │      │     LB      │
└─────────────┘      └─────────────┘      └─────────────┘
       │                    │                    │
    ┌──┴──┐              ┌──┴──┐              ┌──┴──┐
    │ │ │ │              │ │ │ │              │ │ │ │
  (instances)          (instances)          (instances)
```

***

## 13.12 Key Takeaways

<CardGroup cols={2}>
  <Card title="L4 vs L7" icon="layer-group">
    L4 is fast but simple. L7 is smart but adds more overhead.
  </Card>

  <Card title="Algorithm Matters" icon="code-branch">
    Choose based on your traffic pattern and server capabilities.
  </Card>

  <Card title="Health Checks" icon="heart-pulse">
    Always configure proper health checks to avoid routing to dead servers.
  </Card>

  <Card title="SSL Termination" icon="lock">
    Offload SSL to load balancers to simplify certificate management.
  </Card>
</CardGroup>

***

## Next Module

<Card title="Module 14: Network Troubleshooting" icon="arrow-right" href="/courses/networking-mastery/14-troubleshooting">
  Master the tools and techniques for diagnosing network issues.
</Card>