Load Balancing Deep Dive

Load balancing is critical for distributing traffic across service instances efficiently and reliably.
Learning Objectives:
  • Understand client-side vs server-side load balancing
  • Master load balancing algorithms
  • Implement health checking strategies
  • Build intelligent load balancing with Node.js

Client-Side vs Server-Side Load Balancing

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LOAD BALANCING APPROACHES                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  SERVER-SIDE LOAD BALANCING                                                 │
│  ──────────────────────────────                                             │
│                                                                              │
│  ┌──────────┐         ┌──────────────┐         ┌──────────────┐            │
│  │  Client  │────────▶│ Load Balancer│────────▶│ Service A-1  │            │
│  └──────────┘         │  (nginx/HAP) │         ├──────────────┤            │
│                       └──────────────┘────────▶│ Service A-2  │            │
│                                       └───────▶│ Service A-3  │            │
│                                                └──────────────┘            │
│                                                                              │
│  Pros: Simple for clients, centralized control                              │
│  Cons: Single point of failure, extra hop, limited to L4/L7                │
│                                                                              │
│  ═══════════════════════════════════════════════════════════════════════   │
│                                                                              │
│  CLIENT-SIDE LOAD BALANCING                                                 │
│  ────────────────────────────                                               │
│                                                                              │
│  ┌──────────┐         ┌──────────────┐                                     │
│  │  Client  │◀────────│   Service    │         ┌──────────────┐            │
│  │  + LB    │         │   Registry   │────────▶│ Service A-1  │            │
│  │  Logic   │────────────────────────────────▶│ Service A-2  │            │
│  └──────────┘         └──────────────┘└───────▶│ Service A-3  │            │
│                                                └──────────────┘            │
│                                                                              │
│  Pros: No extra hop, distributed (no SPOF), more intelligent               │
│  Cons: Complex clients, language-specific implementations                   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Server-Side: NGINX Configuration

# nginx.conf - Production-ready load balancing

upstream user_service {
    # Load balancing algorithm
    least_conn;  # Send to server with fewest active connections
    
    # Backend servers with weights and passive failure detection
    server user-1.internal:3000 weight=5 max_fails=3 fail_timeout=30s;
    server user-2.internal:3000 weight=5 max_fails=3 fail_timeout=30s;
    server user-3.internal:3000 weight=3 backup;  # Backup server
    
    # Keepalive connections to backends
    keepalive 32;
    keepalive_timeout 60s;
}

upstream order_service {
    # IP Hash - session affinity
    ip_hash;
    
    server order-1.internal:3001;
    server order-2.internal:3001;
    server order-3.internal:3001;
}

server {
    listen 80;
    
    location /api/users {
        proxy_pass http://user_service;
        
        # Failover: retry the next upstream on connection errors and 5xx
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
        
        # Keepalive
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
    
    location /api/orders {
        proxy_pass http://order_service;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

Client-Side: Node.js Implementation

// client-side-lb.js - Client-side load balancing with service discovery

const EventEmitter = require('events');

class ClientSideLoadBalancer extends EventEmitter {
  constructor(options = {}) {
    super();
    this.serviceName = options.serviceName;
    this.registry = options.registry;
    this.algorithm = options.algorithm || 'round-robin';
    this.healthCheckInterval = options.healthCheckInterval || 10000;
    
    this.instances = [];
    this.currentIndex = 0;
    this.healthStatus = new Map();
    
    this.initializeDiscovery();
    this.startHealthChecks();
  }

  async initializeDiscovery() {
    // Initial fetch from service registry
    await this.refreshInstances();
    
    // Subscribe to registry updates
    this.registry.on('instances-changed', (serviceName) => {
      if (serviceName === this.serviceName) {
        this.refreshInstances();
      }
    });
  }

  async refreshInstances() {
    try {
      const instances = await this.registry.getInstances(this.serviceName);
      this.instances = instances.map(instance => ({
        ...instance,
        weight: instance.weight || 1,
        activeConnections: 0,
        responseTime: 0,
        consecutiveFailures: 0
      }));
      this.emit('instances-updated', this.instances);
    } catch (error) {
      this.emit('error', error);
    }
  }

  startHealthChecks() {
    setInterval(async () => {
      for (const instance of this.instances) {
        try {
          const start = Date.now();
          // fetch has no `timeout` option; abort via AbortSignal (Node 18+)
          const response = await fetch(`http://${instance.host}:${instance.port}/health`, {
            signal: AbortSignal.timeout(5000)
          });
          const latency = Date.now() - start;
          
          this.healthStatus.set(instance.id, {
            healthy: response.ok,
            latency,
            lastCheck: Date.now()
          });
          
          instance.responseTime = latency;
          instance.consecutiveFailures = 0;
        } catch (error) {
          const status = this.healthStatus.get(instance.id) || {};
          instance.consecutiveFailures++;
          this.healthStatus.set(instance.id, {
            ...status,
            healthy: false,
            lastCheck: Date.now(),
            error: error.message
          });
        }
      }
    }, this.healthCheckInterval);
  }

  // Get next available instance based on algorithm
  getNextInstance() {
    const healthyInstances = this.instances.filter(
      i => (this.healthStatus.get(i.id)?.healthy !== false) && 
           i.consecutiveFailures < 3
    );

    if (healthyInstances.length === 0) {
      throw new Error(`No healthy instances available for ${this.serviceName}`);
    }

    switch (this.algorithm) {
      case 'round-robin':
        return this.roundRobin(healthyInstances);
      case 'weighted-round-robin':
        return this.weightedRoundRobin(healthyInstances);
      case 'least-connections':
        return this.leastConnections(healthyInstances);
      case 'least-response-time':
        return this.leastResponseTime(healthyInstances);
      case 'random':
        return this.random(healthyInstances);
      default:
        return this.roundRobin(healthyInstances);
    }
  }

  roundRobin(instances) {
    const instance = instances[this.currentIndex % instances.length];
    this.currentIndex++;
    return instance;
  }

  weightedRoundRobin(instances) {
    // Create weighted list
    const weighted = [];
    for (const instance of instances) {
      for (let i = 0; i < instance.weight; i++) {
        weighted.push(instance);
      }
    }
    
    const instance = weighted[this.currentIndex % weighted.length];
    this.currentIndex++;
    return instance;
  }

  leastConnections(instances) {
    return instances.reduce((min, current) => 
      current.activeConnections < min.activeConnections ? current : min
    );
  }

  leastResponseTime(instances) {
    return instances.reduce((best, current) => {
      const currentScore = current.activeConnections * 0.5 + current.responseTime * 0.5;
      const bestScore = best.activeConnections * 0.5 + best.responseTime * 0.5;
      return currentScore < bestScore ? current : best;
    });
  }

  random(instances) {
    return instances[Math.floor(Math.random() * instances.length)];
  }

  // Execute request with automatic failover
  async execute(requestFn, options = {}) {
    const maxRetries = options.retries || 3;
    const retryDelay = options.retryDelay || 100;
    let lastError;

    for (let attempt = 0; attempt < maxRetries; attempt++) {
      const instance = this.getNextInstance();
      
      try {
        instance.activeConnections++;
        const start = Date.now();
        
        const result = await requestFn(instance);
        
        instance.responseTime = Date.now() - start;
        instance.activeConnections--;
        
        return result;
      } catch (error) {
        instance.activeConnections--;
        instance.consecutiveFailures++;
        lastError = error;
        
        this.emit('request-failed', { instance, error, attempt });
        
        if (attempt < maxRetries - 1) {
          await this.delay(retryDelay * Math.pow(2, attempt));
        }
      }
    }

    throw lastError;
  }

  delay(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage example (`serviceRegistry` is assumed to be a service-discovery
// client exposing getInstances() and emitting 'instances-changed' events)
const axios = require('axios');

const userServiceLB = new ClientSideLoadBalancer({
  serviceName: 'user-service',
  registry: serviceRegistry,
  algorithm: 'least-connections'
});

// Make requests through the load balancer
async function getUser(userId) {
  return userServiceLB.execute(async (instance) => {
    const response = await axios.get(
      `http://${instance.host}:${instance.port}/users/${userId}`,
      { timeout: 5000 }
    );
    return response.data;
  });
}

module.exports = { ClientSideLoadBalancer };

Load Balancing Algorithms

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LOAD BALANCING ALGORITHMS                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ROUND ROBIN                          WEIGHTED ROUND ROBIN                  │
│  ─────────────────                    ─────────────────────                 │
│                                                                              │
│  Request 1 → Server A                 Request 1 → Server A (w=5)           │
│  Request 2 → Server B                 Request 2 → Server A                 │
│  Request 3 → Server C                 Request 3 → Server A                 │
│  Request 4 → Server A                 Request 4 → Server B (w=3)           │
│  ...                                  Request 5 → Server B                 │
│                                       Request 6 → Server B                 │
│  Simple, equal distribution           Request 7 → Server C (w=2)           │
│                                       ...                                   │
│                                       Accounts for server capacity         │
│                                                                              │
│  ═══════════════════════════════════════════════════════════════════════   │
│                                                                              │
│  LEAST CONNECTIONS                    LEAST RESPONSE TIME                  │
│  ──────────────────────               ─────────────────────                 │
│                                                                              │
│  Server A: 5 conn  ←────              Server A: 50ms avg  ←────            │
│  Server B: 8 conn                     Server B: 75ms avg                   │
│  Server C: 3 conn                     Server C: 45ms avg                   │
│                    ↑                                     ↑                  │
│           Next → Server C             Next → Server C (fastest)            │
│                                                                              │
│  Best for long-lived connections      Best for varying server loads        │
│                                                                              │
│  ═══════════════════════════════════════════════════════════════════════   │
│                                                                              │
│  IP HASH                              CONSISTENT HASHING                   │
│  ────────────                         ───────────────────                   │
│                                                                              │
│  hash(client_ip) → Server B           hash(request) → Ring position        │
│                                       Minimal redistribution on change     │
│  Same client → same server                                                 │
│  Session affinity                     Good for caching, stateful services  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Advanced Algorithms Implementation

// advanced-lb-algorithms.js

class LoadBalancingAlgorithms {
  // Weighted Round Robin with smooth distribution
  static createWeightedRoundRobin(instances) {
    let currentWeight = 0;
    let maxWeight = Math.max(...instances.map(i => i.weight));
    let gcdWeight = instances.reduce((a, b) => gcd(a, b.weight), instances[0].weight);
    let currentIndex = -1;

    function gcd(a, b) {
      return b === 0 ? a : gcd(b, a % b);
    }

    return function getNext() {
      while (true) {
        currentIndex = (currentIndex + 1) % instances.length;
        
        if (currentIndex === 0) {
          currentWeight -= gcdWeight;
          if (currentWeight <= 0) {
            currentWeight = maxWeight;
          }
        }
        
        if (instances[currentIndex].weight >= currentWeight) {
          return instances[currentIndex];
        }
      }
    };
  }

  // Consistent Hashing with virtual nodes
  static createConsistentHash(instances, virtualNodes = 150) {
    const ring = new Map();
    const sortedKeys = [];

    // Add virtual nodes for each instance
    for (const instance of instances) {
      for (let i = 0; i < virtualNodes; i++) {
        const key = hash(`${instance.id}-${i}`);
        ring.set(key, instance);
        sortedKeys.push(key);
      }
    }
    
    sortedKeys.sort((a, b) => a - b);

    // Java-style string hash (h * 31 + c); production systems typically
    // use a stronger hash such as murmur3 or xxHash
    function hash(str) {
      let hash = 0;
      for (let i = 0; i < str.length; i++) {
        hash = ((hash << 5) - hash) + str.charCodeAt(i);
        hash = hash & hash;
      }
      return Math.abs(hash);
    }

    return function getNode(key) {
      const keyHash = hash(key);
      
      // Binary search for first key >= keyHash
      let low = 0, high = sortedKeys.length - 1;
      
      while (low < high) {
        const mid = Math.floor((low + high) / 2);
        if (sortedKeys[mid] < keyHash) {
          low = mid + 1;
        } else {
          high = mid;
        }
      }

      // Wrap around if key is larger than all
      const index = sortedKeys[low] >= keyHash ? low : 0;
      return ring.get(sortedKeys[index]);
    };
  }

  // Power of Two Choices (P2C)
  // Pick 2 random servers, choose the one with fewer connections
  static createP2C(instances) {
    return function getNext() {
      if (instances.length === 1) return instances[0];
      
      // Pick two random instances
      const i1 = Math.floor(Math.random() * instances.length);
      let i2 = Math.floor(Math.random() * instances.length);
      while (i2 === i1) {
        i2 = Math.floor(Math.random() * instances.length);
      }

      // Choose the one with fewer active connections
      return instances[i1].activeConnections <= instances[i2].activeConnections
        ? instances[i1]
        : instances[i2];
    };
  }

  // Adaptive load balancing based on real-time metrics
  // (assumes each instance tracks avgResponseTime and errorRate)
  static createAdaptive(instances) {
    return function getNext() {
      // Calculate scores based on multiple factors
      const scored = instances.map(instance => ({
        instance,
        score: calculateScore(instance)
      }));

      // Sort by score (lower is better)
      scored.sort((a, b) => a.score - b.score);
      
      // Weighted random from top 3
      const topN = scored.slice(0, Math.min(3, scored.length));
      const totalWeight = topN.reduce((sum, s) => sum + (1 / s.score), 0);
      
      let random = Math.random() * totalWeight;
      for (const { instance, score } of topN) {
        random -= (1 / score);
        if (random <= 0) return instance;
      }
      
      return topN[0].instance;
    };

    function calculateScore(instance) {
      // Lower score = better
      const connectionScore = instance.activeConnections * 0.3;
      const latencyScore = instance.avgResponseTime * 0.4;
      const errorScore = instance.errorRate * 100 * 0.3;
      
      return connectionScore + latencyScore + errorScore + 0.001; // Avoid division by zero
    }
  }
}

module.exports = { LoadBalancingAlgorithms };

Health Checking Strategies

┌─────────────────────────────────────────────────────────────────────────────┐
│                    HEALTH CHECK PATTERNS                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  LIVENESS vs READINESS                                                      │
│  ─────────────────────────                                                  │
│                                                                              │
│  LIVENESS: Is the process running?                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  GET /healthz                                                        │   │
│  │  → 200 OK: Process is alive                                          │   │
│  │  → 5xx: Process is dead, restart it                                  │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│  READINESS: Can it handle traffic?                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  GET /ready                                                          │   │
│  │  → 200 OK: Ready to receive traffic                                  │   │
│  │  → 503: Not ready (warming up, dependencies down)                    │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│  DEEP HEALTH CHECK (with dependencies)                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  GET /health/deep                                                    │   │
│  │  {                                                                   │   │
│  │    "status": "degraded",                                             │   │
│  │    "checks": {                                                       │   │
│  │      "database": { "status": "healthy", "latency": "5ms" },         │   │
│  │      "redis": { "status": "healthy", "latency": "2ms" },            │   │
│  │      "external-api": { "status": "unhealthy", "error": "timeout" } │   │
│  │    }                                                                 │   │
│  │  }                                                                   │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Comprehensive Health Check Service

// health-check-service.js

const express = require('express');

class HealthCheckService {
  constructor() {
    this.checks = new Map();
    this.status = 'starting';
    this.startTime = Date.now();
  }

  // Register a health check
  registerCheck(name, checkFn, options = {}) {
    this.checks.set(name, {
      fn: checkFn,
      critical: options.critical !== false, // Default to critical
      timeout: options.timeout || 5000,
      interval: options.interval || 30000,
      lastResult: null,
      lastCheck: null
    });

    // Start periodic background checking so lastResult stays fresh
    // for the readiness endpoint
    setInterval(() => this.runCheck(name), options.interval || 30000);
  }

  async runCheck(name) {
    const check = this.checks.get(name);
    if (!check) return null;

    const startTime = Date.now();
    
    try {
      const result = await Promise.race([
        check.fn(),
        new Promise((_, reject) => 
          setTimeout(() => reject(new Error('Health check timeout')), check.timeout)
        )
      ]);

      check.lastResult = {
        status: 'healthy',
        latency: Date.now() - startTime,
        ...result
      };
    } catch (error) {
      check.lastResult = {
        status: 'unhealthy',
        error: error.message,
        latency: Date.now() - startTime
      };
    }

    check.lastCheck = Date.now();
    return check.lastResult;
  }

  async runAllChecks() {
    const results = {};
    
    for (const [name, check] of this.checks) {
      results[name] = await this.runCheck(name);
    }

    return results;
  }

  getOverallStatus(checkResults) {
    const criticalChecks = Array.from(this.checks.entries())
      .filter(([_, check]) => check.critical)
      .map(([name]) => name);

    const hasUnhealthyCritical = criticalChecks.some(
      name => checkResults[name]?.status === 'unhealthy'
    );

    const hasAnyUnhealthy = Object.values(checkResults).some(
      r => r?.status === 'unhealthy'
    );

    if (hasUnhealthyCritical) return 'unhealthy';
    if (hasAnyUnhealthy) return 'degraded';
    return 'healthy';
  }

  // Express middleware for health endpoints
  createRouter() {
    const router = express.Router();

    // Liveness probe - is the process running?
    router.get('/healthz', (req, res) => {
      res.status(200).json({
        status: 'alive',
        uptime: Date.now() - this.startTime
      });
    });

    // Readiness probe - can we handle traffic?
    router.get('/ready', async (req, res) => {
      if (this.status !== 'ready') {
        return res.status(503).json({
          status: this.status,
          message: 'Service not ready'
        });
      }

      // Quick check of critical dependencies
      const criticalResults = {};
      for (const [name, check] of this.checks) {
        if (check.critical) {
          criticalResults[name] = check.lastResult;
        }
      }

      const hasUnhealthy = Object.values(criticalResults).some(
        r => r?.status === 'unhealthy'
      );

      if (hasUnhealthy) {
        return res.status(503).json({
          status: 'not_ready',
          checks: criticalResults
        });
      }

      res.status(200).json({ status: 'ready' });
    });

    // Deep health check - detailed status of all dependencies
    router.get('/health', async (req, res) => {
      const checkResults = await this.runAllChecks();
      const overallStatus = this.getOverallStatus(checkResults);

      const statusCode = overallStatus === 'healthy' ? 200 : 
                         overallStatus === 'degraded' ? 200 : 503;

      res.status(statusCode).json({
        status: overallStatus,
        timestamp: new Date().toISOString(),
        uptime: Date.now() - this.startTime,
        version: process.env.APP_VERSION || 'unknown',
        checks: checkResults
      });
    });

    return router;
  }

  setReady() {
    this.status = 'ready';
  }

  setNotReady(reason) {
    this.status = reason || 'not_ready';
  }
}

// Usage example
const healthService = new HealthCheckService();

// Register database check
healthService.registerCheck('database', async () => {
  const start = Date.now();
  await pool.query('SELECT 1');
  return { latency: Date.now() - start };
}, { critical: true, interval: 30000 });

// Register Redis check
healthService.registerCheck('redis', async () => {
  const start = Date.now();
  await redis.ping();
  return { latency: Date.now() - start };
}, { critical: true, interval: 30000 });

// Register external API check (the URL is illustrative; use your
// provider's real status endpoint)
healthService.registerCheck('payment-api', async () => {
  const response = await fetch('https://api.stripe.com/v1/health', {
    signal: AbortSignal.timeout(5000)
  });
  return { status: response.ok ? 'reachable' : 'unreachable' };
}, { critical: false, interval: 60000 });

// Mount health routes on an existing Express app
app.use(healthService.createRouter());

// Mark service as ready after initialization completes
// (run inside an async bootstrap function)
await initializeDatabase();
await warmUpCaches();
healthService.setReady();

module.exports = { HealthCheckService };

Load Balancer Patterns

Kubernetes Service Load Balancing

# kubernetes-lb.yaml

# ClusterIP Service (internal load balancing)
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  type: ClusterIP
  selector:
    app: user-service
  ports:
    - port: 80
      targetPort: 3000
  sessionAffinity: None  # or ClientIP for sticky sessions

---
# Headless Service (for client-side LB with service discovery)
apiVersion: v1
kind: Service
metadata:
  name: user-service-headless
spec:
  clusterIP: None
  selector:
    app: user-service
  ports:
    - port: 3000

---
# Deployment with health checks
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: user-service:latest
          ports:
            - containerPort: 3000
          
          # Liveness probe - restart if fails
          livenessProbe:
            httpGet:
              path: /healthz
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
            failureThreshold: 3
            
          # Readiness probe - remove from LB if fails
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3
            
          # Startup probe - for slow starting containers
          startupProbe:
            httpGet:
              path: /healthz
              port: 3000
            failureThreshold: 30
            periodSeconds: 10

Envoy Proxy Configuration

# envoy-lb.yaml - Advanced L7 load balancing

static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: backend
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/api/users"
                          route:
                            cluster: user_service
                            timeout: 30s
                            retry_policy:
                              retry_on: "5xx,reset,connect-failure"
                              num_retries: 3
                              per_try_timeout: 10s

  clusters:
    - name: user_service
      connect_timeout: 5s
      type: STRICT_DNS
      lb_policy: LEAST_REQUEST  # Least connections
      
      # Circuit breaker
      circuit_breakers:
        thresholds:
          - priority: DEFAULT
            max_connections: 1000
            max_pending_requests: 1000
            max_requests: 1000
            max_retries: 3
            
      # Health checking
      health_checks:
        - timeout: 5s
          interval: 10s
          unhealthy_threshold: 3
          healthy_threshold: 2
          http_health_check:
            path: "/health"
            
      # Outlier detection (automatic ejection of unhealthy hosts)
      outlier_detection:
        consecutive_5xx: 5
        interval: 10s
        base_ejection_time: 30s
        max_ejection_percent: 50
        
      load_assignment:
        cluster_name: user_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: user-1.internal
                      port_value: 3000
              - endpoint:
                  address:
                    socket_address:
                      address: user-2.internal
                      port_value: 3000
              - endpoint:
                  address:
                    socket_address:
                      address: user-3.internal
                      port_value: 3000

Interview Questions

Q: Compare server-side and client-side load balancing. When would you use each?

Answer: Server-side (e.g., NGINX, HAProxy):
  • Single point for routing
  • Simple clients
  • Extra network hop
  • Centralized control
Client-side (e.g., Ribbon, custom):
  • Client decides which server
  • No extra hop
  • More complex clients
  • Better for microservices
When to use each:
  • Server-side: External traffic, legacy clients
  • Client-side: Service-to-service within cluster
  • Hybrid: Edge LB + client-side internally

Q: When would you use consistent hashing, and how does it work?

Answer: Use cases:
  • Cache servers (minimize cache misses on scale)
  • Session affinity without IP hash
  • Partitioned data (same key → same server)
How it works:
  • Servers and keys mapped to same hash ring
  • Key routed to next server clockwise
  • Adding/removing server affects only neighbors
Virtual nodes:
  • Multiple positions per server for balance
  • 100-200 virtual nodes per physical server

Q: What is the difference between liveness and readiness probes?

Answer: Liveness:
  • “Is the process stuck?”
  • Failure → container restart
  • Should be simple (no dependencies)
  • Example: Can the HTTP server respond?
Readiness:
  • “Can it handle traffic?”
  • Failure → remove from load balancer
  • Can check dependencies
  • Example: Is database connection ready?
Common mistake: Using deep checks for liveness causes cascading restarts when a dependency is down.

Q: Explain the Power of Two Choices (P2C) algorithm. Why does it work well?

Answer: Simple but effective algorithm:
  1. Pick 2 random servers
  2. Choose the one with fewer connections
Why it works:
  • Avoids herd behavior (all clients picking same “best” server)
  • O(1) complexity (no sorting)
  • Statistical guarantees: max load ~log(log(n))
Used by: Envoy, HAProxy, Netflix Zuul
Better than round-robin because it considers actual load.
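The claim is easy to verify empirically. A small simulation (illustrative numbers: 100 servers, 10,000 requests) compares uniform random placement with P2C; P2C keeps the maximum per-server load close to the mean:

```javascript
// Route `requests` requests across `servers` servers using the given
// picker, and report the worst-case (maximum) per-server load.
function simulate(servers, requests, pick) {
  const load = new Array(servers).fill(0);
  for (let r = 0; r < requests; r++) {
    load[pick(load)]++;
  }
  return Math.max(...load);
}

// Uniform random: pick any server
const randomPick = load => Math.floor(Math.random() * load.length);

// Power of Two Choices: pick two distinct servers, keep the less loaded
const p2cPick = load => {
  const a = Math.floor(Math.random() * load.length);
  let b = Math.floor(Math.random() * load.length);
  while (b === a) b = Math.floor(Math.random() * load.length);
  return load[a] <= load[b] ? a : b;
};

const maxRandom = simulate(100, 10000, randomPick);
const maxP2C = simulate(100, 10000, p2cPick);
// With a mean load of 100, random typically peaks well above it,
// while P2C stays within a few requests of the mean
console.log(`max load - random: ${maxRandom}, p2c: ${maxP2C}`);
```

The same contrast holds at scale, which is why proxies that cannot afford a global sort per request favor P2C.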

Chapter Summary

Key Takeaways:
  • Server-side LB for external traffic, client-side for internal
  • Algorithm choice depends on workload: round-robin for uniform requests, least-connections for varying load
  • Consistent hashing for caching and stateful services
  • Implement both liveness and readiness probes
  • Health checks should have appropriate timeouts
  • Use circuit breakers with load balancing for resilience
Next Chapter: Migration Patterns - Strangler Fig, Branch by Abstraction, and more.