Load Balancing Deep Dive

Load balancing is critical for distributing traffic across service instances efficiently and reliably.

Learning Objectives:

Understand client-side vs server-side load balancing
Master load balancing algorithms
Implement health checking strategies
Build intelligent load balancing with Node.js

Client-Side vs Server-Side Load Balancing

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LOAD BALANCING APPROACHES                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  SERVER-SIDE LOAD BALANCING                                                 │
│  ──────────────────────────────                                             │
│                                                                              │
│  ┌──────────┐         ┌──────────────┐         ┌──────────────┐            │
│  │  Client  │────────▶│ Load Balancer│────────▶│ Service A-1  │            │
│  └──────────┘         │  (nginx/HAP) │         ├──────────────┤            │
│                       └──────────────┘────────▶│ Service A-2  │            │
│                                       └───────▶│ Service A-3  │            │
│                                                └──────────────┘            │
│                                                                              │
│  Pros: Simple for clients, centralized control                              │
│  Cons: Single point of failure, extra hop, limited to L4/L7                │
│                                                                              │
│  ═══════════════════════════════════════════════════════════════════════   │
│                                                                              │
│  CLIENT-SIDE LOAD BALANCING                                                 │
│  ────────────────────────────                                               │
│                                                                              │
│  ┌──────────┐         ┌──────────────┐                                     │
│  │  Client  │◀────────│   Service    │         ┌──────────────┐            │
│  │  + LB    │         │   Registry   │────────▶│ Service A-1  │            │
│  │  Logic   │────────────────────────────────▶│ Service A-2  │            │
│  └──────────┘         └──────────────┘└───────▶│ Service A-3  │            │
│                                                └──────────────┘            │
│                                                                              │
│  Pros: No extra hop, distributed (no SPOF), more intelligent               │
│  Cons: Complex clients, language-specific implementations                   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Server-Side: NGINX Configuration

# nginx.conf - Production-ready load balancing

upstream user_service {
    # Load balancing algorithm
    least_conn;  # Send to server with fewest active connections
    
    # Backend servers with weights and health
    server user-1.internal:3000 weight=5 max_fails=3 fail_timeout=30s;
    server user-2.internal:3000 weight=5 max_fails=3 fail_timeout=30s;
    server user-3.internal:3000 weight=3 backup;  # Backup server
    
    # Keepalive connections to backends
    keepalive 32;
    keepalive_timeout 60s;
}

upstream order_service {
    # IP Hash - session affinity
    ip_hash;
    
    server order-1.internal:3001;
    server order-2.internal:3001;
    server order-3.internal:3001;
}

server {
    listen 80;
    
    location /api/users {
        proxy_pass http://user_service;
        
        # Passive failover: retry the request on the next upstream server
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
        
        # Keepalive
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
    
    location /api/orders {
        proxy_pass http://order_service;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

Client-Side: Node.js Implementation

// client-side-lb.js - Client-side load balancing with service discovery

const EventEmitter = require('events');

class ClientSideLoadBalancer extends EventEmitter {
  constructor(options = {}) {
    super();
    this.serviceName = options.serviceName;
    this.registry = options.registry;
    this.algorithm = options.algorithm || 'round-robin';
    this.healthCheckInterval = options.healthCheckInterval || 10000;
    
    this.instances = [];
    this.currentIndex = 0;
    this.healthStatus = new Map();
    
    this.initializeDiscovery();
    this.startHealthChecks();
  }

  async initializeDiscovery() {
    // Initial fetch from service registry
    await this.refreshInstances();
    
    // Subscribe to registry updates
    this.registry.on('instances-changed', (serviceName) => {
      if (serviceName === this.serviceName) {
        this.refreshInstances();
      }
    });
  }

  async refreshInstances() {
    try {
      const instances = await this.registry.getInstances(this.serviceName);
      this.instances = instances.map(instance => ({
        ...instance,
        weight: instance.weight || 1,
        activeConnections: 0,
        responseTime: 0,
        consecutiveFailures: 0
      }));
      this.emit('instances-updated', this.instances);
    } catch (error) {
      this.emit('error', error);
    }
  }

  startHealthChecks() {
    setInterval(async () => {
      for (const instance of this.instances) {
        try {
          const start = Date.now();
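          const response = await fetch(`http://${instance.host}:${instance.port}/health`, {
            signal: AbortSignal.timeout(5000) // fetch ignores a `timeout` option
          });
          const latency = Date.now() - start;
          
          this.healthStatus.set(instance.id, {
            healthy: response.ok,
            latency,
            lastCheck: Date.now()
          });
          
          instance.responseTime = latency;
          // Only a passing check clears the failure streak
          instance.consecutiveFailures = response.ok ? 0 : instance.consecutiveFailures + 1;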
        } catch (error) {
          const status = this.healthStatus.get(instance.id) || {};
          instance.consecutiveFailures++;
          this.healthStatus.set(instance.id, {
            ...status,
            healthy: false,
            lastCheck: Date.now(),
            error: error.message
          });
        }
      }
    }, this.healthCheckInterval);
  }

  // Get next available instance based on algorithm
  getNextInstance() {
    const healthyInstances = this.instances.filter(
      i => (this.healthStatus.get(i.id)?.healthy !== false) && 
           i.consecutiveFailures < 3
    );

    if (healthyInstances.length === 0) {
      throw new Error(`No healthy instances available for ${this.serviceName}`);
    }

    switch (this.algorithm) {
      case 'round-robin':
        return this.roundRobin(healthyInstances);
      case 'weighted-round-robin':
        return this.weightedRoundRobin(healthyInstances);
      case 'least-connections':
        return this.leastConnections(healthyInstances);
      case 'least-response-time':
        return this.leastResponseTime(healthyInstances);
      case 'random':
        return this.random(healthyInstances);
      default:
        return this.roundRobin(healthyInstances);
    }
  }

  roundRobin(instances) {
    const instance = instances[this.currentIndex % instances.length];
    this.currentIndex++;
    return instance;
  }

  weightedRoundRobin(instances) {
    // Create weighted list
    const weighted = [];
    for (const instance of instances) {
      for (let i = 0; i < instance.weight; i++) {
        weighted.push(instance);
      }
    }
    
    const instance = weighted[this.currentIndex % weighted.length];
    this.currentIndex++;
    return instance;
  }

  leastConnections(instances) {
    return instances.reduce((min, current) => 
      current.activeConnections < min.activeConnections ? current : min
    );
  }

  leastResponseTime(instances) {
    return instances.reduce((best, current) => {
      const currentScore = current.activeConnections * 0.5 + current.responseTime * 0.5;
      const bestScore = best.activeConnections * 0.5 + best.responseTime * 0.5;
      return currentScore < bestScore ? current : best;
    });
  }

  random(instances) {
    return instances[Math.floor(Math.random() * instances.length)];
  }

  // Execute request with automatic failover
  async execute(requestFn, options = {}) {
    const maxRetries = options.retries || 3;
    const retryDelay = options.retryDelay || 100;
    let lastError;

    for (let attempt = 0; attempt < maxRetries; attempt++) {
      const instance = this.getNextInstance();
      
      try {
        instance.activeConnections++;
        const start = Date.now();
        
        const result = await requestFn(instance);
        
        instance.responseTime = Date.now() - start;
        instance.activeConnections--;
        instance.consecutiveFailures = 0; // a successful request clears the failure streak
        
        return result;
      } catch (error) {
        instance.activeConnections--;
        instance.consecutiveFailures++;
        lastError = error;
        
        this.emit('request-failed', { instance, error, attempt });
        
        if (attempt < maxRetries - 1) {
          await this.delay(retryDelay * Math.pow(2, attempt));
        }
      }
    }

    throw lastError;
  }

  delay(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

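// Usage example (assumes a `serviceRegistry` object that provides
// getInstances(serviceName) and emits 'instances-changed')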
const axios = require('axios');

const userServiceLB = new ClientSideLoadBalancer({
  serviceName: 'user-service',
  registry: serviceRegistry,
  algorithm: 'least-connections'
});

// Make requests through the load balancer
async function getUser(userId) {
  return userServiceLB.execute(async (instance) => {
    const response = await axios.get(
      `http://${instance.host}:${instance.port}/users/${userId}`,
      { timeout: 5000 }
    );
    return response.data;
  });
}

module.exports = { ClientSideLoadBalancer };
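
The usage example above relies on a `serviceRegistry` object that the chapter does not define. As a minimal sketch, assuming the registry only needs the two members the balancer actually calls (`getInstances(serviceName)` and the `'instances-changed'` event), an in-memory stand-in could look like the following. The class name `InMemoryServiceRegistry` and its `register`/`deregister` methods are hypothetical and exist only for illustration; a real deployment would more likely wrap Consul, etcd, or Kubernetes endpoints.

```javascript
const EventEmitter = require('events');

// Hypothetical in-memory registry exposing what the balancer uses:
// getInstances(serviceName) and the 'instances-changed' event.
class InMemoryServiceRegistry extends EventEmitter {
  constructor() {
    super();
    this.services = new Map(); // serviceName -> Map(instanceId -> instance)
  }

  register(serviceName, instance) {
    if (!this.services.has(serviceName)) {
      this.services.set(serviceName, new Map());
    }
    this.services.get(serviceName).set(instance.id, { ...instance });
    this.emit('instances-changed', serviceName);
  }

  deregister(serviceName, instanceId) {
    const instances = this.services.get(serviceName);
    // Only notify listeners when something was actually removed
    if (instances && instances.delete(instanceId)) {
      this.emit('instances-changed', serviceName);
    }
  }

  // Async to mirror a network-backed registry; returns copies so callers
  // cannot mutate registry state.
  async getInstances(serviceName) {
    const instances = this.services.get(serviceName);
    return instances ? [...instances.values()].map(i => ({ ...i })) : [];
  }
}

const serviceRegistry = new InMemoryServiceRegistry();
serviceRegistry.register('user-service', { id: 'user-1', host: 'user-1.internal', port: 3000, weight: 5 });
serviceRegistry.register('user-service', { id: 'user-2', host: 'user-2.internal', port: 3000, weight: 5 });
```

Because `register` and `deregister` emit `'instances-changed'`, a balancer subscribed to this registry refreshes its instance list whenever membership changes.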

Load Balancing Algorithms

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LOAD BALANCING ALGORITHMS                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ROUND ROBIN                          WEIGHTED ROUND ROBIN                  │
│  ─────────────────                    ─────────────────────                 │
│                                                                              │
│  Request 1 → Server A                 Request 1 → Server A (w=3)           │
│  Request 2 → Server B                 Request 2 → Server A                 │
│  Request 3 → Server C                 Request 3 → Server A                 │
│  Request 4 → Server A                 Request 4 → Server B (w=3)           │
│  ...                                  Request 5 → Server B                 │
│                                       Request 6 → Server B                 │
│  Simple, equal distribution           Request 7 → Server C (w=2)           │
│                                       ...                                   │
│                                       Accounts for server capacity         │
│                                                                              │
│  ═══════════════════════════════════════════════════════════════════════   │
│                                                                              │
│  LEAST CONNECTIONS                    LEAST RESPONSE TIME                  │
│  ──────────────────────               ─────────────────────                 │
│                                                                              │
│  Server A: 5 conn  ←────              Server A: 50ms avg  ←────            │
│  Server B: 8 conn                     Server B: 75ms avg                   │
│  Server C: 3 conn                     Server C: 45ms avg                   │
│                    ↑                                     ↑                  │
│           Next → Server C             Next → Server C (fastest)            │
│                                                                              │
│  Best for long-lived connections      Best for varying server loads        │
│                                                                              │
│  ═══════════════════════════════════════════════════════════════════════   │
│                                                                              │
│  IP HASH                              CONSISTENT HASHING                   │
│  ────────────                         ───────────────────                   │
│                                                                              │
│  hash(client_ip) → Server B           hash(request) → Ring position        │
│                                       Minimal redistribution on change     │
│  Same client → same server                                                 │
│  Session affinity                     Good for caching, stateful services  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
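
The weighted round robin panel sends a run of consecutive requests to the same server before moving on, which can produce short bursts on the heaviest server. One common refinement is smooth weighted round robin, the approach NGINX adopted for its weighted upstreams. The sketch below is an illustration under that assumption, not code from this chapter; the function name `createSmoothWeightedRoundRobin` and the sample weights are made up for the example.

```javascript
// Smooth weighted round robin.
// Each pick: add every server's weight to its running score, choose the
// highest score, then subtract the total weight from the chosen server.
function createSmoothWeightedRoundRobin(servers) {
  if (servers.length === 0) {
    throw new Error('At least one server is required');
  }
  const state = servers.map(s => ({ server: s, current: 0 }));
  const totalWeight = servers.reduce((sum, s) => sum + s.weight, 0);

  return function getNext() {
    let best = null;
    for (const entry of state) {
      entry.current += entry.server.weight;
      // Strict comparison: on a tie the earlier server wins
      if (best === null || entry.current > best.current) {
        best = entry;
      }
    }
    best.current -= totalWeight;
    return best.server;
  };
}

const next = createSmoothWeightedRoundRobin([
  { name: 'A', weight: 5 },
  { name: 'B', weight: 1 },
  { name: 'C', weight: 1 }
]);

const sequence = Array.from({ length: 7 }, () => next().name);
console.log(sequence.join(' ')); // A A B A C A A
```

Server A still receives five of every seven requests, but B and C are interleaved instead of waiting for A's full run to finish.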

Advanced Algorithms Implementation

// advanced-lb-algorithms.js

class LoadBalancingAlgorithms {
  // Weighted round robin driven by the GCD of the weights (the LVS-style scheduler)
  static createWeightedRoundRobin(instances) {
    let currentWeight = 0;
    let maxWeight = Math.max(...instances.map(i => i.weight));
    let gcdWeight = instances.reduce((a, b) => gcd(a, b.weight), instances[0].weight);
    let currentIndex = -1;

    function gcd(a, b) {
      return b === 0 ? a : gcd(b, a % b);
    }

    return function getNext() {
      while (true) {
        currentIndex = (currentIndex + 1) % instances.length;
        
        if (currentIndex === 0) {
          currentWeight -= gcdWeight;
          if (currentWeight <= 0) {
            currentWeight = maxWeight;
          }
        }
        
        if (instances[currentIndex].weight >= currentWeight) {
          return instances[currentIndex];
        }
      }
    };
  }

  // Consistent Hashing with virtual nodes
  static createConsistentHash(instances, virtualNodes = 150) {
    const ring = new Map();
    const sortedKeys = [];

    // Add virtual nodes for each instance
    for (const instance of instances) {
      for (let i = 0; i < virtualNodes; i++) {
        const key = hash(`${instance.id}-${i}`);
        ring.set(key, instance);
        sortedKeys.push(key);
      }
    }
    
    sortedKeys.sort((a, b) => a - b);

    function hash(str) {
      let hash = 0;
      for (let i = 0; i < str.length; i++) {
        hash = ((hash << 5) - hash) + str.charCodeAt(i);
        hash = hash & hash;
      }
      return Math.abs(hash);
    }

    return function getNode(key) {
      const keyHash = hash(key);
      
      // Binary search for first key >= keyHash
      let low = 0, high = sortedKeys.length - 1;
      
      while (low < high) {
        const mid = Math.floor((low + high) / 2);
        if (sortedKeys[mid] < keyHash) {
          low = mid + 1;
        } else {
          high = mid;
        }
      }

      // Wrap around if key is larger than all
      const index = sortedKeys[low] >= keyHash ? low : 0;
      return ring.get(sortedKeys[index]);
    };
  }

  // Power of Two Choices (P2C)
  // Pick 2 random servers, choose the one with fewer connections
  static createP2C(instances) {
    return function getNext() {
      if (instances.length === 1) return instances[0];
      
      // Pick two random instances
      const i1 = Math.floor(Math.random() * instances.length);
      let i2 = Math.floor(Math.random() * instances.length);
      while (i2 === i1) {
        i2 = Math.floor(Math.random() * instances.length);
      }

      // Choose the one with fewer active connections
      return instances[i1].activeConnections <= instances[i2].activeConnections
        ? instances[i1]
        : instances[i2];
    };
  }

  // Adaptive Load Balancing (based on real-time metrics)
  static createAdaptive(instances) {
    return function getNext() {
      // Calculate scores based on multiple factors
      const scored = instances.map(instance => ({
        instance,
        score: calculateScore(instance)
      }));

      // Sort by score (lower is better)
      scored.sort((a, b) => a.score - b.score);
      
      // Weighted random from top 3
      const topN = scored.slice(0, Math.min(3, scored.length));
      const totalWeight = topN.reduce((sum, s) => sum + (1 / s.score), 0);
      
      let random = Math.random() * totalWeight;
      for (const { instance, score } of topN) {
        random -= (1 / score);
        if (random <= 0) return instance;
      }
      
      return topN[0].instance;
    };

    function calculateScore(instance) {
      // Lower score = better. Missing metrics default to 0 so the score is never NaN.
      const connectionScore = (instance.activeConnections ?? 0) * 0.3;
      const latencyScore = (instance.avgResponseTime ?? 0) * 0.4; // assumed to be in milliseconds
      const errorScore = (instance.errorRate ?? 0) * 100 * 0.3;   // assumed to be a fraction in [0, 1]
      
      return connectionScore + latencyScore + errorScore + 0.001; // Avoid division by zero
    }
  }
}

module.exports = { LoadBalancingAlgorithms };

Health Checking Strategies

┌─────────────────────────────────────────────────────────────────────────────┐
│                    HEALTH CHECK PATTERNS                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  LIVENESS vs READINESS                                                      │
│  ─────────────────────────                                                  │
│                                                                              │
│  LIVENESS: Is the process running?                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  GET /healthz                                                        │   │
│  │  → 200 OK: Process is alive                                          │   │
│  │  → 5xx: Process is dead, restart it                                  │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│  READINESS: Can it handle traffic?                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  GET /ready                                                          │   │
│  │  → 200 OK: Ready to receive traffic                                  │   │
│  │  → 503: Not ready (warming up, dependencies down)                    │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│  DEEP HEALTH CHECK (with dependencies)                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  GET /health/deep                                                    │   │
│  │  {                                                                   │   │
│  │    "status": "degraded",                                             │   │
│  │    "checks": {                                                       │   │
│  │      "database": { "status": "healthy", "latency": "5ms" },         │   │
│  │      "redis": { "status": "healthy", "latency": "2ms" },            │   │
│  │      "external-api": { "status": "unhealthy", "error": "timeout" } │   │
│  │    }                                                                 │   │
│  │  }                                                                   │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Comprehensive Health Check Service

// health-check-service.js

const express = require('express');

class HealthCheckService {
  constructor() {
    this.checks = new Map();
    this.status = 'starting';
    this.startTime = Date.now();
  }

  // Register a health check
  registerCheck(name, checkFn, options = {}) {
    this.checks.set(name, {
      fn: checkFn,
      critical: options.critical !== false, // Default to critical
      timeout: options.timeout || 5000,
      interval: options.interval || 30000,
      lastResult: null,
      lastCheck: null
    });

    // Start periodic checking (falls back to the 30s default when no interval is given)
    setInterval(() => this.runCheck(name), options.interval || 30000);
  }

  async runCheck(name) {
    const check = this.checks.get(name);
    if (!check) return null;

    const startTime = Date.now();
    
    let timer;
    try {
      const result = await Promise.race([
        check.fn(),
        new Promise((_, reject) => {
          timer = setTimeout(() => reject(new Error('Health check timeout')), check.timeout);
        })
      ]);

      check.lastResult = {
        latency: Date.now() - startTime,
        ...result,
        status: 'healthy' // set last so fields returned by the check cannot override it
      };
    } catch (error) {
      check.lastResult = {
        status: 'unhealthy',
        error: error.message,
        latency: Date.now() - startTime
      };
    } finally {
      clearTimeout(timer); // do not leave the timeout pending after the check settles
    }

    check.lastCheck = Date.now();
    return check.lastResult;
  }

  async runAllChecks() {
    const results = {};
    
    for (const [name, check] of this.checks) {
      results[name] = await this.runCheck(name);
    }

    return results;
  }

  getOverallStatus(checkResults) {
    const criticalChecks = Array.from(this.checks.entries())
      .filter(([_, check]) => check.critical)
      .map(([name]) => name);

    const hasUnhealthyCritical = criticalChecks.some(
      name => checkResults[name]?.status === 'unhealthy'
    );

    const hasAnyUnhealthy = Object.values(checkResults).some(
      r => r?.status === 'unhealthy'
    );

    if (hasUnhealthyCritical) return 'unhealthy';
    if (hasAnyUnhealthy) return 'degraded';
    return 'healthy';
  }

  // Express middleware for health endpoints
  createRouter() {
    const router = express.Router();

    // Liveness probe - is the process running?
    router.get('/healthz', (req, res) => {
      res.status(200).json({
        status: 'alive',
        uptime: Date.now() - this.startTime
      });
    });

    // Readiness probe - can we handle traffic?
    router.get('/ready', async (req, res) => {
      if (this.status !== 'ready') {
        return res.status(503).json({
          status: this.status,
          message: 'Service not ready'
        });
      }

      // Quick check of critical dependencies
      const criticalResults = {};
      for (const [name, check] of this.checks) {
        if (check.critical) {
          criticalResults[name] = check.lastResult;
        }
      }

      const hasUnhealthy = Object.values(criticalResults).some(
        r => r?.status === 'unhealthy'
      );

      if (hasUnhealthy) {
        return res.status(503).json({
          status: 'not_ready',
          checks: criticalResults
        });
      }

      res.status(200).json({ status: 'ready' });
    });

    // Deep health check - detailed status of all dependencies
    router.get('/health', async (req, res) => {
      const checkResults = await this.runAllChecks();
      const overallStatus = this.getOverallStatus(checkResults);

      // 'healthy' and 'degraded' both return 200; only 'unhealthy' returns 503
      const statusCode = overallStatus === 'unhealthy' ? 503 : 200;

      res.status(statusCode).json({
        status: overallStatus,
        timestamp: new Date().toISOString(),
        uptime: Date.now() - this.startTime,
        version: process.env.APP_VERSION || 'unknown',
        checks: checkResults
      });
    });

    return router;
  }

  setReady() {
    this.status = 'ready';
  }

  setNotReady(reason) {
    this.status = reason || 'not_ready';
  }
}

// Usage example (assumes existing `pool` and `redis` clients)
const healthService = new HealthCheckService();

// Register database check
healthService.registerCheck('database', async () => {
  const start = Date.now();
  await pool.query('SELECT 1');
  return { latency: Date.now() - start };
}, { critical: true, interval: 30000 });

// Register Redis check
healthService.registerCheck('redis', async () => {
  const start = Date.now();
  await redis.ping();
  return { latency: Date.now() - start };
}, { critical: true, interval: 30000 });

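// Register external API check (the URL is shown for illustration)
healthService.registerCheck('payment-api', async () => {
  const response = await fetch('https://api.stripe.com/v1/health', {
    signal: AbortSignal.timeout(5000) // fetch ignores a `timeout` option
  });
  // Throw so that runCheck records the check as unhealthy
  if (!response.ok) throw new Error(`Payment API returned ${response.status}`);
  return { reachable: true };
}, { critical: false, interval: 60000 });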

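// Mount health routes (assumes an existing Express `app`)
app.use(healthService.createRouter());

// Mark service as ready after initialization.
// Wrapped in an async function because CommonJS modules do not allow top-level await.
async function start() {
  await initializeDatabase();
  await warmUpCaches();
  healthService.setReady();
}

start().catch((error) => {
  healthService.setNotReady('startup_failed');
  console.error(error);
});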

module.exports = { HealthCheckService };

Load Balancer Patterns

Kubernetes Service Load Balancing

# kubernetes-lb.yaml

# ClusterIP Service (internal load balancing)
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  type: ClusterIP
  selector:
    app: user-service
  ports:
    - port: 80
      targetPort: 3000
  sessionAffinity: None  # or ClientIP for sticky sessions

---
# Headless Service (for client-side LB with service discovery)
apiVersion: v1
kind: Service
metadata:
  name: user-service-headless
spec:
  clusterIP: None
  selector:
    app: user-service
  ports:
    - port: 3000

---
# Deployment with health checks
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: user-service:latest
          ports:
            - containerPort: 3000
          
          # Liveness probe - restart if fails
          livenessProbe:
            httpGet:
              path: /healthz
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
            failureThreshold: 3
            
          # Readiness probe - remove from LB if fails
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3
            
          # Startup probe - for slow starting containers
          startupProbe:
            httpGet:
              path: /healthz
              port: 3000
            failureThreshold: 30
            periodSeconds: 10

Envoy Proxy Configuration

# envoy-lb.yaml - Advanced L7 load balancing

static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: backend
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/api/users"
                          route:
                            cluster: user_service
                            timeout: 30s
                            retry_policy:
                              retry_on: "5xx,reset,connect-failure"
                              num_retries: 3
                              per_try_timeout: 10s

  clusters:
    - name: user_service
      connect_timeout: 5s
      type: STRICT_DNS
      lb_policy: LEAST_REQUEST  # Fewest active requests (power of two choices by default)
      
      # Circuit breaker
      circuit_breakers:
        thresholds:
          - priority: DEFAULT
            max_connections: 1000
            max_pending_requests: 1000
            max_requests: 1000
            max_retries: 3
            
      # Health checking
      health_checks:
        - timeout: 5s
          interval: 10s
          unhealthy_threshold: 3
          healthy_threshold: 2
          http_health_check:
            path: "/health"
            
      # Outlier detection (automatic ejection of unhealthy hosts)
      outlier_detection:
        consecutive_5xx: 5
        interval: 10s
        base_ejection_time: 30s
        max_ejection_percent: 50
        
      load_assignment:
        cluster_name: user_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: user-1.internal
                      port_value: 3000
              - endpoint:
                  address:
                    socket_address:
                      address: user-2.internal
                      port_value: 3000
              - endpoint:
                  address:
                    socket_address:
                      address: user-3.internal
                      port_value: 3000

Interview Questions

Q1: Client-side vs Server-side load balancing?

Answer: Server-side (e.g., NGINX, HAProxy):

Single point for routing
Simple clients
Extra network hop
Centralized control

Client-side (e.g., Ribbon, custom):

Client decides which server
No extra hop
More complex clients
Well suited to service-to-service calls between microservices

When to use each:

Server-side: External traffic, legacy clients
Client-side: Service-to-service within cluster
Hybrid: Edge LB + client-side internally

Q2: When would you use Consistent Hashing?

Answer: Use cases:

Cache servers (minimize cache misses on scale)
Session affinity without IP hash
Partitioned data (same key → same server)

How it works:

Servers and keys mapped to same hash ring
Key routed to next server clockwise
Adding or removing a server remaps only the keys next to it on the ring (roughly 1/N of all keys for N servers)

Virtual nodes:

Multiple positions per server for balance
100-200 virtual nodes per physical server
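
To make the "minimal redistribution" claim concrete, the following sketch builds a ring for three cache servers, adds a fourth, and counts how many keys change owner. It is an illustration under stated assumptions rather than the chapter's implementation: the names `ringHash`, `buildRing`, and `lookup`, the `cache-N` server names, and the use of MD5 (chosen only for its even spread, not for security) are all invented for the example.

```javascript
const crypto = require('crypto');

// Hash a string to an unsigned 32-bit ring position (first 4 bytes of MD5).
function ringHash(str) {
  return crypto.createHash('md5').update(str).digest().readUInt32BE(0);
}

// Build a sorted ring of { position, server } entries with virtual nodes.
function buildRing(servers, virtualNodes = 150) {
  const ring = [];
  for (const server of servers) {
    for (let i = 0; i < virtualNodes; i++) {
      ring.push({ position: ringHash(`${server}#${i}`), server });
    }
  }
  return ring.sort((a, b) => a.position - b.position);
}

// Route a key to the first virtual node clockwise from its hash.
function lookup(ring, key) {
  const h = ringHash(key);
  let low = 0;
  let high = ring.length;
  while (low < high) {
    const mid = (low + high) >>> 1;
    if (ring[mid].position < h) low = mid + 1;
    else high = mid;
  }
  return ring[low % ring.length].server; // wrap around past the last node
}

const keys = Array.from({ length: 10000 }, (_, i) => `user:${i}`);
const before = buildRing(['cache-1', 'cache-2', 'cache-3']);
const after = buildRing(['cache-1', 'cache-2', 'cache-3', 'cache-4']);

// Keys whose owner changed, and how many of those went to the new server
const moved = keys.filter(k => lookup(before, k) !== lookup(after, k));
const movedToNew = moved.filter(k => lookup(after, k) === 'cache-4');

console.log(`moved: ${(100 * moved.length / keys.length).toFixed(1)}% of keys`);
console.log(`every moved key landed on cache-4: ${moved.length === movedToNew.length}`);
```

Only keys claimed by the new server move, so about a quarter of them are remapped when going from three to four servers. With plain `hash(key) % N` routing, the same change would remap roughly three quarters of the keys.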

Q3: What's the difference between liveness and readiness probes?

Answer: Liveness:

“Is the process stuck?”
Failure → container restart
Should be simple (no dependencies)
Example: Can the HTTP server respond?

Readiness:

“Can it handle traffic?”
Failure → remove from load balancer
Can check dependencies
Example: Is database connection ready?

Common mistake: Using deep checks for liveness causes cascading restarts when a dependency is down.

Q4: How does Power of Two Choices (P2C) work?

Answer: Simple but effective algorithm:

Pick 2 random servers
Choose the one with fewer connections

Why it works:

Avoids herd behavior (all clients picking same “best” server)
O(1) complexity (no sorting)
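Statistical guarantee: with n requests spread over n servers, the maximum load is O(log log n) with high probability, versus O(log n / log log n) for a single random choice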

Used by: Envoy, HAProxy, Netflix Zuul

Better than round-robin because it considers actual load.
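
The effect of the second choice is easy to check with a small simulation. The sketch below is a simplified balls-into-bins model, not a benchmark: it assigns one long-lived connection per server on average and compares the busiest server under purely random selection with the busiest server under P2C. The helper names (`mulberry32`, `maxLoad`, `pickRandom`, `pickP2C`) and the seed are assumptions made for the example; the seeded generator only keeps runs repeatable.

```javascript
// Small deterministic PRNG (mulberry32) so that runs are repeatable.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Assign `requests` long-lived connections to `n` servers using `pick`
// and return the highest connection count any single server ends up with.
function maxLoad(n, requests, pick) {
  const load = new Array(n).fill(0);
  for (let r = 0; r < requests; r++) {
    load[pick(load)]++;
  }
  return load.reduce((max, value) => Math.max(max, value), 0);
}

const rand = mulberry32(42);
const randomIndex = n => Math.floor(rand() * n);

// Baseline: one uniformly random server
const pickRandom = load => randomIndex(load.length);

// P2C: two distinct random servers, keep the one with fewer connections
const pickP2C = load => {
  const a = randomIndex(load.length);
  let b = randomIndex(load.length);
  while (b === a) b = randomIndex(load.length);
  return load[a] <= load[b] ? a : b;
};

const servers = 10000;
const randomMax = maxLoad(servers, servers, pickRandom);
const p2cMax = maxLoad(servers, servers, pickP2C);

console.log({ randomMax, p2cMax });
```

In this model the busiest server under P2C carries noticeably fewer connections than under a single random choice, even though each decision inspects only two servers.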

Chapter Summary

Key Takeaways:

Server-side LB for external traffic, client-side for internal
Algorithm choice depends on workload: Round-robin for simple, Least-connections for varying load
Consistent hashing for caching and stateful services
Implement both liveness and readiness probes
Health checks should have appropriate timeouts
Use circuit breakers with load balancing for resilience

Next Chapter: Migration Patterns - Strangler Fig, Branch by Abstraction, and more.
