
Resilience Patterns

In distributed systems, failures are inevitable: networks lag and services crash. Your system must stay responsive even when parts of it are broken. This section uses Resilience4j, a lightweight fault-tolerance library designed for Java 8 and functional programming.

1. The Circuit Breaker Pattern

If Inventory Service is slow or down, every call from Order Service blocks a thread while it waits. Eventually Order Service runs out of threads and fails too (a cascading failure). A Circuit Breaker tracks recent failures and, once there are too many, “opens the circuit”: calls fail fast instead of waiting for a timeout, which frees the caller's threads and gives the downstream service time to recover. The breaker has three states:
  1. CLOSED: Normal operation. Requests pass through.
  2. OPEN: Too many failures. Requests fail immediately.
  3. HALF-OPEN: Testing if the service is back online. Lets a few requests through; if they succeed the circuit closes, otherwise it opens again.
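
The three states can be sketched as a small state machine. This is a simplified, single-threaded illustration and not Resilience4j's implementation: the class name SimpleCircuitBreaker, the consecutive-failure policy, and passing the current time as a parameter are all assumptions made to keep the example short and testable.

```java
import java.util.function.Supplier;

// Hypothetical, single-threaded sketch of the three states; not Resilience4j's implementation.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures that trip the breaker (assumed policy)
    private final long openMillis;      // how long to stay OPEN before allowing a trial call
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    State state() {
        return state;
    }

    // nowMillis is passed in so the behaviour can be checked without sleeping.
    <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < openMillis) {
                return fallback.get();   // fail fast: the remote call is not attempted
            }
            state = State.HALF_OPEN;     // wait elapsed: let one trial call through
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            state = State.CLOSED;        // success keeps (or makes) the circuit closed
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;      // trip, or re-open after a failed trial call
                openedAt = nowMillis;
            }
            return fallback.get();
        }
    }
}
```

While the breaker is OPEN the action is never invoked, which is what protects the caller's threads. A real breaker would also need to be thread-safe and would usually limit how many trial calls run in HALF-OPEN.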

2. Implementation

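Add the dependency spring-cloud-starter-circuitbreaker-resilience4j. The annotations below are applied through Spring AOP, so spring-boot-starter-aop is typically needed on the classpath as well.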
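import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final ProductClient productClient;

    // Constructor injection; the final field must be assigned here
    public OrderService(ProductClient productClient) {
        this.productClient = productClient;
    }

    @CircuitBreaker(name = "productService", fallbackMethod = "fallbackProduct")
    public ProductDto getProduct(Long id) {
        return productClient.getProduct(id);
    }

    // Fallback: same return type and parameters as getProduct, plus a trailing Throwable
    public ProductDto fallbackProduct(Long id, Throwable t) {
        // Return a default product or cached version
        return new ProductDto(id, "Default Product", 0.0);
    }
}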
Configuration (application.yml):
resilience4j:
  circuitbreaker:
    instances:
      productService:
        registerHealthIndicator: true
        slidingWindowSize: 10 # Evaluate the last 10 calls
        failureRateThreshold: 50 # Open the circuit when at least 50% of them fail
        waitDurationInOpenState: 5s # Stay open for 5s before moving to half-open
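
To make slidingWindowSize and failureRateThreshold concrete, here is a minimal sketch of a count-based window that tracks the outcome of the last N calls and compares the failure percentage with a threshold. The class CountBasedWindow and its methods are hypothetical, and waiting for a full window before evaluating is a simplifying assumption; it is not how Resilience4j stores its metrics internally.

```java
// Hypothetical sketch of a count-based sliding window; not Resilience4j's internal data structure.
class CountBasedWindow {
    private final boolean[] failed; // ring buffer: true means the call failed
    private int recorded = 0;       // total calls recorded so far
    private int failures = 0;       // failures currently inside the window

    CountBasedWindow(int slidingWindowSize) {
        this.failed = new boolean[slidingWindowSize];
    }

    void record(boolean callFailed) {
        int slot = recorded % failed.length;
        if (recorded >= failed.length && failed[slot]) {
            failures--;             // the oldest outcome drops out of the window
        }
        failed[slot] = callFailed;
        if (callFailed) {
            failures++;
        }
        recorded++;
    }

    // Failure rate in percent over the calls currently in the window.
    double failureRate() {
        int size = Math.min(recorded, failed.length);
        return size == 0 ? 0.0 : 100.0 * failures / size;
    }

    // Assumed rule: evaluate only once the window is full, then compare rate >= threshold.
    boolean shouldOpen(double failureRateThreshold) {
        return recorded >= failed.length && failureRate() >= failureRateThreshold;
    }
}
```

With a window of 10 and a threshold of 50, six successes followed by four failures give a 40% rate and the circuit stays closed; one more failure pushes the oldest success out of the window, the rate reaches 50%, and the breaker would open.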

3. Retry Pattern

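For transient failures such as a brief network blip, simply trying again often succeeds. Retry only operations that are safe to repeat (idempotent), because a request that timed out may still have been processed.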
@Retry(name = "productService")
public ProductDto getProduct(Long id) {
    return productClient.getProduct(id);
}
Config:
resilience4j:
  retry:
    instances:
      productService:
        maxAttempts: 3 # Total attempts, including the initial call
        waitDuration: 1s # Pause between attempts
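
What these two settings mean can be shown with a hand-written loop. The helper below is a hypothetical sketch (SimpleRetry and withRetry are made-up names, and it retries on any RuntimeException); it is not the Resilience4j Retry API, only an illustration of a fixed number of attempts with a fixed pause between them.

```java
import java.util.function.Supplier;

// Hypothetical helper illustrating maxAttempts / waitDuration; not the Resilience4j Retry API.
class SimpleRetry {

    // Runs the action up to maxAttempts times in total, pausing waitMillis between attempts.
    static <T> T withRetry(Supplier<T> action, int maxAttempts, long waitMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;               // remember the failure in case this was the last attempt
                if (attempt < maxAttempts) {
                    sleep(waitMillis);  // no pause after the final attempt
                }
            }
        }
        throw last;                     // every attempt failed: surface the last error
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", ie);
        }
    }
}
```

With maxAttempts set to 3, a call that fails twice and then succeeds returns normally on the third attempt; a call that keeps failing throws the last exception after three tries.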

4. Rate Limiting

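Prevent one user or service from overwhelming your system by capping how many calls are allowed in a given time period. Calls beyond the limit are rejected or made to wait.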
@RateLimiter(name = "standard")
public String limit() {
    return "You are within limits";
}
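
The idea behind a rate limiter can be sketched as a fixed-window counter: allow at most a fixed number of calls per period and reset the count when the period rolls over. The class below is hypothetical; its parameter names echo Resilience4j's limitForPeriod and limitRefreshPeriod settings, but the code is an illustration under those assumptions, not the library's algorithm.

```java
// Hypothetical fixed-window limiter; names echo Resilience4j's limitForPeriod /
// limitRefreshPeriod settings, but this is not the library's implementation.
class FixedWindowRateLimiter {
    private final int limitForPeriod;       // calls allowed per period
    private final long refreshPeriodMillis; // length of one period
    private long windowStart;               // start time of the current period
    private int used = 0;                   // calls already granted in this period

    FixedWindowRateLimiter(int limitForPeriod, long refreshPeriodMillis, long nowMillis) {
        this.limitForPeriod = limitForPeriod;
        this.refreshPeriodMillis = refreshPeriodMillis;
        this.windowStart = nowMillis;
    }

    // Returns true if the call is within the limit; the clock is passed in for testability.
    synchronized boolean tryAcquire(long nowMillis) {
        if (nowMillis - windowStart >= refreshPeriodMillis) {
            windowStart = nowMillis;        // a new period begins: reset the counter
            used = 0;
        }
        if (used < limitForPeriod) {
            used++;
            return true;
        }
        return false;                       // over the limit for this period
    }
}
```

With a limit of 2 per 1000 ms, the first two calls in a period are granted, the third is refused, and a call made once the period has elapsed is granted again.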

5. Bulkhead Pattern

Isolate resources so that when one part of the system is exhausted, the others are not affected. A bulkhead gives each kind of call its own limited capacity, either a dedicated thread pool or a cap on concurrent calls (Resilience4j supports both). If the “Image Processing” pool is full, the “User Login” pool still works fine.
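
A bulkhead that caps concurrent calls can be sketched with a semaphore. SimpleBulkhead below is a hypothetical class written for illustration (the maxConcurrentCalls name mirrors the Resilience4j setting); it rejects immediately when no permit is free, whereas a production bulkhead might wait for a short time first.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical semaphore-based bulkhead; a sketch, not the Resilience4j Bulkhead API.
class SimpleBulkhead {
    private final Semaphore permits;

    SimpleBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    // Runs the action if a permit is free; otherwise returns the rejection result at once.
    <T> T execute(Supplier<T> action, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get();  // this compartment is full; other bulkheads are unaffected
        }
        try {
            return action.get();
        } finally {
            permits.release();      // always free the permit, even if the action throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Giving image processing and user login their own SimpleBulkhead instances means that saturating one leaves the other's permits untouched, which is the isolation the pattern is after.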
Summary:
  • Circuit Breaker: Stop calling a dead service.
  • Retry: Try again for temporary glitches.
  • Rate Limiter: Control traffic flow.
  • Bulkhead: Isolate failures.