Observability in Microservices

When you have 50 services, “tailing the logs” is impossible. You need centralized observability.

1. The Three Pillars

  1. Logs: Immutable record of discrete events. (“Error at 10:00 PM”).
  2. Metrics: Aggregated data over time. (“CPU usage is 80%”, “Requests per second: 50”).
  3. Tracing: The path of a single request across multiple services.

2. Distributed Tracing with Zipkin/Micrometer

Spring Boot 3 uses Micrometer Tracing, the successor to Spring Cloud Sleuth. Dependencies:
  • io.micrometer:micrometer-tracing-bridge-brave
  • io.zipkin.reporter2:zipkin-reporter-brave
How it works: Every request gets a unique Trace ID (global, shared by every service the request touches), and each unit of work within it gets its own Span ID (local). These IDs are propagated between services via HTTP headers (traceparent).
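To make the propagation concrete, here is a minimal sketch of reading and writing a W3C traceparent header value in plain Java. The TraceParent record and its method names are hypothetical and only illustrate the header layout (version, trace ID, span ID, flags). In a real application, Micrometer Tracing does this for you.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical model of a W3C traceparent value: version-traceId-spanId-flags, in hex. */
record TraceParent(String traceId, String spanId, boolean sampled) {

    /** Parses a value such as 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01. */
    static TraceParent parse(String header) {
        String[] parts = header.trim().split("-");
        if (parts.length != 4 || parts[1].length() != 32 || parts[2].length() != 16) {
            throw new IllegalArgumentException("Malformed traceparent: " + header);
        }
        int flags = Integer.parseInt(parts[3], 16);
        // The lowest flag bit marks the trace as sampled
        return new TraceParent(parts[1], parts[2], (flags & 1) == 1);
    }

    /** New span for an outgoing call: same Trace ID (global), fresh Span ID (local). */
    TraceParent newChildSpan() {
        String childSpanId = String.format("%016x", ThreadLocalRandom.current().nextLong());
        return new TraceParent(traceId, childSpanId, sampled);
    }

    /** Serializes back to the header value sent to the downstream service. */
    String toHeader() {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }
}
```

When Order Service calls Inventory Service, the outgoing request would carry the same trace ID with a new span ID, which is what lets Zipkin stitch both services into one timeline.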
Running Zipkin:
docker run -d -p 9411:9411 openzipkin/zipkin
Config (application.yml):
management:
  tracing:
    sampling:
      probability: 1.0 # Sample 100% of requests (Don't do this in prod!)
Now, when you hit Order Service, which calls Inventory Service, you can see the full timeline in Zipkin UI (http://localhost:9411).
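The sampling probability configured above controls what fraction of traces are recorded. The following is a simplified sketch of the idea, not the sampler Micrometer Tracing or Brave actually uses. ProbabilitySampler is a hypothetical class name.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sampler: keeps roughly the given fraction of new traces. */
final class ProbabilitySampler {

    private final double probability;

    ProbabilitySampler(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.probability = probability;
    }

    /**
     * Decides once, when a trace starts. Downstream services normally follow
     * the decision carried in the propagated headers instead of deciding again.
     */
    boolean shouldSample() {
        // nextDouble() is in [0, 1), so 1.0 always samples and 0.0 never does
        return ThreadLocalRandom.current().nextDouble() < probability;
    }
}
```

With probability 1.0 every request is traced, which is convenient locally but adds overhead and storage cost in production. A lower value such as 0.1 keeps about one trace in ten.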

3. Metrics with Prometheus & Grafana

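Actuator exposes metrics at /actuator/metrics. With the Prometheus registry on the classpath, it also serves them in Prometheus format at /actuator/prometheus, which is the endpoint Prometheus scrapes. Dependency: io.micrometer:micrometer-registry-prometheus. Config: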
management:
  endpoints:
    web:
      exposure:
        include: prometheus
Prometheus Config (prometheus.yml):
scrape_configs:
  - job_name: 'spring_micrometer'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['host.docker.internal:8080']
Grafana: Connect Grafana to Prometheus as a data source and import a community dashboard such as JVM (Micrometer) (ID: 4701). You’ll get ready-made graphs for JVM memory, GC pauses, and HTTP throughput.
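It helps to know roughly what Prometheus receives when it scrapes the endpoint. The sketch below renders a single counter in the Prometheus text format. PrometheusText is a hypothetical helper and orders.placed is an illustrative metric name. The real output is produced by the Micrometer Prometheus registry, which also converts dotted names to snake_case and adds a _total suffix to counters.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical helper that mimics the shape of one scraped counter. */
final class PrometheusText {

    /** Dotted Micrometer-style name to Prometheus name, with the counter suffix. */
    static String counterName(String micrometerName) {
        String base = micrometerName.replace('.', '_');
        return base.endsWith("_total") ? base : base + "_total";
    }

    /** Renders a TYPE line followed by one sample line: name{labels} value. */
    static String renderCounter(String micrometerName, Map<String, String> tags, double value) {
        String name = counterName(micrometerName);
        // Sort labels so the output is stable
        String labels = new TreeMap<>(tags).entrySet().stream()
                .map(e -> e.getKey() + "=\"" + e.getValue() + "\"")
                .collect(Collectors.joining(",", "{", "}"));
        return "# TYPE " + name + " counter\n" + name + labels + " " + value + "\n";
    }
}
```

Calling renderCounter("orders.placed", Map.of("status", "success"), 3.0) yields a TYPE line followed by orders_placed_total{status="success"} 3.0, which is the kind of line you can query in Grafana.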

4. Centralized Logging (ELK / Loki)

Don’t write logs to files inside the container. Write to the console (STDOUT) and let a log shipper (Fluentd/Promtail) forward them to Elasticsearch or Loki.
Lombok Logging:
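import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Slf4j // Lombok generates the static "log" field
@Service
public class OrderService {
    public void createOrder() {
        // With Micrometer Tracing on the classpath, the trace ID and span ID are put in the
        // logging MDC; Spring Boot 3.2+ prints them by default, while older 3.x versions
        // need them added to logging.pattern.level
        log.info("Creating order...");
    }
}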
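The trace and span IDs reach the log line through a per-thread context (the MDC in SLF4J). The sketch below imitates that mechanism in plain Java to show the idea. LogContext is a hypothetical class, and the line layout is an assumption rather than Spring Boot's exact pattern.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a logging MDC: key/value pairs bound to the current thread. */
final class LogContext {

    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    /** Called by the tracing layer when a span starts on this thread. */
    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    /** Called when the span ends, so IDs do not leak into the next request. */
    static void clear() {
        CONTEXT.remove();
    }

    /** Formats a line the way a log pattern would, pulling IDs from the context. */
    static String format(String level, String message) {
        Map<String, String> ctx = CONTEXT.get();
        return String.format("%5s [%s,%s] %s", level,
                ctx.getOrDefault("traceId", "-"), ctx.getOrDefault("spanId", "-"), message);
    }
}
```

Because every service logs the same trace ID for a given request, you can search Elasticsearch or Loki for that ID and see the request's log lines across all services.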
Metrics follow a separate path: Prometheus scrapes /actuator/prometheus at its configured interval (for example, every 15s), and Grafana visualizes the data.

5. Deep Dive: Spring Boot Actuator

Actuator exposes operational information about your running application.

Enabling Actuator

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
By default, only the health endpoint is exposed over HTTP, for security reasons. To expose all endpoints:
management:
  endpoints:
    web:
      exposure:
        include: "*" # WARNING: Don't do this in production without security

Key Endpoints

Endpoint                    Description
/actuator/health            Application health (UP/DOWN). Includes DB, disk, etc.
/actuator/info              Application metadata (version, Git commit).
/actuator/metrics           All available metrics.
/actuator/metrics/{name}    Specific metric (e.g., jvm.memory.used).
/actuator/env               Environment properties.
/actuator/loggers           View/change log levels at runtime.
/actuator/prometheus        Prometheus-formatted metrics.
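As an illustration of using these endpoints from code, the sketch below builds two requests with the JDK HTTP client: a health check and a runtime log-level change. ActuatorRequests is a hypothetical helper, the base URL and logger name are placeholders, and the loggers endpoint must be exposed (and the caller authorized) for the POST to succeed.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that builds requests against Actuator endpoints. */
final class ActuatorRequests {

    /** GET {baseUrl}/actuator/health */
    static HttpRequest health(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/actuator/health"))
                .GET()
                .build();
    }

    /** POST {baseUrl}/actuator/loggers/{loggerName} with the level to apply. */
    static HttpRequest setLogLevel(String baseUrl, String loggerName, String level) {
        String body = "{\"configuredLevel\":\"" + level + "\"}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/actuator/loggers/" + loggerName))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

To send a request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Setting a package to DEBUG this way takes effect without a restart, which is handy when diagnosing a live issue.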

Securing Actuator

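import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class SecurityConfig {
    @Bean
    public SecurityFilterChain actuatorSecurity(HttpSecurity http) throws Exception {
        http.authorizeHttpRequests(auth -> auth
                // Only admins may reach operational endpoints
                .requestMatchers("/actuator/**").hasRole("ADMIN")
                .anyRequest().authenticated()
        )
        // Assumption: HTTP Basic as the login mechanism; swap in your own
        .httpBasic(Customizer.withDefaults());
        return http.build();
    }
}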

6. Custom Metrics with Micrometer

Track your own business KPIs.
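import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final MeterRegistry meterRegistry;

    public OrderService(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        // Gauge (current value): register once; Micrometer calls the function on each read
        Gauge.builder("orders.pending", this, OrderService::getPendingOrderCount)
                .register(meterRegistry);
    }

    public void placeOrder(Order order) {
        // Record time
        Timer.Sample sample = Timer.start(meterRegistry);
        try {
            processOrder(order);
            // Increment counter only after the order was processed
            meterRegistry.counter("orders.placed", "status", "success").increment();
        } finally {
            sample.stop(meterRegistry.timer("order.processing.time"));
        }
    }
}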
Metric Types:
  • Counter: Monotonically increasing (e.g., requests served).
  • Gauge: Current value (e.g., active connections).
  • Timer: Duration of events (e.g., request latency).
  • Distribution Summary: Statistical summary (e.g., request size).
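The difference between these types is easiest to see in terms of what each one stores. The toy classes below are hypothetical and far simpler than Micrometer's real meters. They exist only to show the semantics: a counter accumulates, a gauge samples a live value, and a timer keeps a count, a total, and a maximum.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.DoubleSupplier;

/** Counter: only ever goes up; rates are derived by the monitoring system. */
final class ToyCounter {
    private final AtomicLong count = new AtomicLong();
    void increment() { count.incrementAndGet(); }
    long count() { return count.get(); }
}

/** Gauge: stores nothing itself; it reads the current value when asked. */
final class ToyGauge {
    private final DoubleSupplier source;
    ToyGauge(DoubleSupplier source) { this.source = source; }
    double value() { return source.getAsDouble(); }
}

/** Timer: tracks how many events happened and how long they took. */
final class ToyTimer {
    private long count;
    private long totalNanos;
    private long maxNanos;

    synchronized void record(long nanos) {
        count++;
        totalNanos += nanos;
        maxNanos = Math.max(maxNanos, nanos);
    }
    synchronized long count() { return count; }
    synchronized long totalNanos() { return totalNanos; }
    synchronized long maxNanos() { return maxNanos; }
}
```

A distribution summary has the same shape as the timer but records arbitrary amounts (such as payload bytes) instead of durations. As a rule of thumb, use a gauge only for values that can go down, and a counter for anything you would want to see as a rate.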

Monitoring Flow (Complete Picture)