Interview Preparation

Master the most common microservices interview questions asked at top tech companies.
What This Chapter Covers:
  • Common interview questions with answers
  • System design exercises
  • Behavioral questions about microservices
  • Whiteboard coding challenges
  • Tips for success

Common Interview Questions

Architecture & Design

Choose Microservices when:
  • Team is large (multiple teams need autonomy)
  • Different parts need different scaling
  • Need technology diversity
  • Complex domain with clear boundaries
  • Organization is ready for DevOps culture
Choose Monolith when:
  • Small team (< 10 developers)
  • Simple domain
  • Startup/MVP phase
  • Unclear boundaries
  • Limited DevOps expertise
Key Insight: Start with a well-structured monolith and extract services when needed. Adopting microservices prematurely is a common mistake.
Options:
  1. Saga Pattern (Preferred)
    • Choreography: Events trigger compensation
    • Orchestration: Central coordinator manages
  2. Event Sourcing
    • Store events, not state
    • Replay for consistency
  3. Two-Phase Commit (Avoid)
    • Blocking: participants hold locks while waiting for the coordinator, so it scales poorly
    • Coordinator is a single point of failure
Example (Choreography Saga):
Order Created → Payment Charged → Inventory Reserved → Order Confirmed
                     ↓ (failure)
             Refund Payment → Release Inventory → Cancel Order
Best Practice: Design for eventual consistency, use compensation over rollback.
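The compensation idea can be sketched as a small orchestrated saga. This is a minimal in-memory illustration, not a production implementation: `runSaga`, `buildOrderSaga`, and the step names are hypothetical, and the steps are synchronous only to keep the sketch short (real steps would be asynchronous service calls).

```javascript
// Hypothetical in-memory saga runner; step names mirror the order flow above.
function runSaga(steps, context) {
  const completed = [];
  for (const step of steps) {
    try {
      step.action(context);
      completed.push(step);
    } catch (err) {
      // Undo the steps that already succeeded, most recent first.
      for (const done of completed.reverse()) {
        done.compensate(context);
      }
      return { status: 'cancelled', failedStep: step.name, reason: err.message };
    }
  }
  return { status: 'confirmed' };
}

// Assumed two-step order workflow: charge payment, then reserve inventory.
function buildOrderSaga({ inStock }) {
  return [
    {
      name: 'chargePayment',
      action: (ctx) => ctx.log.push('payment charged'),
      compensate: (ctx) => ctx.log.push('payment refunded'),
    },
    {
      name: 'reserveInventory',
      action: (ctx) => {
        if (!inStock) throw new Error('out of stock');
        ctx.log.push('inventory reserved');
      },
      compensate: (ctx) => ctx.log.push('inventory released'),
    },
  ];
}

const okContext = { log: [] };
const okResult = runSaga(buildOrderSaga({ inStock: true }), okContext);

const failedContext = { log: [] };
const failedResult = runSaga(buildOrderSaga({ inStock: false }), failedContext);

console.log(okResult.status, failedResult.status, failedContext.log);
```

Only steps that actually completed are compensated, which is why the failed run refunds the payment but never releases inventory it did not reserve.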
Strategies:
  1. Eventual Consistency
    • Accept temporary inconsistency
    • Design idempotent operations
    • Use event-driven updates
  2. Outbox Pattern
    • Write to DB + outbox in same transaction
    • Separate process publishes events
    • Guarantees at-least-once delivery
  3. Change Data Capture (CDC)
    • Listen to database changes
    • Publish events from DB logs
    • Example: Debezium
Key Points:
  • Avoid distributed transactions
  • Design for failure recovery
  • Monitor for inconsistencies
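A rough sketch of how the outbox pattern and an idempotent consumer fit together. Everything here is an in-memory stand-in: `db`, `createOrder`, `relayOutbox`, and `handleOrderCreated` are hypothetical names, and a real implementation would write the order row and the outbox row inside one database transaction.

```javascript
// Hypothetical in-memory stand-ins for a database and a message broker.
const db = { orders: [], outbox: [] };

// "Transaction": the order row and its outbox row are written together.
function createOrder(order) {
  db.orders.push(order);
  db.outbox.push({
    eventId: `order-created-${order.id}`,
    type: 'OrderCreated',
    payload: order,
    published: false,
  });
}

// Relay: publishes pending rows, then marks them as published.
// If it crashed between publish and mark, the row would be sent again,
// which is why delivery is at-least-once rather than exactly-once.
function relayOutbox(publish) {
  for (const row of db.outbox.filter((r) => !r.published)) {
    publish(row);
    row.published = true;
  }
}

// Idempotent consumer: remembers processed event IDs and skips repeats.
const processedEventIds = new Set();
let reservations = 0;
function handleOrderCreated(event) {
  if (processedEventIds.has(event.eventId)) return;
  processedEventIds.add(event.eventId);
  reservations += 1;
}

createOrder({ id: 42, total: 99 });
relayOutbox(handleOrderCreated);
// Simulate the broker redelivering the same event.
handleOrderCreated(db.outbox[0]);

console.log(reservations);
```

Because the consumer deduplicates by event ID, the redelivery does not create a second reservation.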
Step-by-Step Approach:
  1. Identify Boundaries
    • Use Domain-Driven Design
    • Find bounded contexts
    • Look for natural seams
  2. Start with Edge Services
    • Authentication
    • Notifications
    • Low-risk, well-defined
  3. Strangler Fig Pattern
    • Route traffic through facade
    • Gradually extract functionality
    • No big bang migration
  4. Database Extraction
    • Identify service data
    • Create new database
    • Sync during transition
    • Switch reads, then writes
Common Mistake: Extracting services before understanding domain boundaries.
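The strangler fig facade from step 3 can be pictured as a routing decision made per request. The sketch below is illustrative only: the `rollout` table, the path prefixes, and the percentage values are assumptions, not part of any particular gateway product.

```javascript
// Hypothetical routing table: path prefix -> share of traffic (0-100)
// that should go to the extracted service.
const rollout = [
  { prefix: '/auth', percentToNewService: 100 },
  { prefix: '/notifications', percentToNewService: 25 },
];

// Deterministic bucket in [0, 100) so a given user always takes the same path.
function bucketFor(userId) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.charCodeAt(0)) % 100;
  }
  return hash;
}

// The facade decides per request; anything not listed stays on the monolith.
function chooseBackend(path, userId) {
  const rule = rollout.find((r) => path.startsWith(r.prefix));
  if (!rule) return 'monolith';
  return bucketFor(userId) < rule.percentToNewService ? 'new-service' : 'monolith';
}

console.log(chooseBackend('/auth/login', 'user-1'));
console.log(chooseBackend('/orders/7', 'user-1'));
```

Setting a prefix's percentage back to 0 sends all of its traffic to the monolith again, which gives a quick rollback path during the migration.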
CAP Theorem:
  • Consistency: All nodes see same data
  • Availability: Every request gets response
  • Partition Tolerance: System works despite network failures
Reality: Partitions cannot be ruled out in a distributed system, so during a partition you must choose between consistency and availability:
  • CP (Consistency + Partition): Reject requests until consistent (e.g., banking)
  • AP (Availability + Partition): Accept requests, sync later (e.g., shopping cart)
In Microservices:
  • Networks will fail → must handle partitions
  • Usually choose AP with eventual consistency
  • Use compensation for errors
Example:
  • Payment: CP - never double charge
  • Inventory display: AP - show slightly stale data

Communication Patterns

Synchronous (REST, gRPC):
  • Need immediate response
  • Query operations
  • Simple request-reply
  • Real-time requirements
Asynchronous (Events, Messages):
  • Fire and forget
  • Long-running operations
  • Decouple services
  • Handle spikes/backpressure
Hybrid Approach:
User → API (sync) → Order Service
                         ↓ (async)
                    Payment Event
                         ↓
                   Payment Service
                         ↓ (async)
                    Order Updated Event
Best Practice: Default to async, use sync only when necessary.
Strategies:
  1. URL Versioning: /api/v1/orders
  2. Header Versioning: Accept: application/vnd.api+json; version=1
  3. Query Parameter: /orders?version=1
Best Practices:
  • Support at least the current and previous version (N and N-1)
  • Deprecation warnings in responses
  • Clear migration documentation
  • Use semantic versioning
Breaking vs Non-Breaking:
  • Breaking: Remove field, change type, remove endpoint
  • Non-Breaking: Add optional field, new endpoint
Contract Testing: Catch breaking changes before deployment.
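The breaking vs non-breaking rules can be checked mechanically. The following is a simplified sketch, not a replacement for a contract-testing tool: the schema shape (`{ type, required }` per field) and the function name `findBreakingChanges` are assumptions made for illustration.

```javascript
// Hypothetical schema description: field name -> { type, required }.
const v1 = {
  id: { type: 'string', required: true },
  total: { type: 'number', required: true },
};
const v2 = {
  id: { type: 'string', required: true },
  total: { type: 'string', required: true },   // type changed
  coupon: { type: 'string', required: false }, // optional field added
};

function findBreakingChanges(oldSchema, newSchema) {
  const problems = [];
  for (const [field, oldDef] of Object.entries(oldSchema)) {
    const newDef = newSchema[field];
    if (!newDef) {
      problems.push(`removed field: ${field}`);
    } else if (newDef.type !== oldDef.type) {
      problems.push(`type changed: ${field} (${oldDef.type} -> ${newDef.type})`);
    }
  }
  for (const [field, newDef] of Object.entries(newSchema)) {
    // Assumption: a newly required field breaks callers that do not send it yet.
    if (!(field in oldSchema) && newDef.required) {
      problems.push(`new required field: ${field}`);
    }
  }
  return problems;
}

const problems = findBreakingChanges(v1, v2);
console.log(problems);
```

Here the optional `coupon` field passes, while the type change on `total` is reported. A check like this could run in CI to fail a build before a breaking change is deployed.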
Algorithms:
  1. Token Bucket
    • Tokens added at fixed rate
    • Request consumes token
    • Allows bursts
  2. Sliding Window
    • Count requests in time window
    • More accurate than fixed window
  3. Leaky Bucket
    • Fixed output rate
    • Queue excess requests
Implementation:
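// Redis-based rate limiter (fixed-window counter: 100 requests per 60 seconds)
const key = `ratelimit:${userId}`;
const count = await redis.incr(key);
if (count === 1) {
  // First request in this window: start the 60-second expiry.
  // INCR and EXPIRE are separate commands, so wrap them in a Lua script
  // or MULTI/EXEC if the key must never be left without a TTL.
  await redis.expire(key, 60);
}
if (count > 100) {
  return res.status(429).json({ error: 'Rate limit exceeded' });
}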
Headers: X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After
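The Redis snippet above counts requests per time window. For comparison, here is a minimal in-process sketch of the token bucket algorithm from the list. The class name, option names, and injectable `now` clock are choices made for this example so the behavior can be shown without real waiting.

```javascript
// Token bucket with an injectable clock (milliseconds).
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = () => Date.now() }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full, so an initial burst is allowed
    this.lastRefill = now();
  }

  tryConsume() {
    // Add the tokens earned since the last call, capped at capacity.
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond,
    );
    this.lastRefill = current;

    if (this.tokens < 1) return false; // caller would respond with 429
    this.tokens -= 1;
    return true;
  }
}

let fakeTime = 0;
const bucket = new TokenBucket({ capacity: 3, refillPerSecond: 1, now: () => fakeTime });

// A burst of four requests: three pass, the fourth is rejected.
const burst = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];

fakeTime += 2000; // two seconds later, two tokens have been refilled
const afterWait = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];

console.log(burst, afterWait);
```

The capacity controls how large a burst is tolerated, while the refill rate sets the sustained request rate.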

Resilience & Reliability

Defense Layers:
  1. Circuit Breaker: Fail fast, don’t wait
  2. Retry with Backoff: Handle transient failures
  3. Fallback: Cached data or default response
  4. Timeout: Don’t wait forever
  5. Bulkhead: Isolate failure impact
Example Flow:
Request → Circuit Breaker (open?) 
              ↓ no
          Timeout (5s)
              ↓ success
          Return response
              ↓ failure
          Retry (3 attempts, exponential backoff)
              ↓ still failing
          Open circuit breaker
              ↓
          Return fallback
Key: Graceful degradation, not complete failure.
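A compact sketch of two of these layers: a circuit breaker with a fallback, plus a helper that computes exponential backoff delays. It is deliberately simplified. The operation is synchronous, the clock is injected, and the names `CircuitBreaker` and `backoffDelayMs` and their options are assumptions for this example rather than any library's API.

```javascript
class CircuitBreaker {
  constructor({ failureThreshold, resetAfterMs, now = () => Date.now() }) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null; // null means the circuit is closed
  }

  call(operation, fallback) {
    if (this.openedAt !== null && this.now() - this.openedAt < this.resetAfterMs) {
      return fallback(); // open: fail fast without touching the dependency
    }
    // Closed, or half-open after the reset period: let the call through.
    try {
      const result = operation();
      this.failures = 0;
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) {
        this.openedAt = this.now();
      }
      return fallback();
    }
  }
}

// Delay before retry number `attempt` (0, 1, 2, ...), capped at maxMs.
function backoffDelayMs(attempt, baseMs = 100, maxMs = 5000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

let clock = 0;
let downstreamCalls = 0;
const breaker = new CircuitBreaker({ failureThreshold: 2, resetAfterMs: 1000, now: () => clock });
const failing = () => { downstreamCalls += 1; throw new Error('timeout'); };
const cached = () => 'cached response';

breaker.call(failing, cached); // failure 1
breaker.call(failing, cached); // failure 2, circuit opens
const whileOpen = breaker.call(failing, cached); // served from fallback only

clock += 1500; // reset period has passed, a trial call is allowed
const recovered = breaker.call(() => 'fresh response', cached);

console.log(whileOpen, downstreamCalls, recovered);
```

While the circuit is open the failing dependency is not called at all, which is what protects it from being hammered during an outage.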
Tools & Techniques:
  1. Distributed Tracing (Jaeger, Zipkin)
    • Trace requests across services
    • Identify bottlenecks
    • Find error source
  2. Centralized Logging (ELK, Loki)
    • Correlation IDs across logs
    • Structured logging (JSON)
    • Searchable logs
  3. Metrics (Prometheus, Grafana)
    • RED metrics: Rate, Errors, Duration
    • Dashboards for visibility
    • Alerting on anomalies
Debugging Flow:
  1. Check dashboards for anomalies
  2. Find trace ID from failed request
  3. Follow trace through services
  4. Search logs with correlation ID
  5. Identify root cause
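To make step 4 concrete, here is a small sketch of structured JSON logging with a correlation ID that is passed from service to service. The helper names and the `x-correlation-id` header are assumptions (a common convention, not a standard), and log lines are collected in an array instead of being shipped to a log store.

```javascript
// Hypothetical helper: builds a logger bound to one request's correlation ID.
function createLogger(service, correlationId, write = console.log) {
  const log = (level, message, fields = {}) => {
    write(JSON.stringify({
      timestamp: new Date().toISOString(),
      level,
      service,
      correlationId,
      message,
      ...fields,
    }));
  };
  return {
    info: (message, fields) => log('info', message, fields),
    error: (message, fields) => log('error', message, fields),
  };
}

// Reuse the caller's ID when present so one ID follows the request everywhere.
function correlationIdFrom(headers, generate) {
  return headers['x-correlation-id'] || generate();
}

const lines = [];
const collect = (line) => lines.push(line);
const id = correlationIdFrom({ 'x-correlation-id': 'req-123' }, () => 'generated-id');

createLogger('order-service', id, collect).info('order received', { orderId: 42 });
createLogger('payment-service', id, collect).error('charge failed', { orderId: 42 });

// Searching by correlation ID pulls together entries from both services.
const matching = lines
  .map((line) => JSON.parse(line))
  .filter((entry) => entry.correlationId === 'req-123');

console.log(matching.map((entry) => `${entry.service}: ${entry.message}`));
```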

System Design Exercises

Exercise 1: Design an E-Commerce Order System

┌─────────────────────────────────────────────────────────────────────────────┐
│                    SYSTEM DESIGN: E-COMMERCE ORDERS                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  Requirements:                                                               │
│  • Handle 10,000 orders/hour peak                                           │
│  • 99.9% availability                                                       │
│  • Payment processing (3rd party)                                           │
│  • Inventory management                                                     │
│  • Order tracking                                                           │
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                         Architecture                                 │    │
│  │                                                                      │    │
│  │  ┌────────┐    ┌─────────────┐    ┌───────────────────────────┐    │    │
│  │  │  CDN   │───▶│   API GW    │───▶│       Services            │    │    │
│  │  └────────┘    │ (Kong/NGINX)│    │  ┌─────────────────────┐  │    │    │
│  │                └─────────────┘    │  │   Order Service     │  │    │    │
│  │                       │           │  │   (PostgreSQL)      │  │    │    │
│  │                       ▼           │  └─────────────────────┘  │    │    │
│  │                ┌───────────┐      │  ┌─────────────────────┐  │    │    │
│  │                │   Redis   │      │  │   Payment Service   │  │    │    │
│  │                │  (Cache)  │      │  │   (Stripe/Adyen)    │  │    │    │
│  │                └───────────┘      │  └─────────────────────┘  │    │    │
│  │                       │           │  ┌─────────────────────┐  │    │    │
│  │                       ▼           │  │  Inventory Service  │  │    │    │
│  │                ┌───────────┐      │  │   (PostgreSQL)      │  │    │    │
│  │                │   Kafka   │◀────▶│  └─────────────────────┘  │    │    │
│  │                │  (Events) │      │  ┌─────────────────────┐  │    │    │
│  │                └───────────┘      │  │ Notification Service│  │    │    │
│  │                                   │  └─────────────────────┘  │    │    │
│  │                                   └───────────────────────────┘    │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│  Key Design Decisions:                                                       │
│  • Saga for order workflow (compensating transactions)                       │
│  • Event-driven for inventory updates                                       │
│  • CQRS for order queries (separate read model)                            │
│  • Idempotency keys for payment retry safety                               │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
Talking Points:
  1. Start with requirements clarification
  2. Estimate scale (10K orders/hour = ~3 orders/second)
  3. Identify service boundaries using DDD
  4. Explain saga pattern for order workflow
  5. Discuss failure scenarios and handling
  6. Address scaling (horizontal scaling of stateless services)
  7. Mention observability approach
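The back-of-the-envelope numbers in talking point 2 are worth being able to reproduce on the spot. A short worked calculation using the stated requirements (variable names are illustrative):

```javascript
// Throughput: convert the peak hourly rate to requests per second.
const peakOrdersPerHour = 10000;
const ordersPerSecond = peakOrdersPerHour / 3600;

// Availability: 99.9% leaves 0.1% of the year as the downtime budget.
const availability = 0.999;
const hoursPerYear = 365 * 24;
const downtimeHoursPerYear = (1 - availability) * hoursPerYear;

console.log(ordersPerSecond.toFixed(2));      // "2.78"
console.log(downtimeHoursPerYear.toFixed(2)); // "8.76"
```

Roughly 3 orders per second is a modest write load, so the harder parts of this design are correctness (payments, inventory) and availability rather than raw throughput.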

Exercise 2: Design URL Shortener

Requirements:
  • 100M URLs/month
  • Redirect latency < 100ms
  • 5-year data retention
Key Points:
  • Read-heavy workload (100:1 read/write)
  • Cache heavily (Redis)
  • Generate short codes (Base62)
  • Distributed ID generation
  • Consistent hashing for distribution
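A minimal sketch of Base62 short-code generation from a numeric ID, together with a capacity check against the stated requirements. The alphabet ordering and function names are choices made for this example. How the numeric ID itself is generated (the distributed ID generation point) is out of scope here.

```javascript
const ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';

function encodeBase62(id) {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError('id must be a non-negative safe integer');
  }
  if (id === 0) return ALPHABET[0];
  let code = '';
  let n = id;
  while (n > 0) {
    code = ALPHABET[n % 62] + code; // prepend the least significant digit
    n = Math.floor(n / 62);
  }
  return code;
}

function decodeBase62(code) {
  let id = 0;
  for (const ch of code) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) throw new RangeError(`invalid character: ${ch}`);
    id = id * 62 + digit;
  }
  return id;
}

// Capacity check: 100M URLs/month retained for 5 years.
const totalUrls = 100e6 * 12 * 5; // 6,000,000,000
const fiveCharCapacity = 62 ** 5; // 916,132,832
const sixCharCapacity = 62 ** 6;  // 56,800,235,584

console.log(encodeBase62(62), decodeBase62('10'));
console.log(fiveCharCapacity < totalUrls, sixCharCapacity > totalUrls);
```

Five characters are not enough for 6 billion URLs, while six characters leave plenty of headroom.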

Exercise 3: Design Notification Service

Requirements:
  • Multi-channel (email, SMS, push)
  • Template support
  • Delivery guarantees
  • Rate limiting
Key Points:
  • Message queue for reliability
  • Channel-specific workers
  • Dead letter queue for failures
  • Priority queues
  • Idempotency for retries
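The key points can be combined in a small worker loop. This is an in-memory sketch under simplifying assumptions: `processNotifications`, the message shape, and the `senders` map are hypothetical, retries happen immediately with no backoff, and a real system would use a broker (for example Kafka or RabbitMQ) with one worker pool per channel.

```javascript
function processNotifications(queue, senders, maxAttempts = 3) {
  const delivered = [];
  const deadLetters = [];
  const seen = new Set();
  // Priority queue stand-in: lower number means higher priority.
  const pending = [...queue].sort((a, b) => a.priority - b.priority);

  for (const message of pending) {
    if (seen.has(message.id)) continue; // idempotency: skip duplicate deliveries
    seen.add(message.id);

    let sent = false;
    for (let attempt = 1; attempt <= maxAttempts && !sent; attempt += 1) {
      try {
        senders[message.channel](message); // channel-specific sender
        sent = true;
      } catch (err) {
        // Swallow the error and fall through to the next attempt.
      }
    }

    if (sent) {
      delivered.push(message.id);
    } else {
      deadLetters.push(message.id); // retries exhausted: park for inspection
    }
  }
  return { delivered, deadLetters };
}

let smsAttempts = 0;
const senders = {
  email: () => {},
  sms: () => { smsAttempts += 1; throw new Error('provider down'); },
};

const result = processNotifications(
  [
    { id: 'n1', channel: 'email', priority: 2 },
    { id: 'n2', channel: 'sms', priority: 1 },
    { id: 'n1', channel: 'email', priority: 2 }, // duplicate delivery
  ],
  senders,
);

console.log(result, smsAttempts);
```

The failing SMS is tried three times and then moved to the dead letter list, while the duplicate email is delivered only once.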

Behavioral Questions

STAR Format:
Situation: “Payment service started timing out during Black Friday peak.”
Task: “I was on-call and needed to restore service quickly.”
Action:
  • Checked dashboards, saw 95th percentile latency spike
  • Identified database connection pool exhaustion
  • Temporary: Increased connection pool, added more replicas
  • Long-term: Implemented connection pooling with PgBouncer
Result:
  • Service restored in 15 minutes
  • Added connection pool monitoring
  • Implemented load shedding for future peaks
Focus on:
  • Why migration was needed
  • Planning and preparation
  • Strangler fig pattern usage
  • Data migration strategy
  • Rollback plan
  • Lessons learned
Example Answer: “We migrated auth from monolith. Used strangler pattern - new auth service behind same API. Ran in parallel for 2 weeks, comparing responses. Gradual traffic shift. Had to handle session migration carefully. Key learning: comprehensive feature flags for quick rollback.”

Quick Reference Card

┌─────────────────────────────────────────────────────────────────────────────┐
│                    MICROSERVICES INTERVIEW CHEAT SHEET                       │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  PATTERNS TO KNOW:                                                           │
│  ─────────────────                                                          │
│  • API Gateway          • Circuit Breaker      • Event Sourcing             │
│  • Service Discovery    • Saga Pattern         • CQRS                       │
│  • Database per Service • Outbox Pattern       • Strangler Fig              │
│                                                                              │
│  COMMUNICATION:                                                              │
│  ──────────────                                                              │
│  Sync: REST, gRPC       Async: Kafka, RabbitMQ, Events                      │
│                                                                              │
│  DATA CONSISTENCY:                                                           │
│  ─────────────────                                                          │
│  • Eventual consistency (preferred)                                         │
│  • Saga for distributed transactions                                        │
│  • Idempotency for retries                                                  │
│                                                                              │
│  RESILIENCE:                                                                 │
│  ───────────                                                                 │
│  Circuit Breaker → Retry → Timeout → Fallback → Bulkhead                   │
│                                                                              │
│  OBSERVABILITY:                                                              │
│  ──────────────                                                              │
│  Logs + Metrics + Traces = Complete visibility                              │
│  RED: Rate, Errors, Duration                                                │
│                                                                              │
│  COMMON PITFALLS:                                                            │
│  ────────────────                                                           │
│  • Distributed monolith (too coupled)                                       │
│  • Wrong service boundaries                                                 │
│  • Ignoring network failures                                                │
│  • Premature microservices                                                  │
│                                                                              │
│  INTERVIEW TIPS:                                                             │
│  ───────────────                                                            │
│  1. Always clarify requirements first                                       │
│  2. Start simple, add complexity as needed                                  │
│  3. Discuss trade-offs explicitly                                           │
│  4. Mention failure scenarios                                               │
│  5. Reference real experience                                               │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Interview Tips

Do's

  • Clarify requirements upfront
  • Think out loud
  • Discuss trade-offs
  • Mention failure scenarios
  • Draw diagrams
  • Reference real experience
  • Ask good questions

Don'ts

  • Jump to solution immediately
  • Over-engineer simple problems
  • Ignore scale requirements
  • Forget about data consistency
  • Skip error handling discussion
  • Claim expertise you don’t have
  • Dismiss simpler solutions

Summary

Key Interview Themes

  1. Architecture: Know when to use microservices and how to design boundaries
  2. Data: Understand eventual consistency, sagas, and CQRS
  3. Resilience: Circuit breakers, retries, fallbacks are essential
  4. Communication: Know sync vs async trade-offs
  5. Observability: Logs, metrics, traces - you need all three
  6. Experience: Have real examples ready to share

Next Steps

Practice

Work through the capstone project to apply everything you’ve learned.

Capstone Project

Build a complete e-commerce microservices system from scratch.