Real-World Stories: Patterns in the Wild
These are not hypothetical scenarios. These are billion-dollar architectural decisions that shaped the companies behind them — and the lessons apply whether you are building for ten users or ten million.
Uber: The Monolith-to-Microservices Migration (and the Pain That Came With It)
Uber started, like most startups, as a monolith. A single Python application handled dispatch, payments, rider matching, and everything else. By 2014, that monolith was under extreme strain. Deployments were terrifying — a bug in the payment code could take down the entire dispatch system. Teams stepped on each other constantly. A single database became the bottleneck as Uber expanded to hundreds of cities.
So Uber broke the monolith apart. Aggressively. By 2016, Uber had over 2,000 microservices. The result? They gained independent deployability and team autonomy, but they also inherited a sprawling distributed system that was enormously difficult to reason about. Debugging a single rider request meant tracing calls across dozens of services. Service-to-service failures cascaded in unexpected ways. The operational overhead was staggering — each service needed its own CI/CD pipeline, monitoring, alerting, and on-call rotation.
Uber eventually invested heavily in platform infrastructure — building Jaeger for distributed tracing, adopting CQRS and event sourcing for ride-state management, and creating internal tools to manage service dependencies. The takeaway is not “microservices are bad” or “microservices are good.” It is that microservices are an organizational scaling solution, not a technical silver bullet, and that the infrastructure investment required to make them work is often underestimated by an order of magnitude.
Amazon: Two-Pizza Teams and the Service-Oriented Architecture That Changed Everything
In the early 2000s, Amazon’s codebase was a tangled monolith that engineers called “the big ball of mud.” Jeff Bezos issued what became known as the “Bezos API Mandate” — a company-wide decree that all teams must expose their data and functionality through service interfaces, that all communication must happen through these interfaces, and that there would be no exceptions. The famous “two-pizza team” rule followed: every team should be small enough to be fed by two pizzas, and every team should own a service end-to-end.
This was not a technical decision — it was an organizational one. Amazon realized that the bottleneck was not the code; it was the coordination cost between teams. By forcing service boundaries that aligned with team boundaries, they eliminated cross-team deployment dependencies. Each team could deploy independently, choose their own technology stack, and scale their service according to its specific load profile.
The pattern that emerged — services owning their own data, communicating through well-defined APIs, teams organized around business capabilities — became the blueprint for what we now call microservices. But it is worth noting: Amazon had the engineering resources, the platform infrastructure, and the organizational maturity to make this work. They did not start with microservices; they evolved into them out of genuine organizational pain.
Shopify: The Modular Monolith (Why They Chose NOT to Go Microservices)
While everyone else was rushing toward microservices around 2016-2018, Shopify made a deliberate, contrarian choice: they would stay on a monolith — but make it modular. Their core application is a large Ruby on Rails monolith that powers millions of merchants. Instead of breaking it into separate services, they introduced strict internal module boundaries, enforced through a tool called Packwerk that statically analyzes dependency violations between modules.
Why? Shopify’s engineering leadership calculated the cost. They had hundreds of engineers working on the same codebase, and yes, that created friction. But the friction of a distributed system — network failures, eventual consistency, distributed tracing, service-to-service contract management — would have been worse. A modular monolith gave them the key benefit they needed (team autonomy through clear module ownership) without the operational tax of microservices.
The result has been remarkably successful. Shopify handles massive traffic spikes (Black Friday/Cyber Monday) with a monolith. They deploy multiple times per day. They have clear team boundaries. And when a module genuinely needs to be extracted as a separate service (which has happened for a few performance-critical components), the clean module boundaries make that extraction straightforward. Shopify’s story is a powerful counter-narrative to the “microservices or bust” mentality — and a strong argument for the modular monolith as a default starting point.
Stripe: The Repository Pattern at Scale for Multi-Database Support
Stripe processes billions of dollars in payments, and their data access needs are anything but simple. They use the Repository pattern extensively to abstract away the details of their storage layer. Behind a single PaymentRepository interface, Stripe’s codebase can route queries to different databases depending on the context — a primary relational database for transactional writes, a read replica for analytics queries, a separate store for compliance and audit data.
This is the Repository pattern earning its keep at scale. When Stripe needed to migrate parts of their data layer from one database technology to another, the repository abstraction meant the migration was invisible to the hundreds of engineers writing business logic. They swapped the adapter behind the interface, ran both implementations in parallel during the migration window, and cut over without changing a single line of domain code. It is a textbook example of why the “unnecessary abstraction” crowd is wrong when the problem is complex enough: the Repository pattern’s value is not in day-one simplicity, but in year-three flexibility when the storage landscape inevitably shifts under your feet.
Chapter 12: Code-Level Patterns
12.1 Strategy Pattern
Define a family of algorithms, encapsulate each, make them interchangeable. Replace if-else chains with interface implementations. Problem it solves: You have multiple algorithms or behaviors that differ only in implementation, and selecting between them with conditional logic creates brittle, growing if-else chains that violate the Open/Closed Principle. Real example: A payment processing service supports credit cards, PayPal, and bank transfers. Without Strategy, you get a giant if-else chain that grows with every new payment method. With Strategy, define a PaymentProcessor interface with a process(amount, details) method. Implement CreditCardProcessor, PayPalProcessor, BankTransferProcessor. The payment service receives the right processor via configuration or a factory. Adding Stripe? Add a new class. No existing code changes. The if-else chain becomes a lookup map.
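A minimal Python sketch of that design (the class and method names mirror the example; the confirmation strings and payment details are invented for illustration):

```python
from typing import Protocol

class PaymentProcessor(Protocol):
    """The strategy interface: every payment method implements process()."""
    def process(self, amount: float, details: dict) -> str: ...

class CreditCardProcessor:
    def process(self, amount: float, details: dict) -> str:
        # A real implementation would call the card network here
        return f"charged {amount} to card {details['card_number'][-4:]}"

class PayPalProcessor:
    def process(self, amount: float, details: dict) -> str:
        return f"charged {amount} via PayPal account {details['email']}"

class BankTransferProcessor:
    def process(self, amount: float, details: dict) -> str:
        return f"initiated transfer of {amount} from {details['iban']}"

# The if-else chain becomes a lookup map:
PROCESSORS: dict[str, PaymentProcessor] = {
    "card": CreditCardProcessor(),
    "paypal": PayPalProcessor(),
    "bank": BankTransferProcessor(),
}

def pay(method: str, amount: float, details: dict) -> str:
    return PROCESSORS[method].process(amount, details)
```

Adding a new payment method is one new class and one new map entry; `pay` itself never changes.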
When to use: Any time you have multiple algorithms or behaviors that should be selectable at runtime. Pricing strategies (flat rate, tiered, usage-based), notification channels (email, SMS, push), file export formats (CSV, JSON, PDF).
When NOT to use: When you only have two behaviors and it is unlikely a third will ever appear. A simple if-else is easier to read than an interface, two implementations, and a factory for something that will never change. Do not introduce strategy for the sake of it — wait until the if-else chain starts growing.
12.2 Repository Pattern
Abstract data access behind a collection-like interface. Decouples business logic from persistence. Enables testing with in-memory implementations. Problem it solves: Business logic becomes tangled with database queries, making it impossible to test domain rules without standing up a real database. Changes to the persistence layer ripple through the entire codebase. Real example: Your OrderRepository has methods like findById(id), findByCustomer(customerId), save(order), delete(id). Your business logic calls orderRepo.findByCustomer(id) without knowing whether data comes from PostgreSQL, MongoDB, or an in-memory cache. In tests, you swap in an InMemoryOrderRepository that stores orders in a simple array — no database needed, tests run in milliseconds.
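A sketch of the interface and its in-memory test double, with the example's method names adapted to Python naming:

```python
from typing import Optional, Protocol

class Order:
    def __init__(self, order_id: str, customer_id: str, total: float) -> None:
        self.order_id = order_id
        self.customer_id = customer_id
        self.total = total

class OrderRepository(Protocol):
    """Collection-like interface; callers never see the storage technology."""
    def find_by_id(self, order_id: str) -> Optional[Order]: ...
    def find_by_customer(self, customer_id: str) -> list: ...
    def save(self, order: Order) -> None: ...
    def delete(self, order_id: str) -> None: ...

class InMemoryOrderRepository:
    """Test double: stores orders in a plain list, no database needed."""
    def __init__(self) -> None:
        self._orders: list = []

    def find_by_id(self, order_id: str) -> Optional[Order]:
        return next((o for o in self._orders if o.order_id == order_id), None)

    def find_by_customer(self, customer_id: str) -> list:
        return [o for o in self._orders if o.customer_id == customer_id]

    def save(self, order: Order) -> None:
        self.delete(order.order_id)  # upsert semantics: replace any existing row
        self._orders.append(order)

    def delete(self, order_id: str) -> None:
        self._orders = [o for o in self._orders if o.order_id != order_id]
```

A PostgreSQL-backed implementation would satisfy the same interface, so domain tests can run against the in-memory version in milliseconds.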
When to use: When business logic is complex enough to benefit from isolation from persistence. Domain-driven design projects. Any time you want fast, reliable unit tests over domain logic.
When NOT to use: When you are building a simple CRUD app where the ORM already provides a clean enough interface. Adding a repository layer on top of an ORM that already abstracts the database can be unnecessary indirection. Use it when business logic is complex enough to benefit from isolation.
12.3 Factory Pattern
Encapsulate object creation. When creation logic is complex or varies by context, a factory centralizes it and hides the complexity from consumers. Problem it solves: Object creation logic is scattered across the codebase, duplicated, and inconsistent. Callers need to know too many details about which concrete class to instantiate and how to configure it. Real example: A notification system creates different notification objects based on type and user preferences. A NotificationFactory.create(type, user) method checks the user’s preferences, the notification type, the user’s timezone, and returns the right notification object fully configured. Without the factory, this creation logic is scattered across every caller, duplicated and inconsistent.
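A sketch of such a factory; the preference fields (`prefers_sms`, `timezone`) and the selection rule are hypothetical, chosen only to show creation logic living in one place:

```python
from dataclasses import dataclass

@dataclass
class User:
    # Hypothetical preference fields for illustration
    prefers_sms: bool
    timezone: str
    phone: str = ""
    email: str = ""

@dataclass
class Notification:
    channel: str
    recipient: str
    timezone: str

def create_notification(kind: str, user: User) -> Notification:
    """Simple factory: all creation rules live here instead of in every caller."""
    if kind == "order_update" and user.prefers_sms and user.phone:
        return Notification(channel="sms", recipient=user.phone,
                            timezone=user.timezone)
    # Default: fall back to email for everything else
    return Notification(channel="email", recipient=user.email,
                        timezone=user.timezone)
```

Callers say what they want (`create_notification("order_update", user)`) and never repeat the how.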
Analogy: The Factory pattern is like ordering food at a restaurant — you say WHAT you want (“I’ll have the salmon”), not HOW to make it (source the fish, season it, heat the grill to 400 degrees, cook for 6 minutes per side). The kitchen is the factory. You get back a finished dish without knowing or caring about the creation process. If the restaurant changes suppliers or cooking techniques, your ordering experience does not change. That is exactly what a factory does for object creation — it hides the “how” and lets callers focus on the “what.”
Variations: Simple Factory (a function that returns objects), Factory Method (subclasses decide which class to instantiate), Abstract Factory (creates families of related objects). In practice, the simple factory function is what you will use 90% of the time.
When to use: When object creation involves conditional logic, configuration, or multiple steps. When you want to decouple callers from concrete class names.
When NOT to use: When construction is trivial — new Thing(x, y) is perfectly fine. A factory for a single class with a simple constructor adds indirection for no gain.
12.4 Decorator Pattern
Add behavior to objects dynamically without modifying the original. Wrap a logging decorator around a repository to add logging without changing the repository. Problem it solves: You need to add cross-cutting behavior (logging, caching, metrics, retries) to existing objects without modifying their source code or creating an explosion of subclass combinations. Real example: You have a UserRepository that fetches users from the database. You need logging, caching, and metrics. Instead of modifying UserRepository, create wrappers: LoggingUserRepository wraps UserRepository and logs every call. CachingUserRepository wraps that and checks Redis before hitting the database. MetricsUserRepository wraps that and records timing. Each layer is independent, testable, and removable. The calling code sees the same interface.
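A sketch of two of those wrapper layers; `DbUserRepository` and its return value are stand-ins, and a plain dict stands in for Redis:

```python
from typing import Protocol

class UserRepository(Protocol):
    def get(self, user_id: str) -> dict: ...

class DbUserRepository:
    """The real repository; imagine a database query inside get()."""
    def get(self, user_id: str) -> dict:
        return {"id": user_id, "name": "Ada"}

class LoggingUserRepository:
    """Decorator: logs every call, then delegates. Same interface as the inner repo."""
    def __init__(self, inner: UserRepository, log: list) -> None:
        self.inner, self.log = inner, log

    def get(self, user_id: str) -> dict:
        self.log.append(f"get({user_id})")
        return self.inner.get(user_id)

class CachingUserRepository:
    """Decorator: checks an in-process dict (stand-in for Redis) before delegating."""
    def __init__(self, inner: UserRepository) -> None:
        self.inner, self.cache = inner, {}

    def get(self, user_id: str) -> dict:
        if user_id not in self.cache:
            self.cache[user_id] = self.inner.get(user_id)
        return self.cache[user_id]

# Compose the layers; each one is independently addable and removable.
calls: list = []
repo = CachingUserRepository(LoggingUserRepository(DbUserRepository(), calls))
```

Because the cache wraps the logger, a cache hit never reaches the inner layers, which is exactly the composability the pattern promises.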
In modern code: Decorators appear as middleware (Express, Koa), Python decorators (@cache, @retry), and higher-order functions. The pattern is everywhere even when not called by name.
When to use: When you need to compose behaviors around an object and want each behavior to be independently addable and removable. Middleware stacks, cross-cutting concerns, feature toggles.
When NOT to use: When deep nesting of decorators makes debugging a nightmare. If you find yourself wrapping 5+ layers deep and losing track of which decorator is responsible for which behavior, consider a different approach (like aspect-oriented programming or a pipeline pattern).
12.5 Observer Pattern
When one object changes, all dependents are notified. Foundation of event-driven programming. Used in UI frameworks, pub/sub systems, and reactive programming. Problem it solves: An object needs to notify an unknown, extensible set of other objects when its state changes, without being tightly coupled to them. Real example: An e-commerce system publishes an OrderPlaced event. The inventory service listens and reserves stock. The notification service listens and sends a confirmation email. The analytics service listens and updates dashboards. The order service does not know about any of these — it just publishes. Adding a loyalty points service means adding a new listener, not modifying the order service.
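The publish/subscribe mechanics can be sketched with a minimal in-process event bus (names are illustrative; the lambdas stand in for the inventory and notification services):

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal in-process pub/sub: the publisher never knows who is listening."""
    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = EventBus()
reserved, emails = [], []

# Inventory and notification "services" just subscribe; the order
# service publishes without knowing either exists.
bus.subscribe("OrderPlaced", lambda e: reserved.append(e["order_id"]))
bus.subscribe("OrderPlaced", lambda e: emails.append(e["customer"]))

bus.publish("OrderPlaced", {"order_id": "o1", "customer": "a@b.com"})
```

Adding a loyalty points service is one more `subscribe` call; the publishing code is untouched.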
Trade-off: Loose coupling is the benefit. The cost is that the system’s behavior becomes harder to trace — “what happens when an order is placed?” requires checking all subscribers. Debugging a chain of events is harder than debugging a direct function call. Use event catalogs and tracing to manage this complexity.
When to use: When the set of “things that should react” will grow over time. UI state management, domain events, pub/sub messaging, webhook systems.
When NOT to use: When only one or two things need to react and the set is stable. A direct function call is simpler, more explicit, and easier to debug. Also avoid when ordering of notifications matters critically — observer does not guarantee execution order across listeners.
12.6 Adapter Pattern
Convert one interface to another. Wrap a third-party library so your code depends on your interface, not theirs. Essential for third-party dependency isolation. Problem it solves: Your code needs to work with a class or API whose interface does not match what your code expects. Or you want to insulate your codebase from third-party API changes and vendor lock-in. Real example: Your application uses Stripe for payments. Instead of calling Stripe’s SDK directly throughout your code, create a PaymentGateway interface that your code uses, and a StripePaymentGateway adapter that translates your interface calls into Stripe SDK calls. When the business decides to also support Adyen, you write an AdyenPaymentGateway adapter. Your application code does not change. When Stripe releases a breaking API change, only the adapter changes.
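A sketch of the shape of such an adapter; the vendor SDK here is a fake stand-in (not the real Stripe API), since the point is the translation layer, not the vendor calls:

```python
from typing import Protocol

class FakeVendorSDK:
    """Stand-in for a third-party SDK whose interface does not match ours."""
    def create_charge(self, amount_cents: int, currency: str, token: str) -> dict:
        return {"status": "succeeded", "amount": amount_cents}

class PaymentGateway(Protocol):
    """The interface OUR code depends on. Vendor details never leak past it."""
    def charge(self, amount: float, card_token: str) -> bool: ...

class VendorPaymentGateway:
    """Adapter: translates our PaymentGateway calls into vendor-SDK calls."""
    def __init__(self, sdk: FakeVendorSDK) -> None:
        self.sdk = sdk

    def charge(self, amount: float, card_token: str) -> bool:
        # Unit conversion and response translation are the adapter's job
        result = self.sdk.create_charge(int(amount * 100), "usd", card_token)
        return result["status"] == "succeeded"
```

Supporting a second vendor means one new adapter class implementing `PaymentGateway`; application code keeps calling `charge()`.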
When it matters most: Third-party APIs (payment, email, SMS, cloud storage), legacy system integration, and any dependency you might need to swap. The adapter is your insulation layer.
When to use: Integrating with external services, wrapping legacy APIs, bridging incompatible interfaces during migrations.
When NOT to use: When you are wrapping an internal class you control. If you own both sides, just change the interface directly. Adapters for internal code add indirection without the vendor-isolation benefit.
Chapter 13: Architectural Patterns
13.1 Layered Architecture
Organize code into layers: Presentation → Business Logic → Data Access. Each layer only talks to the one below. Simple, well-understood, but can lead to unnecessary indirection for simple operations. Problem it solves: Without layering, presentation code directly queries databases, business rules live in UI handlers, and everything is tangled together. Changes in one area cascade unpredictably. When it works well: Most CRUD applications, team-based development where different teams own different layers, applications where the business logic is the most complex part. When it breaks down: When a simple “get user by ID” requires passing through 4 layers of indirection. When cross-cutting concerns (logging, auth, validation) do not fit neatly into one layer. When the “business logic” layer becomes a thin pass-through that just calls the data layer. When NOT to use: Highly event-driven systems, real-time streaming applications, or anything where the rigid top-to-bottom flow does not match the actual data flow of the system.
13.2 Hexagonal Architecture (Ports and Adapters)
Business logic at the center, surrounded by ports (interfaces) and adapters (implementations). The core has no dependency on infrastructure — databases, APIs, and UIs are all adapters plugged in from outside. Makes the core independently testable. Problem it solves: In layered architecture, business logic often leaks into infrastructure concerns and vice versa. Testing business rules requires spinning up databases, HTTP servers, and message brokers. Hexagonal architecture enforces a hard boundary: the core is pure logic, everything else is pluggable. How it works — Ports and Adapters explained:
- Ports are interfaces defined by the core. They represent what the core needs from the outside world (driven ports, e.g., OrderRepository, PaymentGateway) or what the outside world can ask of the core (driving ports, e.g., PlaceOrderUseCase).
- Adapters are implementations that connect ports to real infrastructure. A PostgresOrderRepository adapter implements the OrderRepository port. An ExpressHttpAdapter calls the PlaceOrderUseCase port when an HTTP request arrives.
- The dependency rule: Adapters depend on ports. The core depends on nothing external. Dependencies always point inward.
Concrete example: The core contains OrderService, PricingEngine, and domain models — pure business logic with no imports from frameworks, databases, or HTTP libraries. Ports define interfaces: OrderRepository (port for data access), PaymentGateway (port for payments), NotificationSender (port for notifications). Adapters implement those ports: PostgresOrderRepository, StripePaymentGateway, SendGridNotificationSender. In tests, swap in InMemoryOrderRepository, FakePaymentGateway. The core is 100% testable without any infrastructure.
Why it matters for testability: Because the core has zero infrastructure dependencies, you can test all business rules with fast, in-memory fakes. No database containers, no network mocks, no flaky integration tests for logic validation. Integration tests only need to verify that adapters correctly translate between the port interface and the real infrastructure — a much smaller, more focused surface area.
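A condensed sketch of the whole arrangement: core use case, driven ports, and in-memory test adapters (all names illustrative, following the example above):

```python
from typing import Optional, Protocol

# --- Core: pure business logic, no infrastructure imports ---
class OrderRepository(Protocol):      # driven port
    def save(self, order: dict) -> None: ...
    def find(self, order_id: str) -> Optional[dict]: ...

class PaymentGateway(Protocol):       # driven port
    def charge(self, amount: float) -> bool: ...

class PlaceOrderUseCase:              # the driving port the outside world calls
    def __init__(self, orders: OrderRepository, payments: PaymentGateway) -> None:
        self.orders, self.payments = orders, payments

    def place(self, order_id: str, amount: float) -> str:
        if not self.payments.charge(amount):
            return "payment_failed"
        self.orders.save({"id": order_id, "amount": amount, "status": "placed"})
        return "placed"

# --- Test adapters: in-memory fakes, no database or network ---
class InMemoryOrderRepository:
    def __init__(self) -> None:
        self.rows: dict = {}
    def save(self, order: dict) -> None:
        self.rows[order["id"]] = order
    def find(self, order_id: str) -> Optional[dict]:
        return self.rows.get(order_id)

class FakePaymentGateway:
    def __init__(self, succeed: bool = True) -> None:
        self.succeed = succeed
    def charge(self, amount: float) -> bool:
        return self.succeed
```

Production wiring would pass a Postgres-backed repository and a real gateway adapter to the same `PlaceOrderUseCase`; the core never knows the difference.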
When to use: When business logic is complex and you need fast, reliable tests. When you expect to swap infrastructure (migrate databases, change cloud providers, replace third-party services). Domain-driven design projects.
When NOT to use: Simple CRUD apps with minimal business logic. If your “business logic” is just “take the request, validate it, save it to the database, return it,” hexagonal architecture adds ceremony without proportional benefit.
13.3 Clean Architecture
Similar to hexagonal — dependencies point inward. Entities at the center, use cases around them, interface adapters and frameworks on the outside. The dependency rule: inner circles know nothing about outer circles. The practical difference from hexagonal: Clean Architecture prescribes more specific layers (entities, use cases, interface adapters, frameworks) while hexagonal is more flexible with just “inside” and “outside.” In practice, most teams use a hybrid — the key principle is the same: business logic has zero dependencies on infrastructure.
13.4 Event-Driven Architecture (EDA)
Systems structured around events rather than direct calls. Services publish events (OrderPlaced), others subscribe and react. The producer does not know or care who is listening. Problem it solves: Tight coupling between services. In a synchronous world, the order service must know about the inventory service, the notification service, and the analytics service — and call each of them. Adding a new reaction means modifying the order service. EDA inverts this. Why EDA is powerful: Adding a new reaction (send a loyalty points email when an order is placed) means adding a new consumer — zero changes to the order service. Services are independently deployable and scalable. Temporal decoupling — the consumer can be down temporarily and process events when it recovers. Trade-offs: Eventual consistency (the email is not sent at the same instant the order is placed — it is sent seconds later). Harder debugging (a user request triggers a chain of events across 5 services — you need distributed tracing to follow the flow). Event ordering challenges (if OrderPlaced arrives after OrderShipped, your consumer logic must handle out-of-order events). Duplicate handling required (at-least-once delivery means every consumer must be idempotent).
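Because at-least-once delivery implies duplicates, every consumer needs an idempotency guard. A minimal sketch (the in-memory set stands in for a durable store, and the event shape is invented for illustration):

```python
class EmailConsumer:
    """Idempotent consumer: remember processed event IDs and skip redeliveries."""
    def __init__(self) -> None:
        self.processed_ids: set = set()   # in production: a durable store
        self.sent: list = []

    def handle(self, event: dict) -> None:
        if event["event_id"] in self.processed_ids:
            return                        # duplicate delivery: do nothing
        self.sent.append(event["customer"])   # the side effect happens once
        self.processed_ids.add(event["event_id"])

consumer = EmailConsumer()
event = {"event_id": "evt-1", "customer": "a@b.com"}
consumer.handle(event)
consumer.handle(event)   # same event redelivered by the broker: ignored
```

The key design point is that the side effect and the "mark as processed" step should be as close to atomic as the store allows.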
13.5 CQRS (Command Query Responsibility Segregation)
Separate write model (optimized for consistency and business rules) from read model (optimized for query performance, denormalized). Scale reads and writes independently. Problem it solves: A single data model cannot be optimal for both writing (normalized, constrained, consistent) and reading (denormalized, fast, shaped for the UI). When read and write loads differ dramatically (most apps are read-heavy), a unified model forces you to compromise on both. How the read model gets populated: The write side persists data and publishes an event (or uses Change Data Capture). An event handler or projection builder listens for changes and updates the read model. The read model is a denormalized, query-optimized view — it may be in a different database (write side in PostgreSQL, read side in Elasticsearch for full-text search). The consistency window: After a write, the read model is stale until the projection catches up. This is usually milliseconds to seconds. Handle it in the UI: after a user creates an item, redirect them to the item using data from the write response (not the read model). Or use “read your own writes” — route the writing user’s reads to the primary for a short period. When CQRS without event sourcing is the right call: Most of the time. If you just need separate read and write models (e.g., normalized writes to PostgreSQL, denormalized reads from Redis or Elasticsearch), you do not need the complexity of event sourcing. CQRS + a simple CDC or event-publish-on-write is sufficient.
13.6 Event Sourcing
Store the full history of state changes as events rather than just current state. Instead of storing “Order #123: status=shipped, total=$50”, store the stream of events that produced it: OrderCreated → ItemAdded(Widget) → PaymentReceived($50) → OrderShipped. Problem it solves: Traditional state-based persistence throws away history. You know the current state but not how you got there. In domains where the “how” matters (finance, compliance, audit), this is a critical gap. How event replay works: To get the current state of an entity, read all its events from the event store (an append-only, ordered stream per aggregate) and replay them in order. Each event applies a state change. After replaying all events, you have the current state. This is powerful but slow for entities with thousands of events. Snapshots: To avoid replaying thousands of events on every read, periodically save a snapshot (the materialized state at a point in time). Then replay only events after the snapshot. Snapshot every N events (e.g., every 100) or on a schedule. Projections (read models): Event handlers that listen to the event stream and build query-optimized views. A “daily revenue” projection listens for PaymentReceived events and updates a running total. You can build new projections retroactively by replaying historical events — this is one of event sourcing’s strongest benefits. When event sourcing is genuinely the right choice: Audit-heavy domains (finance, healthcare, legal) where you must prove what happened and when. Systems where the history itself is valuable (undo/redo, temporal queries). Systems where you need to build new read models from historical data. When it is over-engineering: CRUD applications, simple data management, when you just need an audit log (use a changes table instead).
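Replay and snapshots can be sketched in a few lines. The event names follow the order example above; the apply rules themselves are illustrative:

```python
from typing import Optional

def apply(state: dict, event: tuple) -> dict:
    """Each event is a pure state transition; replaying them rebuilds the entity."""
    kind, data = event
    if kind == "OrderCreated":
        return {"id": data, "items": [], "paid": 0, "status": "open"}
    if kind == "ItemAdded":
        return {**state, "items": state["items"] + [data]}
    if kind == "PaymentReceived":
        return {**state, "paid": state["paid"] + data}
    if kind == "OrderShipped":
        return {**state, "status": "shipped"}
    return state

def replay(events: list, snapshot: Optional[dict] = None,
           from_index: int = 0) -> dict:
    """With a snapshot, replay only the events recorded after it."""
    state = snapshot or {}
    for event in events[from_index:]:
        state = apply(state, event)
    return state

# An append-only stream for one aggregate:
events = [
    ("OrderCreated", "order-123"),
    ("ItemAdded", "widget"),
    ("PaymentReceived", 50),
    ("OrderShipped", None),
]
```

A projection is just another consumer of the same stream: a "daily revenue" projection would fold `PaymentReceived` amounts into a running total instead of rebuilding entity state.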
Interview question: When would you choose event-driven architecture over a synchronous request-response model?
Interview question: Explain CQRS and when you would — and would NOT — use it.
Interview question: What are the trade-offs of event sourcing vs traditional state-based persistence?
Chapter 14: Microservices
14.1 What Microservices Are
Independently deployable services, each owning a specific business capability. Each has its own data store, its own deployment pipeline, and communicates with others through well-defined APIs or events. Analogy: Microservices are like independent food trucks vs. a single restaurant kitchen. Each food truck has its own menu, its own chef, its own supply chain, and can set up or shut down independently. That is powerful — a taco truck can upgrade its grill without affecting the sushi truck. But try coordinating a multi-course meal across five food trucks (appetizer from truck A, entree from truck B, dessert from truck C, all arriving at your table hot and in the right order) and you will immediately feel the coordination cost of distributed systems. A single restaurant kitchen handles that coordination trivially because everything is in one place. That is the monolith trade-off in a nutshell: easier coordination, harder independence. What “independently deployable” actually means: You can deploy a new version of the Order Service at 2 PM on Tuesday without deploying, testing, or even notifying the Payment Service team. If this is not true — if deploying one service requires coordinating with other teams — you do not have microservices, you have a distributed monolith. What “owns its data” actually means: The Order Service has its own database (or at minimum its own schema). No other service queries the Order tables directly. Other services get order data through the Order Service’s API or by consuming events it publishes. This is the hardest discipline in microservices and the most commonly violated.
14.2 Benefits of Microservices
Independent deployment: Ship changes to the order service without touching the payment service. Deploy 10 times a day per service. Rollback one service without affecting others. Independent scaling: Scale the search service during peak traffic without scaling everything. Run the image processing service on GPU instances while the API runs on standard instances. Technology flexibility: Use Python for the ML service, Go for the high-throughput API, TypeScript for the BFF. Team autonomy: Each team owns their service end-to-end — they decide on the technology, the deployment schedule, and the internal architecture. Fault isolation: A crash in the review service does not bring down the checkout flow (if properly designed with circuit breakers and graceful degradation).
14.3 Problems with Microservices (and Solutions)
Distributed system complexity. Network calls fail, latency is unpredictable, partial failures are normal. Solution: resilience patterns (retry, circuit breaker, timeout, bulkhead), async communication where possible. Data consistency. No distributed transactions. Each service owns its data. Solution: saga pattern for multi-service workflows, eventual consistency, outbox pattern for reliable event publishing. Service discovery. How does Service A find Service B? Solution: DNS-based discovery (Kubernetes services), service registries (Consul, Eureka), service mesh (Istio, Linkerd). Distributed tracing. A single user request flows through 5 services — how do you debug it? Solution: distributed tracing (Jaeger, Zipkin, AWS X-Ray, Azure Application Insights), correlation IDs propagated through all calls. Data duplication and joins. You cannot JOIN across service databases. Solution: each service maintains the data it needs (via events). API composition for queries that span services. CQRS with denormalized read models. Testing complexity. Integration testing across services is hard. Solution: contract testing (Pact), consumer-driven contracts, service virtualization, robust CI/CD per service. Operational overhead. Each service needs monitoring, alerting, deployment pipelines, log aggregation. Solution: platform team providing shared infrastructure, service mesh, standardized templates, internal developer platform. Network latency. Every service call adds network round-trip time. Solution: minimize synchronous call chains, use async communication, batch requests, use gRPC for internal communication (faster than REST).
14.4 Key Microservices Patterns
API Gateway: Single entry point for external clients. Handles routing, authentication, rate limiting, request aggregation. Prevents clients from needing to know about individual services. Backend for Frontend (BFF): Separate API gateways for different client types (web, mobile, third-party). Each BFF aggregates and transforms data for its specific client’s needs. Outbox Pattern: To reliably publish events when data changes, write the event to an outbox table in the same transaction as the data change. A separate process reads the outbox and publishes to the message broker. Guarantees that events are published if and only if the data change committed.
Saga Pattern (Deep Dive)
Manage distributed transactions as a sequence of local transactions with compensating actions. This is one of the most critical patterns in microservices — without it, multi-service workflows that require atomicity have no reliable coordination mechanism. Problem it solves: In a monolith, you wrap a multi-step operation in a database transaction. In microservices, there is no distributed transaction (and 2PC does not scale). The saga pattern provides eventual consistency across services by chaining local transactions with explicit undo steps. Concrete example — Order Processing Saga:
- Order Service: Create order (status: pending)
- Payment Service: Charge customer → if fails, compensate: cancel order
- Inventory Service: Reserve items → if fails, compensate: refund payment, cancel order
- Shipping Service: Create shipment → if fails, compensate: release inventory, refund payment, cancel order
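The compensation flow above can be sketched as an orchestration loop; the service calls here are stubbed booleans standing in for real network calls, and the step names are illustrative:

```python
def run_saga(steps: list, log: list) -> str:
    """Run each local transaction in order; on failure, run the compensations
    for every step that already succeeded, in reverse order."""
    completed = []
    for name, action, compensate in steps:
        if action():
            log.append(name)
            completed.append((name, compensate))
        else:
            log.append(f"{name}_FAILED")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensate_{done_name}")
            return "rolled_back"
    return "completed"

log: list = []
steps = [
    # (step name, local transaction, compensating action)
    ("create_order",    lambda: True,  lambda: None),  # undo: cancel order
    ("charge_payment",  lambda: True,  lambda: None),  # undo: refund payment
    ("reserve_items",   lambda: False, lambda: None),  # fails: triggers rollback
    ("create_shipment", lambda: True,  lambda: None),  # never reached here
]
```

Note the reverse-order unwinding: refund before cancel, mirroring the example's compensation chain.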
Choreography vs Orchestration
This is the most important decision when implementing sagas. Both are valid — the right choice depends on complexity and observability needs. Choreography — decentralized, event-driven: Each service publishes events and other services react. No central coordinator.
- Order Service publishes OrderCreated → Payment Service listens, charges, publishes PaymentCharged → Inventory Service listens, reserves, publishes InventoryReserved → Shipping Service listens, ships.
- If Inventory fails, it publishes InventoryReservationFailed → Payment Service listens and refunds → Order Service listens and cancels.
Orchestration — centralized: A saga orchestrator calls each service in turn, tracks the saga’s state, and triggers compensating actions in reverse order when a step fails. The flow is easier to trace and reason about, at the cost of introducing a coordinator that must itself be highly available.
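The choreographed failure path can be sketched with plain event handlers standing in for the services; no coordinator appears anywhere, the flow emerges from who subscribes to what (event names follow the example, handler bodies are illustrative):

```python
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(event_type, handler):
    subscribers[event_type].append(handler)

def publish(event_type, data):
    for handler in subscribers[event_type]:
        handler(data)

trace = []

def payment_service(order):
    trace.append("payment_charged")
    publish("PaymentCharged", order)

def inventory_service(order):
    trace.append("inventory_failed")            # simulate a reservation failure
    publish("InventoryReservationFailed", order)

subscribe("OrderCreated", payment_service)
subscribe("PaymentCharged", inventory_service)
subscribe("InventoryReservationFailed", lambda o: trace.append("payment_refunded"))
subscribe("InventoryReservationFailed", lambda o: trace.append("order_cancelled"))

publish("OrderCreated", {"order_id": "o1"})     # kicks off the entire chain
```

Notice the observability cost: answering "what happens after OrderCreated?" requires reading every subscription, which is exactly why choreographed sagas need event catalogs and tracing.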
Strangler Fig Pattern: Gradually migrate from a monolith by routing specific functionality to new services while the monolith still handles everything else. Over time, the monolith shrinks until it is fully replaced.
14.5 Microservice Anti-Patterns
Know these — they come up in interviews and are common in real organizations.

The Distributed Monolith: All services must be deployed together, share a database, or cannot function independently. You have all the complexity of microservices with none of the benefits. Symptom: “We can’t deploy the Order Service without also deploying the User Service.” Fix: Enforce independent deployability as a hard rule. Each service owns its data. Communication through APIs or events only.

The Shared Database: Multiple services read and write the same database tables. Any schema change requires coordinating across all services. Symptom: “We need to update 5 services because we added a column to the users table.” Fix: Each service owns its tables. Other services access data through the owning service’s API. Duplicate data via events where needed.

The God Service: One service that everything depends on (often called “common-service” or “core-service”). It becomes the bottleneck — every team needs changes in it, and it cannot be deployed without risking everything. Symptom: The god service has 50+ API endpoints and is modified in every sprint by 3 different teams. Fix: Decompose by business capability. If UserService handles user profiles, authentication, preferences, and billing — those are 4 services waiting to be extracted.

Chatty Microservices: A single user request triggers a sequential chain of 5+ synchronous service calls. Latency compounds (5 services × 50ms = 250ms minimum). Failure in any one breaks the chain. Symptom: A product page takes 2 seconds because it calls 8 services sequentially. Fix: Aggregate data at the BFF (Backend for Frontend) layer. Use async communication where possible. Cache aggressively. Denormalize data so services have what they need locally.

The Entity Service Trap: Splitting by data entity (UserService, OrderService, ProductService) instead of by business capability (Checkout, Catalog, Fulfillment). Entity services become CRUD wrappers with no business logic, and real business operations span multiple services. Fix: Design around business capabilities and use cases, not database tables.

14.6 The Monolith-First Argument
Monolith: One deployment unit. Simple to develop, test, deploy. Right for most teams starting out.

Modular monolith: A monolith with strict internal boundaries. Each module has its own models, data access, and clear interfaces. The simplicity of a monolith with the modularity needed for future extraction.

Microservices: When you need independent deployment, independent scaling, technology diversity, or team autonomy at scale.

The rule: Start with a modular monolith. Extract services only when you have a clear, measurable reason.

When microservices are actually harmful:
- Small teams (fewer than 20-30 engineers). The operational overhead of running, monitoring, and debugging distributed services exceeds the organizational benefit. A small team does not need independent deployment per team because they are one team.
- Early-stage products where the domain is not yet understood. Microservice boundaries are domain boundaries. If you do not yet know your domain well (the product is still pivoting, requirements shift weekly), you will draw the boundaries wrong. Refactoring across service boundaries is orders of magnitude harder than refactoring within a monolith. Get the boundaries right in a modular monolith first, then extract.
- When there is no platform/infrastructure team. Microservices require investment in CI/CD per service, centralized logging, distributed tracing, service discovery, and deployment orchestration. Without this foundation, each team reinvents the wheel and operational incidents multiply.
- When the team lacks distributed systems experience. Microservices introduce failure modes that do not exist in monoliths: network partitions, eventual consistency, message ordering, partial failures, distributed debugging. If the team has not dealt with these before, the learning curve during a production system build is costly.
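A modular monolith's boundaries can be enforced directly in code. Below is a minimal Python sketch under assumed names (BillingFacade, CheckoutModule): each module exposes a small public facade, and other modules call only that facade, never the internals.

```python
# Sketch of a modular-monolith boundary: each module exposes a narrow public
# facade and keeps its models and data access private. Other modules depend
# only on the facade. All names here are illustrative.

class BillingFacade:
    """The only entry point other modules may use for billing."""
    def __init__(self):
        self._invoices = {}  # private storage; no other module touches this

    def create_invoice(self, order_id: str, amount_cents: int) -> str:
        invoice_id = f"inv-{order_id}"
        self._invoices[invoice_id] = {
            "order_id": order_id, "amount_cents": amount_cents, "paid": False,
        }
        return invoice_id

    def is_paid(self, invoice_id: str) -> bool:
        return self._invoices[invoice_id]["paid"]

class CheckoutModule:
    """A separate module that depends only on billing's facade, not its internals."""
    def __init__(self, billing: BillingFacade):
        self._billing = billing

    def place_order(self, order_id: str, amount_cents: int) -> str:
        # ...order logic would live here...
        return self._billing.create_invoice(order_id, amount_cents)
```

If billing is later extracted into a service, only the facade's implementation changes; CheckoutModule is untouched. In larger codebases, teams typically back this convention with tooling (as Shopify does with Packwerk).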
Interview question: Your team is debating whether to start a new project with microservices. What is your recommendation?
Interview question: How do you handle a distributed transaction that spans three microservices?
Interview question: You're designing an e-commerce checkout. Payment, inventory, and shipping are separate services. How do you ensure consistency? Walk me through the Saga pattern.
- Create Order — the Order Service creates an order in pending status. This is the starting point, and the orchestrator records that step 1 succeeded.
- Reserve Inventory — the orchestrator calls the Inventory Service to reserve the items. If this fails (out of stock), we cancel the order immediately. No payment was taken, so no compensation is needed beyond updating the order status to cancelled.
- Process Payment — the orchestrator calls the Payment Service to charge the customer. If this fails (declined card), the compensating action is to release the inventory reservation, then cancel the order.
- Initiate Shipping — the orchestrator calls the Shipping Service to create a shipment. If this fails, we refund the payment, release inventory, and cancel the order.
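The four steps above can be sketched as an orchestration saga in Python. This is a toy model, not a production implementation: the service calls are stand-in functions, and all names are illustrative.

```python
# Orchestration-saga sketch for the checkout flow: each step pairs an action
# with a compensating action. On failure, the orchestrator runs compensations
# for the completed steps in reverse order. Service calls are stand-ins.

class SagaStep:
    def __init__(self, name, action, compensate):
        self.name, self.action, self.compensate = name, action, compensate

def run_saga(steps, ctx):
    completed = []
    for step in steps:
        try:
            step.action(ctx)
            completed.append(step)
        except Exception:
            # Unwind: compensate completed steps in reverse order.
            for done in reversed(completed):
                if done.compensate:
                    done.compensate(ctx)
            ctx["status"] = "cancelled"
            return False
    ctx["status"] = "confirmed"
    return True

# Stand-in service calls for the four steps.
def create_order(ctx):      ctx["order"] = "pending"
def cancel_order(ctx):      ctx["order"] = "cancelled"
def reserve_inventory(ctx): ctx["reserved"] = True
def release_inventory(ctx): ctx["reserved"] = False
def process_payment(ctx):
    if ctx.get("card_declined"):
        raise RuntimeError("payment declined")
    ctx["charged"] = True
def refund_payment(ctx):    ctx["charged"] = False
def create_shipment(ctx):   ctx["shipment"] = "created"

CHECKOUT_SAGA = [
    SagaStep("create_order", create_order, cancel_order),
    SagaStep("reserve_inventory", reserve_inventory, release_inventory),
    SagaStep("process_payment", process_payment, refund_payment),
    SagaStep("initiate_shipping", create_shipment, None),
]
```

Note that a declined card triggers release_inventory and cancel_order, but never refund_payment: the payment step never completed, so there is nothing to refund. That asymmetry is exactly what careful compensating-transaction design looks like.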
Interview question: Your team wants to adopt microservices. You have 5 engineers and a 6-month-old product. What do you advise and why?
Interview question: Show me how you'd refactor a God class using the Strategy pattern. What's the first step?
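Before the model answer, here is a hedged Python sketch of the end state it works toward: a Strategy interface, one formatter class per format, and a lookup map replacing the if-else chain. All names (ReportFormatter, the format() method) are illustrative, not from a specific codebase.

```python
# End-state sketch of the Strategy refactor: the if/elif chain becomes one
# class per format plus a lookup map. Names are illustrative.

from abc import ABC, abstractmethod

class ReportFormatter(ABC):
    """Strategy interface: one implementation per output format."""
    @abstractmethod
    def format(self, data: dict) -> str: ...

class PdfReportFormatter(ReportFormatter):
    def format(self, data: dict) -> str:
        return f"PDF[{data['title']}]"  # stand-in for real PDF logic

class CsvReportFormatter(ReportFormatter):
    def format(self, data: dict) -> str:
        return ",".join(str(v) for v in data.values())

# The lookup map that replaces the conditional chain.
FORMATTERS = {
    "pdf": PdfReportFormatter(),
    "csv": CsvReportFormatter(),
}

class ReportGenerator:
    """Thin coordinator: no format-specific branches remain."""
    def generate(self, fmt: str, data: dict) -> str:
        return FORMATTERS[fmt].format(data)
```

Adding an Excel format now means one new class and one new map entry; ReportGenerator itself never changes again.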
“Suppose we have a ReportGenerator class with a 500-line generate() method containing a giant if-else chain: if format == 'pdf' does one thing, elif format == 'csv' does another, elif format == 'excel' does a third, and so on. Every new format means adding another branch, and the class has become a dumping ground for unrelated formatting logic.

The first step — and this is critical — is not to start extracting strategies. The first step is to write characterization tests. I need tests that capture the current behavior of each branch, so I can refactor with confidence that I am not breaking anything. I would write a test for PDF output, a test for CSV output, and a test for Excel output, each asserting on the actual output the current code produces.

With tests in place, step two is to define the Strategy interface — something like a ReportFormatter with a single format() method.

Step three is to extract one strategy at a time. Start with PdfReportFormatter: move the PDF logic out of the if-else branch and into this class. Run the tests. Green? Move to the next one. CsvReportFormatter. Run tests. ExcelReportFormatter. Run tests. Each extraction is a small, safe step.

Step four: replace the if-else chain with a lookup map from format name to formatter. ReportGenerator is now a thin coordinator. Adding a new format means adding a new class and one entry in the map — no existing code changes.

The key insight is that each step is independently committable and deployable. At no point did I do a big-bang rewrite. If I get pulled onto an incident after step three, the code is in a better state than when I started.”

Common mistakes: Jumping straight to the end state without describing the incremental steps. Forgetting to mention tests as the first step. Describing the pattern in the abstract without a concrete example. Not explaining why the God class is problematic in the first place (it violates the Open/Closed Principle; a single class changes for multiple reasons).

Words that impress: characterization tests, incremental extraction, Open/Closed Principle, each step is independently deployable, lookup map replacing conditional logic, thin coordinator.

Pattern Selection Guide
Use this table when choosing between patterns. Match your problem to the pattern, and weigh the trade-off honestly.

| Problem | Pattern | Trade-off |
|---|---|---|
| Multiple algorithms selectable at runtime (e.g., payment methods, pricing tiers) | Strategy | Adds interface + implementations per algorithm; overkill for 1-2 static behaviors |
| Business logic tangled with database code; need testable domain layer | Repository | Extra abstraction layer; unnecessary if ORM already provides clean separation |
| Complex or conditional object creation scattered across callers | Factory | Centralizes creation but hides what is being created; can obscure debugging |
| Need to add cross-cutting behavior (logging, caching, metrics) without modifying existing code | Decorator | Each layer adds indirection; deeply nested decorators are hard to debug |
| Unknown, extensible set of reactors to a state change | Observer / Event-Driven | Loose coupling at the cost of traceability; debugging event chains is hard |
| Insulate code from third-party API changes and vendor lock-in | Adapter | Extra wrapper layer; unnecessary for internal code you control |
| Simple app with clear layers (presentation, business, data) | Layered Architecture | Pass-through layers become ceremony; cross-cutting concerns do not fit neatly |
| Complex domain logic that must be testable without infrastructure | Hexagonal Architecture | More up-front structure; overhead not justified for simple CRUD |
| Read and write loads differ dramatically; need different query shapes | CQRS | Two models to maintain, eventual consistency to reason about; overkill for simple CRUD |
| Audit trail, history, temporal queries, retroactive projections | Event Sourcing | Schema evolution is hard, replay is slow at scale, storage grows unbounded |
| Multi-service workflow requiring atomicity without distributed transactions | Saga (Orchestration) | Orchestrator complexity; compensating transactions must be carefully designed |
| Simple 2-3 service reactive workflow | Saga (Choreography) | No central visibility; hard to answer “what state is this saga in?” |
| Gradual migration from monolith to services | Strangler Fig | Dual running costs during migration; routing complexity at the boundary |
| Reliable event publishing tied to data changes | Outbox Pattern | Extra table + relay process; operational overhead of polling or CDC setup |
| Many teams, independent deploy/scale needs, mature platform | Microservices | Distributed system complexity; harmful for small teams or unclear domains |
| Small team, evolving domain, speed of iteration priority | Modular Monolith | Must enforce boundaries with discipline; extraction to services requires later effort |
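To make the Decorator row in the table above concrete, here is a minimal Python sketch: a caching layer wrapped around an existing fetcher without modifying it. The classes are hypothetical.

```python
# Minimal Decorator sketch: cross-cutting caching added by wrapping, not by
# editing the existing class. Names are illustrative.

class UserFetcher:
    """The existing component; unchanged by the decoration."""
    def __init__(self):
        self.calls = 0
    def get(self, user_id: str) -> dict:
        self.calls += 1  # pretend this hits a database
        return {"id": user_id, "name": f"user-{user_id}"}

class CachingFetcher:
    """Decorator: same interface, adds caching, delegates on a miss."""
    def __init__(self, inner):
        self._inner = inner
        self._cache = {}
    def get(self, user_id: str) -> dict:
        if user_id not in self._cache:
            self._cache[user_id] = self._inner.get(user_id)
        return self._cache[user_id]
```

Because both classes share the same get() interface, a logging or metrics decorator could wrap CachingFetcher in exactly the same way — which is also where the table's warning about deeply nested layers comes from.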
Curated Resources
These are not “further reading for completeness.” These are the resources that will genuinely move your understanding forward, organized by what you will get from each one.

Foundational References
- Martin Fowler — Patterns of Enterprise Application Architecture (articles) — The free online catalog from Fowler’s seminal book. Each pattern (Repository, Unit of Work, Data Mapper, Active Record, and dozens more) gets a concise explanation with diagrams. This is the vocabulary that senior engineers use when discussing data access and enterprise architecture. Start with Repository, Unit of Work, and Domain Model — those three appear in almost every design discussion.
- Refactoring.guru — Design Patterns — The best free visual catalog of design patterns available. Every pattern includes intent, motivation, structure diagrams, pseudocode, real-world analogies, and examples in multiple languages. If you learn better visually, this is your primary resource. The “Relations between patterns” section for each pattern is especially valuable — it shows when patterns complement each other and when one can substitute for another.
- Microsoft — Cloud Design Patterns — Despite the Azure branding, these are cloud-agnostic architectural patterns with exceptional depth. The Saga pattern, Circuit Breaker, CQRS, Event Sourcing, Strangler Fig, Ambassador, Sidecar — each has a detailed write-up with problem context, solution mechanics, when to use, and when not to use. This is the single best free resource for architectural patterns in distributed systems.
Books That Shift Your Thinking
- Building Microservices (2nd Edition) by Sam Newman — The definitive practical guide to microservices architecture. Newman is honest about trade-offs (the chapter on “should you even do microservices?” is worth the book alone). Key concepts to focus on: service decomposition strategies, data ownership, the monolith-first approach, and migration patterns. The second edition (2021) reflects lessons the industry learned the hard way since the microservices hype of 2015.
- Designing Data-Intensive Applications by Martin Kleppmann — Not a patterns book per se, but the best book on understanding the data systems that underpin every architectural pattern discussed here. If you want to truly understand why event sourcing has the trade-offs it does, or what eventual consistency really means at the database level, this is where you go. Chapters 5 (Replication), 7 (Transactions), and 11 (Stream Processing) are directly relevant to every pattern in this module.
Engineering Blogs for Real-World Application
- Uber Engineering Blog — CQRS and Domain Events — Uber’s engineering blog documents their journey through event-driven architecture, CQRS, and event sourcing at massive scale. Search for posts on their domain event platform and how they handle ride-state management. These are not theoretical discussions — they are battle reports from running these patterns with millions of concurrent users.
- Shopify Engineering — Deconstructing the Monolith — Shopify’s detailed explanation of their modular monolith approach, including how they use Packwerk for boundary enforcement, why they chose this path over microservices, and the concrete results. Essential reading for anyone considering (or being pressured toward) a microservices migration.
- ThoughtWorks Technology Radar — Published twice yearly, the Technology Radar tracks which patterns, tools, and techniques are being adopted, trialed, assessed, or put on hold across the industry. Check the “Techniques” quadrant for pattern trends. This is how you stay current on what the industry is learning about CQRS, event sourcing, modular monoliths, and architecture decision records.
Pattern Recognition in Interviews
The hardest part of pattern knowledge is not memorizing the patterns — it is recognizing when they apply. In interviews, the interviewer will rarely say “use the Strategy pattern here.” Instead, they will describe a problem, and your job is to hear the signal and reach for the right tool. This table maps common interviewer phrases and problem descriptions to the patterns they are testing.

| When the interviewer says… | Consider this pattern | Why it fits |
|---|---|---|
| “Different behavior based on type” / “The logic changes depending on the mode” / “We need to support multiple algorithms” | Strategy | Varying behavior behind a common interface — the classic strategy signal |
| “We might switch vendors” / “What if we need to support a different payment provider?” / “How do you isolate third-party dependencies?” | Adapter | Vendor isolation through an interface that shields your code from external API changes |
| “How would you add logging/caching/metrics without changing existing code?” / “Cross-cutting concerns” | Decorator | Composable behavior wrapping — each concern is an independent, removable layer |
| “The object creation is complex” / “Different configurations depending on the environment” / “How do you avoid scattering new() calls?” | Factory | Centralized, encapsulated object creation that hides conditional construction logic |
| “Multiple services need to react when this happens” / “We need to add new reactions without modifying the source” | Observer / Event-Driven Architecture | Decoupled fan-out where the producer does not know or care about consumers |
| “How do you keep business logic testable without a database?” / “Separate domain logic from infrastructure” | Repository + Hexagonal Architecture | Abstracted data access (Repository) within a ports-and-adapters structure (Hexagonal) |
| “Read traffic is 100x write traffic” / “The dashboard query is killing the database” / “Reads need a different shape than writes” | CQRS | Separate read/write models optimized for their respective access patterns |
| “We need a complete audit trail” / “What was the state at this point in time?” / “We want to replay history” | Event Sourcing | Immutable event stream that preserves full history and enables temporal queries |
| “This workflow spans three services” / “How do you handle a distributed transaction?” / “What if step 3 fails?” | Saga (Orchestration or Choreography) | Coordinated multi-service workflow with compensating transactions for failure recovery |
| “We want to migrate off the monolith gradually” / “We cannot rewrite everything at once” | Strangler Fig | Incremental migration via routing — new functionality goes to new services, old monolith shrinks |
| “The data change and the event must be consistent” / “Sometimes events get lost” | Outbox Pattern | Atomic write of data + event in the same transaction, with a relay process for publishing |
| “We have 200 engineers and deployments take a week because everyone is coupled” | Microservices | Independent deployment and team autonomy at organizational scale |
| “We are a team of 8 and need clean boundaries without distributed system overhead” | Modular Monolith | Internal module boundaries with the operational simplicity of a single deployment |
| “Requests keep failing because one downstream service is slow” / “Cascading failures” | Circuit Breaker (covered in depth in Reliability chapters) | Fail fast when a dependency is unhealthy, preventing cascade failures |
| “Every service implements its own retry/auth/logging differently” | Sidecar / Service Mesh | Standardized cross-cutting infrastructure as a separate process alongside each service |
| “The frontend calls 6 different backends” / “Mobile needs smaller payloads than web” | API Gateway / BFF | Unified entry point (Gateway) or client-specific aggregation layer (BFF) |
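To illustrate the Adapter row above, here is a small Python sketch of vendor isolation. The vendor client is a made-up stand-in, not a real SDK; only the PaymentGateway interface is visible to application code.

```python
# Adapter sketch for vendor isolation: application code depends on our own
# PaymentGateway interface; each vendor gets a thin adapter that translates
# to that vendor's vocabulary. The vendor client below is hypothetical.

from abc import ABC, abstractmethod

class PaymentGateway(ABC):
    """The interface application code depends on (amounts in cents)."""
    @abstractmethod
    def charge(self, amount_cents: int, token: str) -> bool: ...

class FakeVendorClient:
    """Stand-in for a third-party SDK with its own vocabulary (dollars, refs)."""
    def submit_transaction(self, amount: float, card_ref: str) -> dict:
        return {"status": "approved", "amount": amount, "ref": card_ref}

class FakeVendorAdapter(PaymentGateway):
    """Translates our interface to the vendor's API, including unit conversion."""
    def __init__(self, client: FakeVendorClient):
        self._client = client
    def charge(self, amount_cents: int, token: str) -> bool:
        result = self._client.submit_transaction(amount_cents / 100, token)
        return result["status"] == "approved"
```

Switching vendors then means writing one new adapter; nothing that calls charge() changes.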