API = How one application talks to another application
Messaging = Sending messages between applications asynchronously

Think of it like communication methods:

API (Synchronous) = Phone Call

    You: "Hey, what's the weather?"
    Friend: *immediately responds* "It's sunny!"
    You: *wait on the line until friend responds*

    Characteristics:
    - Immediate response (or you wait)
    - Blocking (can't do anything else while waiting)
    - If friend doesn't answer = you're stuck ❌
Messaging (Asynchronous) = Text Message
    You: *send text* "Hey, what's the weather?"
    You: *continue doing other things*
    Friend: *responds 10 minutes later* "It's sunny!"
    You: *see message when convenient*

    Characteristics:
    - Delayed response (you don't wait)
    - Non-blocking (you continue working)
    - If friend is offline = message waits in queue ✅
    Frontend Application
        ↓ HTTP POST /orders
    Order API (waits for response...)
        ↓ HTTP POST /charge
    Payment API (processing...)
        ↓ (3 seconds later)
    Payment API → "Success!"
        ↓
    Order API → "Order created!"
        ↓
    Frontend → "Thank you for your order!"

    Total Time: 3+ seconds (frontend waits entire time) ⏱️
    If Payment API is down → Order API fails → Frontend shows error ❌
When to Use Synchronous:
✅ Need immediate response (user login)
✅ Short operations (<1 second)
✅ User is waiting for result
✅ Simple request/response
When NOT to Use:
❌ Long operations (video encoding, report generation)
    Frontend Application
        ↓ HTTP POST /orders
    Order API
        ↓ Send message to queue "new-order"
    Order API → "Order received! We'll process it."
        ↓
    Frontend → "Order submitted! Check email for confirmation."

    Total Time: 50ms (frontend gets immediate response) ✅

    Meanwhile (in the background):

    Service Bus Queue: "new-order"
        ↓ (Worker picks up message)
    Payment Service (processing... 3 seconds)
        ↓ (Publishes event "payment-successful")
    Email Service (listening for events)
        ↓ Sends confirmation email

    User experience: Instant response + email arrives 5 seconds later ✅
    If Payment Service is down → Message stays in queue → Retries later ✅
When to Use Asynchronous:
✅ Long operations (>1 second)
✅ Don’t need immediate response
✅ Need reliability (retries, durability)
✅ High throughput (millions of operations)
✅ Decoupling services (payment fails ≠ order API fails)
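To make the queue-based flow concrete, here is a minimal in-process sketch. It uses `System.Threading.Channels` as a stand-in for the `new-order` queue. The `NewOrder` record, the `OrderIntake` class, and its method names are all hypothetical, and an in-memory channel is not durable the way a real broker queue is.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical message type; the field names are illustrative.
public record NewOrder(Guid OrderId, decimal Amount);

public class OrderIntake
{
    // In-memory stand-in for the "new-order" queue. A real system would use a
    // durable broker so messages survive a process restart.
    private readonly Channel<NewOrder> _queue = Channel.CreateUnbounded<NewOrder>();

    // Orders the background worker has finished handling.
    public List<Guid> Processed { get; } = new();

    // Plays the role of POST /orders: enqueue and return immediately.
    public string Submit(NewOrder order)
    {
        if (!_queue.Writer.TryWrite(order))
        {
            return "Order rejected";
        }
        return "Order received! We'll process it.";
    }

    // Signals that no more orders will arrive, which lets the worker loop end.
    public void Close() => _queue.Writer.Complete();

    // Background worker: drains the queue at its own pace.
    public async Task RunWorkerAsync(Func<NewOrder, Task> chargePayment)
    {
        await foreach (var order in _queue.Reader.ReadAllAsync())
        {
            await chargePayment(order); // the slow step (3 seconds in the example)
            Processed.Add(order.OrderId);
        }
    }
}
```

`Submit` returns as soon as the order is enqueued, while `RunWorkerAsync` can run on a background task and take as long as the payment step needs. The caller never waits on the payment itself.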
The Problem (Synchronous):

    User uploads 1 GB video
        ↓ HTTP POST /videos/upload
    API receives video
        ↓ (Encodes video... 10 minutes)
        ↓ HTTP times out after 30 seconds ❌
    User sees error ❌
The Fix (Asynchronous):
    User uploads 1 GB video
        ↓ HTTP POST /videos/upload
    API receives video → Sends message to queue
        ↓ Returns immediately: "Video uploaded! Processing..."
    User sees: "We're processing your video" ✅

    Background worker:
        ↓ Picks up message from queue
        ↓ Encodes video (10 minutes)
        ↓ Sends notification when done
    User gets email: "Video is ready!" ✅
Cost Impact:
Synchronous: 90% of uploads fail (timeout) → Lost users
    Customer orders food
        ↓
    Waiter writes ticket → Hangs on kitchen board
        ↓
    Chef takes ticket (one chef processes one ticket)
        ↓
    Chef makes food → Marks ticket complete

    Characteristics:
    - Guaranteed delivery (ticket doesn't disappear)
    - FIFO order (first ticket processed first)
    - One chef processes each ticket
    - If chef is busy → Ticket waits
    - Critical for: Money transactions, orders
Use Service Bus When:
✅ Need guaranteed delivery (can’t lose messages)
✅ Need order preservation (FIFO; strict ordering requires sessions)
✅ Critical business operations (orders, payments)
✅ Need transactional guarantees
Cost: $10/month + $0.05 per million operations

Azure Event Hub = Security Camera Footage
    100 cameras recording 24/7
        ↓
    Millions of video frames per hour
        ↓
    Multiple systems watching footage:
        ├─ Security team (watches live)
        ├─ AI system (detects motion)
        └─ Archive system (stores for 90 days)

    Characteristics:
    - High throughput (millions of events/second)
    - Multiple consumers read same data
    - No global FIFO (ordering holds only within a partition)
    - Retention-limited (a consumer that falls behind the retention window misses events)
    - Critical for: Telemetry, logs, analytics
Use Event Hub When:
✅ Need high throughput (millions of events)
✅ Multiple consumers need same data
✅ Telemetry and logging
✅ OK to lose an occasional event (not critical)
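The "multiple consumers need same data" point is the main structural difference from a queue, so a small sketch may help. It models a single partition as an append-only list where each consumer keeps its own read position. The `EventLog` class and its members are hypothetical. Real Event Hubs tracks positions per consumer group and partition, and removes events once the retention period passes.

```csharp
using System;
using System.Collections.Generic;

public class EventLog
{
    // Append-only log: reading never removes an event.
    private readonly List<string> _events = new();

    // Each consumer has its own position in the log.
    private readonly Dictionary<string, int> _offsets = new();

    public void Append(string e) => _events.Add(e);

    // Returns the events this consumer has not seen yet and advances only
    // that consumer's offset. Other consumers are unaffected.
    public IReadOnlyList<string> ReadNew(string consumer)
    {
        _offsets.TryGetValue(consumer, out var offset); // 0 for a new consumer
        var batch = _events.GetRange(offset, _events.Count - offset);
        _offsets[consumer] = _events.Count;
        return batch;
    }
}
```

With a queue, a message picked up by one worker is gone for everyone else. Here the security team and the archive system can each call `ReadNew` under their own name and both see every frame.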
Cost: $11/month + $0.028 per million events

Azure Event Grid = Building Fire Alarm
    Fire detected
        ↓
    Fire alarm triggers
        ↓
    Notifies multiple systems instantly:
        ├─ Fire department (sends trucks)
        ├─ Building management (evacuates)
        └─ HVAC system (shuts down)

    Characteristics:
    - Event-driven (reacts to events)
    - Pub/sub (multiple subscribers)
    - Serverless (no infrastructure)
    - Low latency (<1 second)
    - Critical for: Automation, serverless workflows
Use Event Grid When:
✅ React to Azure resource events (blob uploaded, VM created)
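The fire-alarm analogy boils down to pub/sub fan-out: one published event is pushed to every subscriber that registered for that event type. The sketch below shows that idea in memory. `GridEvent`, `EventTopic`, and the event type strings are hypothetical names for illustration and do not follow the actual Event Grid event schema.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event shape; not the real Event Grid schema.
public record GridEvent(string EventType, string Subject);

public class EventTopic
{
    private readonly List<(string EventType, Action<GridEvent> Handler)> _subscriptions = new();

    // A subscriber registers interest in one event type.
    public void Subscribe(string eventType, Action<GridEvent> handler)
        => _subscriptions.Add((eventType, handler));

    // Pushes the event to every matching subscriber and returns how many were notified.
    public int Publish(GridEvent e)
    {
        var delivered = 0;
        foreach (var (eventType, handler) in _subscriptions)
        {
            if (eventType == e.EventType)
            {
                handler(e);
                delivered++;
            }
        }
        return delivered;
    }
}
```

The publisher does not know who is listening. Adding a fourth reaction to the alarm is a new `Subscribe` call, with no change to the code that raises the event.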
Scenario: IoT application with 1,000 devices sending data every second

Option 1: Service Bus (WRONG)

    1,000 devices × 1 message/second = 1,000 msg/sec
    1,000 msg/sec × 86,400 sec/day = 86.4 million msg/day
    86.4 million × 30 days = 2.59 billion msg/month

    Cost:
    - Base: $10/month
    - Operations: 2,590 million × $0.05 / million = $129.50/month
    Total: $139.50/month

    Problems:
    - Service Bus throttles at high throughput ❌
    - Not designed for streaming data ❌
    - Expensive for this use case ❌

Option 2: Event Hub (CORRECT)

    Same volume: 2.59 billion events/month

    Cost:
    - Base: $11/month
    - Ingress: 2,590 million × $0.028 / million = $72.52/month
    Total: $83.52/month ✅

    Benefits:
    - Designed for high throughput ✅
    - 40% cheaper ($55.98 savings/month)
    - Better performance ✅

Lesson: Using the wrong service costs about 67% more and performs worse!
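The comparison above is simple enough to script, which makes it easy to rerun for a different fleet size. This sketch assumes the same illustrative prices used in this section (they are not current Azure list prices), and `MessagingCost` is a hypothetical helper. Because it uses the unrounded 2.592 billion messages per month, its totals come out near $139.60 and $83.58, slightly different from the rounded figures.

```csharp
using System;

public static class MessagingCost
{
    // Monthly message count for a fleet sending at a fixed rate (30-day month).
    public static long MessagesPerMonth(int devices, int messagesPerSecondPerDevice)
        => (long)devices * messagesPerSecondPerDevice * 86_400L * 30L;

    // Base fee plus a per-million charge. Prices are passed in by the caller
    // because they are illustrative values, not current list prices.
    public static decimal MonthlyCost(long messages, decimal baseFee, decimal pricePerMillion)
        => baseFee + (messages / 1_000_000m) * pricePerMillion;
}
```

For the scenario here, `MessagesPerMonth(1000, 1)` feeds both calls: `MonthlyCost(n, 10m, 0.05m)` for Service Bus and `MonthlyCost(n, 11m, 0.028m)` for Event Hub.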
[!TIP]
Jargon Alert: API Gateway vs Service Mesh
API Gateway (like Azure API Management) sits at the edge and handles external traffic: rate limiting, authentication, versioning.
Service Mesh (like Istio) sits between microservices and handles internal traffic: retries, circuit breaking, observability.
[!WARNING]
Gotcha: Service Bus vs Event Hub Confusion
Choosing the wrong one can cost you! Service Bus = reliable messaging with FIFO guarantees (order processing). Event Hub = high-throughput streaming for telemetry. Using Service Bus for telemetry = expensive and slow. Using Event Hub for orders = lost data!
    GET https://api.contoso.com/v1/products
    GET https://api.contoso.com/v2/products

    Pros:
    ✅ Clear and explicit
    ✅ Easy to cache
    ✅ Works with all clients
    Cons:
    ❌ Pollutes URI space

    GET https://api.contoso.com/products?api-version=1.0
    GET https://api.contoso.com/products?api-version=2.0

    Pros:
    ✅ Keeps URIs clean
    ✅ Easy to default to latest
    Cons:
    ❌ Hard to cache
    ❌ Can be forgotten

    GET https://api.contoso.com/products
    Header: Api-Version: 1.0

    GET https://api.contoso.com/products
    Header: Api-Version: 2.0

    Pros:
    ✅ URIs stay consistent
    ✅ Good for REST purists
    Cons:
    ❌ Invisible in browser
    ❌ Requires custom client code

    GET https://api.contoso.com/products
    Accept: application/vnd.contoso.v1+json

    GET https://api.contoso.com/products
    Accept: application/vnd.contoso.v2+json

    Pros:
    ✅ REST-compliant
    ✅ Supports multiple formats
    Cons:
    ❌ Most complex
    ❌ Rarely used in practice
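The four request shapes above can all be read by one small function on the server side. The sketch below is a hypothetical `ApiVersionResolver` that extracts the major version from the path, the `api-version` query parameter, the `Api-Version` header, or the vendor media type. The precedence order is an arbitrary choice for this example, and the caller is assumed to pass a case-insensitive header dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ApiVersionResolver
{
    private static readonly Regex PathVersion = new(@"^/v([0-9]+)(/|$)");
    private static readonly Regex MediaTypeVersion =
        new(@"^application/vnd\.contoso\.v([0-9]+)\+json$");

    // Returns the requested major version, or defaultVersion when none is given.
    // Precedence (path, query, header, media type) is an assumption of this sketch.
    public static int Resolve(Uri uri, IReadOnlyDictionary<string, string> headers, int defaultVersion)
    {
        // 1. URI path: /v2/products
        var path = PathVersion.Match(uri.AbsolutePath);
        if (path.Success) return int.Parse(path.Groups[1].Value);

        // 2. Query string: ?api-version=2.0
        foreach (var pair in uri.Query.TrimStart('?').Split('&', StringSplitOptions.RemoveEmptyEntries))
        {
            var parts = pair.Split('=', 2);
            if (parts.Length == 2 && parts[0] == "api-version" && TryMajor(parts[1], out var fromQuery))
                return fromQuery;
        }

        // 3. Custom header: Api-Version: 2.0
        if (headers.TryGetValue("Api-Version", out var header) && TryMajor(header, out var fromHeader))
            return fromHeader;

        // 4. Media type: Accept: application/vnd.contoso.v2+json
        if (headers.TryGetValue("Accept", out var accept))
        {
            var media = MediaTypeVersion.Match(accept);
            if (media.Success) return int.Parse(media.Groups[1].Value);
        }

        return defaultVersion;
    }

    // Takes the part before the first dot, so "2.0" and "2" both yield 2.
    private static bool TryMajor(string value, out int major)
        => int.TryParse(value.Split('.')[0], out major);
}
```

Supporting all four at once is rarely needed in practice. The point is that each strategy only changes where the version is read from, not how the rest of the request is handled.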
Problem: How to maintain data consistency across microservices?
Solution: Coordinate a sequence of local transactions with compensating actions.
Implementation:
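One possible shape for this, commonly called the saga pattern, is an orchestrator that runs each local transaction in turn and, on failure, undoes the completed ones in reverse order. The sketch below is a minimal in-memory version under that assumption. `SagaStep`, `Saga.RunAsync`, and the step names are hypothetical, and a production saga would also persist its progress so it can resume after a crash.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// One saga step: a local transaction plus the action that undoes it.
public record SagaStep(string Name, Func<Task> Execute, Func<Task> Compensate);

public static class Saga
{
    // Runs steps in order. If one fails, compensates the already completed
    // steps in reverse order and returns false.
    public static async Task<bool> RunAsync(IEnumerable<SagaStep> steps, Action<string> log)
    {
        var completed = new Stack<SagaStep>();
        foreach (var step in steps)
        {
            try
            {
                await step.Execute();
                completed.Push(step);
                log($"done: {step.Name}");
            }
            catch (Exception ex)
            {
                log($"failed: {step.Name} ({ex.Message})");
                while (completed.Count > 0)
                {
                    var undo = completed.Pop();
                    await undo.Compensate();
                    log($"compensated: {undo.Name}");
                }
                return false;
            }
        }
        return true;
    }
}
```

For an order flow, the steps might be "reserve inventory" (compensation: release it) and "charge payment" (compensation: refund). If the charge fails, only the reservation is rolled back, because the failed step itself never completed.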
Store state as a sequence of events, not snapshots.
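    using System;
    using System.Collections.Generic;
    using System.Text.Json;
    using System.Threading.Tasks;
    using Azure.Messaging.EventHubs;
    using Azure.Messaging.EventHubs.Producer;

    // Event store
    public class OrderEventStore
    {
        private readonly EventHubProducerClient _eventHub;

        public OrderEventStore(EventHubProducerClient eventHub) => _eventHub = eventHub;

        public async Task AppendEvent(OrderEvent @event)
        {
            // Serialize using the runtime type so derived-event properties are included
            var eventData = new EventData(JsonSerializer.Serialize(@event, @event.GetType()));

            // Partition by aggregate ID for ordering. EventData.PartitionKey is
            // read-only, so the key is supplied through SendEventOptions.
            var options = new SendEventOptions { PartitionKey = @event.OrderId.ToString() };

            await _eventHub.SendAsync(new[] { eventData }, options);
        }
    }

    // Rebuild state from events
    public class Order
    {
        public Guid Id { get; private set; }
        public OrderStatus Status { get; private set; }
        public decimal Total { get; private set; }

        public static Order FromEvents(IEnumerable<OrderEvent> events)
        {
            var order = new Order();
            foreach (var @event in events)
            {
                order.Apply(@event);
            }
            return order;
        }

        // Each event type changes only the part of the state it describes
        private void Apply(OrderEvent @event)
        {
            switch (@event)
            {
                case OrderCreatedEvent e:
                    Id = e.OrderId;
                    Status = OrderStatus.Created;
                    break;
                case OrderItemAddedEvent e:
                    Total += e.Price * e.Quantity;
                    break;
                case OrderCancelledEvent e:
                    Status = OrderStatus.Cancelled;
                    break;
            }
        }
    }

    // Query events
    var events = await GetEventsFromEventHub(orderId);
    var order = Order.FromEvents(events);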
Q4: Design a fault-tolerant message processing system
Answer:

Architecture:

    Service Bus Queue
    ├─ MaxDeliveryCount: 3
    ├─ LockDuration: 5 minutes
    └─ Dead Letter Queue

    Processing Logic:
    1. Receive message (lock acquired)
    2. Process with try/catch
    3. Success → Complete message
    4. Transient error → Abandon (retry)
    5. Permanent error → Dead letter
    6. Lock expires → Auto-abandon

    Dead Letter Handling:
    - Separate processor monitors DLQ
    - Analyze failure reason
    - Fix data and resubmit OR alert humans
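The processing rules above can be exercised without a broker. The sketch below simulates them in memory: a delivery counter per message, re-enqueue on transient failure, and a dead-letter list for permanent failures or exhausted retries. `RetryingQueue`, `QueueMessage`, and the two exception types are hypothetical names rather than Service Bus SDK types, and lock expiry (step 6) is not modeled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception types used to classify failures in this sketch.
public class TransientException : Exception { public TransientException(string m) : base(m) { } }
public class PermanentException : Exception { public PermanentException(string m) : base(m) { } }

public class QueueMessage
{
    public QueueMessage(string body) => Body = body;
    public string Body { get; }
    public int DeliveryCount { get; set; }
}

public class RetryingQueue
{
    private readonly Queue<QueueMessage> _queue = new();
    private readonly int _maxDeliveryCount;

    public RetryingQueue(int maxDeliveryCount = 3) => _maxDeliveryCount = maxDeliveryCount;

    public List<QueueMessage> Completed { get; } = new();
    public List<(QueueMessage Message, string Reason)> DeadLetters { get; } = new();

    public void Send(string body) => _queue.Enqueue(new QueueMessage(body));

    // Drains the queue, applying the complete / abandon / dead-letter rules.
    public void ProcessAll(Action<QueueMessage> handler)
    {
        while (_queue.Count > 0)
        {
            var message = _queue.Dequeue();   // 1. receive
            message.DeliveryCount++;
            try
            {
                handler(message);             // 2. process
                Completed.Add(message);       // 3. success: complete
            }
            catch (TransientException ex)
            {
                if (message.DeliveryCount >= _maxDeliveryCount)
                    DeadLetters.Add((message, $"MaxDeliveryCountExceeded: {ex.Message}"));
                else
                    _queue.Enqueue(message);  // 4. abandon: redeliver later
            }
            catch (PermanentException ex)
            {
                DeadLetters.Add((message, ex.Message)); // 5. dead-letter immediately
            }
        }
    }
}
```

The key design choice is classifying errors up front: retrying a malformed payload three times only delays the inevitable, while dead-lettering a timeout on the first attempt throws away work that would likely have succeeded.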