Skip to main content

Documentation Index

Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt

Use this file to discover all available pages before exploring further.

Kafka Fundamentals

Master the core concepts of Apache Kafka event streaming and understand its distributed architecture.

What is Kafka?

Apache Kafka is a distributed event streaming platform. Unlike traditional message queues (like RabbitMQ), Kafka is designed to handle massive streams of events, store them durably, and process them in real-time. Think of the difference this way: a traditional message queue is like a postal service — once the letter is delivered, it is gone. Kafka is like a newspaper archive — events are published, anyone can read them (even people who subscribe later), and they stay around for as long as you configure.

Event Streaming

Continuous flow of data (events) as they happen

Distributed

Runs as a cluster of servers (brokers)

Durable

Stores events on disk for a configurable retention period

Scalable

Handles trillions of events per day

Core Architecture

1. Events (Messages)

An event records the fact that “something happened”.
  • Key: Optional, used for partitioning (e.g., user_123)
  • Value: The data payload (e.g., JSON {"action": "login"})
  • Timestamp: When it happened
  • Headers: Optional metadata

2. Topics

A Topic is a logical category or feed name to which records are published. Think of it as a named channel on a TV network — producers broadcast to a channel, and any number of viewers (consumers) can tune in independently.
  • Analogous to a table in a database or a folder in a filesystem.
  • Multi-subscriber: Can have zero, one, or many consumers. Each consumer reads independently at its own pace.
  • Append-only: New events are always added to the end. You cannot insert, update, or delete individual events (log compaction is the closest thing to an “update”).

3. Partitions

Topics are split into Partitions.
  • Scalability: Partitions allow a topic to be spread across multiple servers.
  • Ordering: Order is guaranteed only within a partition, not across the entire topic.
  • Offset: Each message in a partition has a unique ID called an offset.

4. Brokers

A Kafka server is called a Broker.
  • Receives messages from producers.
  • Assigns offsets.
  • Commits messages to disk storage.
  • Serves fetch requests from consumers.
  • A Cluster consists of multiple brokers working together.

5. Replication

Kafka replicates partitions across multiple brokers for fault tolerance. This is Kafka’s insurance policy — if a hard drive or server dies, your data survives because copies exist on other machines.
  • Replication Factor: Number of copies (usually 3). RF=3 means the data exists on 3 different brokers. You can lose any one broker and still have all your data.
  • Leader: One broker is the leader for a partition; handles all reads/writes. This simplifies consistency — there is one source of truth.
  • Followers: Replicate data from the leader. If the leader fails, a follower is promoted to the new leader within seconds, and producers/consumers reconnect automatically.

Producers & Consumers

Producers

Applications that publish (write) events to Kafka topics.
  • Partitioning Strategy: Decides which partition a message goes to.
    • Round-robin: If no key is provided (load balancing).
    • Hash-based: If key is provided (same key always goes to same partition).

Consumers

Applications that subscribe to (read) events from Kafka topics.
  • Consumer Groups: A set of consumers working together to consume a topic.
    • Each partition is consumed by only one consumer in the group.
    • Allows parallel processing of a topic.
  • Offsets: Consumers track their progress by committing offsets.

Kafka vs RabbitMQ (Deep Dive)

FeatureApache KafkaRabbitMQ
DesignDistributed Commit LogTraditional Message Broker
Message RetentionPolicy-based (e.g., 7 days), durableDeleted after consumption (usually)
ThroughputExtremely High (Millions/sec)High (Thousands/sec)
OrderingGuaranteed per partitionGuaranteed per queue
ConsumptionPull-based (Consumer polls)Push-based (Broker pushes)
Use CaseEvent streaming, Log aggregation, AnalyticsComplex routing, Task queues

The Power of Pull-Based Consumption

Kafka’s pull-based model is a key architectural decision that distinguishes it from traditional push-based messaging systems.
Consumers fetch messages at their own pace. Fast consumers aren’t held back by slow ones, and slow consumers aren’t overwhelmed by a flood of messages (backpressure is inherent).
Consumers can pull large batches of messages in a single request, significantly reducing network overhead and improving throughput (IOPS).
Since the broker doesn’t track “who read what” (consumers track their own offsets), consumers can easily rewind to an old offset and re-process past events. This is crucial for:
  • Recovering from errors
  • Testing new processing logic on old data
  • Training ML models

ZooKeeper vs KRaft (Critical for Interviews!)

Kafka has traditionally relied on ZooKeeper for cluster coordination. KRaft (Kafka Raft) is the new consensus protocol that removes this dependency.

ZooKeeper Mode (Legacy)

ZooKeeper responsibilities:
  • Broker registration and discovery
  • Controller election
  • Topic configuration storage
  • ACLs and quotas

KRaft Mode (New Standard - Kafka 3.3+)

AspectZooKeeperKRaft
ArchitectureSeparate clusterIntegrated with Kafka
LatencyHigher (two systems)Lower (single system)
ScalabilityLimited by ZKMillions of partitions
Operational ComplexityTwo systems to manageSingle system
Production ReadyYesYes (Kafka 3.6+)
Interview Tip: Know that KRaft is the future and understand why Kafka moved away from ZooKeeper (simpler operations, better scalability, lower latency).

Partition Internals Deep Dive

Understanding partition internals is crucial for performance tuning and interviews.

Log Segments

Each partition is stored as a series of log segments:
partition-0/
├── 00000000000000000000.log    # Segment files
├── 00000000000000000000.index  # Offset index
├── 00000000000000000000.timeindex  # Time index
├── 00000000000005000000.log    # Next segment (starts at offset 5000000)
└── ...

How Writes Work

  1. Producer sends message to partition leader
  2. Leader appends to active segment file (sequential I/O - very fast!)
  3. Leader replicates to followers (if acks=all)
  4. Leader responds to producer with offset

How Reads Work

  1. Consumer requests offset range
  2. Broker uses .index file to find segment
  3. Broker does sequential read from segment
  4. Returns batch of messages
Interview Insight: Kafka achieves high throughput through:
  • Sequential I/O (no random seeks)
  • OS page cache (zero-copy)
  • Batching and compression

In-Sync Replicas (ISR) Deep Dive

ISR is one of the most important concepts for durability.

What is ISR?

The set of replicas that are “caught up” with the leader. A replica is removed from ISR if:
  • It falls behind by more than replica.lag.time.max.ms (default 30s)
  • It’s disconnected from ZooKeeper/controllers

ISR Configuration

ConfigDescriptionDefault
min.insync.replicasMinimum ISRs for write to succeed1
replica.lag.time.max.msMax lag before removal from ISR30000

Data Loss Scenarios

Scenario: Producer gets ack after leader writes, but leader crashes before replicating. Result: Data loss when new leader is elected. Fix: Use acks=all
Scenario: All in-sync replicas fail at once. Result: Either wait for ISR to recover, or allow non-ISR leader (data loss). Config: unclean.leader.election.enable=false (default)
Scenario: Two replicas, one fails, producer still writes with acks=all. Result: Only one copy exists. If it fails, data is lost. Fix: RF=3, min.insync.replicas=2

The Golden Rule

Replication Factor = 3
min.insync.replicas = 2
acks = all
This ensures: 2 replicas must acknowledge, can survive 1 broker failure without data loss.

Interview Questions & Answers

  1. Sequential I/O: Append-only writes, no random disk seeks
  2. OS Page Cache: Uses filesystem cache for zero-copy reads
  3. Batching: Groups messages to reduce network/disk overhead
  4. Compression: Reduces network bandwidth
  5. Partitioning: Parallelism across brokers
  1. Controller detects broker failure (via heartbeats)
  2. For each partition where failed broker was leader:
    • Controller elects new leader from ISR
    • Updates metadata in all brokers
  3. Producers/consumers get metadata update and reconnect
  4. Data is safe if replica was in ISR
Formula: partitions = max(throughput/producer_throughput, throughput/consumer_throughput)Considerations:
  • More partitions = more parallelism
  • More partitions = more memory/file handles
  • Can’t decrease partitions (only increase)
  • Each partition = at most one consumer in a group
Rule of thumb: Start with 3-6 partitions per topic, scale as needed.
  • Topic: Logical category/feed (like a table)
  • Partition: Physical subdivision of a topic (like a shard)
A topic is split into partitions for:
  • Scalability (spread across brokers)
  • Parallelism (multiple consumers)
  • Ordering (guaranteed within partition)
  1. Consumer stops sending heartbeats
  2. After session.timeout.ms, consumer is considered dead
  3. Rebalance is triggered in the consumer group
  4. Partitions are reassigned to remaining consumers
  5. New consumer starts from last committed offset
One broker is elected as the Controller:
  • Monitors broker liveness
  • Elects partition leaders when brokers fail
  • Updates cluster metadata
  • Manages partition reassignments
In KRaft mode, there’s a quorum of controllers using Raft consensus.

Common Pitfalls

1. Too Few Partitions: Can’t increase consumer parallelism. Start with enough partitions.2. acks=1 in Production: Risk of data loss. Always use acks=all for critical data.3. Single Consumer for High Volume: One consumer can only handle one partition. Use consumer groups.4. Ignoring Consumer Lag: Lag indicates consumers are too slow. Monitor and alert on lag.5. Auto-Create Topics in Production: Set auto.create.topics.enable=false. Typos create unwanted topics.

Installation & Quick Start


CLI Power User Commands

Topic Management

# Create a topic with 3 partitions and replication factor of 1.
# NOTE: You cannot decrease partition count after creation. Plan ahead.
# RF=1 is for local dev only. Production minimum is RF=3.
bin/kafka-topics.sh --create \
    --topic user-events \
    --bootstrap-server localhost:9092 \
    --partitions 3 \
    --replication-factor 1

# List all topics in the cluster
bin/kafka-topics.sh --list --bootstrap-server localhost:9092

# Describe topic details: shows partition count, leader broker, replicas, and ISR.
# This is your first debugging command when something seems wrong with a topic.
bin/kafka-topics.sh --describe \
    --topic user-events \
    --bootstrap-server localhost:9092

# Delete topic. WARNING: this deletes ALL data in the topic permanently.
# Requires delete.topic.enable=true on the broker (default: true since Kafka 1.0).
bin/kafka-topics.sh --delete \
    --topic user-events \
    --bootstrap-server localhost:9092

Producing & Consuming

# Console Producer (Type messages and hit Enter)
bin/kafka-console-producer.sh \
    --topic user-events \
    --bootstrap-server localhost:9092 \
    --property "parse.key=true" \
    --property "key.separator=:"

# Example input:
# user1:login
# user2:logout

# Console Consumer (Read from beginning)
bin/kafka-console-consumer.sh \
    --topic user-events \
    --from-beginning \
    --bootstrap-server localhost:9092 \
    --property "print.key=true"

Consumer Groups

# List consumer groups
bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092

# Describe group (See lag, current offset)
bin/kafka-consumer-groups.sh --describe \
    --group my-group \
    --bootstrap-server localhost:9092

Key Takeaways

  • Topics are logs of events, divided into Partitions.
  • Partitions allow Kafka to scale and guarantee ordering.
  • Brokers form a cluster to provide durability and availability.
  • Consumer Groups allow parallel processing of topics.
  • Kafka is pull-based and stores data for a set retention period.

Interview Deep-Dive

Strong Answer:
  • The partition key determines which messages land in the same partition, and the number of partitions determines the maximum parallelism for consumers.
  • For an order processing system, I would use customer_id as the partition key. This guarantees that all orders for a given customer are processed in order (ordering is guaranteed within a partition). If I used order_id, orders for the same customer could land in different partitions and be processed out of order — a customer might see “order shipped” before “order confirmed.”
  • For the number of partitions, I work backwards from throughput requirements. If the system needs to process 10,000 orders/second and each consumer instance can handle 1,000 orders/second, I need at least 10 partitions (one consumer per partition in a consumer group). I would start with 12-16 partitions to leave room for growth, because you can increase partitions later but never decrease them, and adding partitions changes the key-to-partition mapping (existing keys may land in different partitions).
  • Gotchas to consider: if the customer base has a heavy skew (one enterprise customer generates 50% of orders), that customer’s partition becomes a hot partition. The fix is either a compound key (customer_id + order_date) to spread the load, or a custom partitioner that sub-partitions high-volume customers.
  • I would also set the replication factor to 3 and min.insync.replicas to 2, ensuring that a single broker failure does not cause data loss or availability issues.
Follow-up: The product team says they need global ordering across all orders, not just per-customer ordering. Can Kafka provide this?Global ordering in Kafka requires a single partition, because ordering is only guaranteed within a partition. But a single partition means a single consumer — no parallelism. At 10,000 orders/second, one consumer is a bottleneck. The real question is whether global ordering is truly required or whether per-entity ordering (per customer, per merchant) is sufficient. In my experience, when product teams say “global ordering,” they usually mean “I do not want order #2 processed before order #1 for the same customer” — which is per-key ordering, solved by the partition key. If they genuinely need global ordering (e.g., for a financial ledger), Kafka with a single partition works at moderate throughput. For high throughput with global ordering, consider an event sourcing database (e.g., EventStoreDB) that is specifically designed for this pattern.
Strong Answer:
  • Step 1: The controller detects the failure. In ZooKeeper mode, the broker’s ZK session expires (default 18 seconds). In KRaft mode, the controller quorum detects missing heartbeats. This detection latency is the first component of recovery time.
  • Step 2: For every partition where the failed broker was the leader, the controller selects a new leader from the ISR (In-Sync Replicas) set. The selection is deterministic — typically the first broker in the ISR list. If the ISR is empty (all replicas failed), the controller either waits for an ISR member to recover or, if unclean.leader.election.enable=true, promotes a non-ISR replica (risking data loss).
  • Step 3: The controller updates the cluster metadata with the new leader assignments and broadcasts this to all remaining brokers.
  • Step 4: Producers and consumers receive metadata updates (either through polling or through the NotLeaderForPartitionException error that triggers a metadata refresh). They reconnect to the new leaders. This typically takes 1-5 seconds.
  • Step 5: For partitions where the failed broker was a follower, the remaining ISR continues to serve reads and writes from the leader. There is no immediate impact — just reduced redundancy.
  • Step 6: When the failed broker restarts, it rejoins the cluster, begins fetching data from current leaders to catch up, and is re-added to ISR once it is caught up (within replica.lag.time.max.ms).
  • Total downtime for affected partitions: typically 5-30 seconds, dominated by the failure detection latency. With KRaft, this is faster because there is no ZooKeeper session timeout.
Follow-up: During the leader election window, what happens to in-flight producer requests with acks=all?Producers with acks=all will receive a NotLeaderForPartitionException (or LeaderNotAvailableException during the election window). With retries enabled (and retries should be set to Integer.MAX_VALUE in production), the producer will retry the request. The retry triggers a metadata refresh, which discovers the new leader, and the message is sent to the new leader. No data is lost — the message is either committed to the old leader’s ISR before failure (in which case the new leader has it) or retried to the new leader. This is why acks=all + retries + idempotent producer is the gold standard for durability.
Strong Answer:
  • ZooKeeper was Kafka’s external dependency for cluster coordination: broker registration, controller election, topic metadata storage, and ACLs. It worked, but it introduced three categories of pain.
  • First, operational complexity. Running Kafka in production meant managing two distributed systems — the Kafka cluster and the ZooKeeper ensemble. Each has its own deployment, monitoring, backup, and upgrade procedures. ZooKeeper has its own configuration, its own failure modes, and its own capacity limits. For small teams, this doubled the operational burden.
  • Second, scalability limits. ZooKeeper stores all metadata in memory and uses a consensus protocol (ZAB) that becomes a bottleneck at scale. LinkedIn hit limits around 200,000 partitions because ZooKeeper could not handle the metadata volume. Partition creation and deletion became slow because they required synchronous writes to ZooKeeper.
  • Third, latency. Controller election and metadata updates required round-trips to ZooKeeper. When a broker failed, the detection and leader election process involved ZooKeeper session timeouts (default 18 seconds) plus the metadata update propagation. This meant partitions could be unavailable for 20-30 seconds during a broker failure.
  • KRaft solves all three. It embeds the consensus protocol (Raft) directly into Kafka. Metadata is stored in a Kafka internal topic (__cluster_metadata), managed by a quorum of controller nodes. No external ZooKeeper. This simplifies operations (one system), enables millions of partitions (metadata scales with Kafka itself), and reduces failover latency (Raft leader election is sub-second).
Follow-up: If you are running a Kafka cluster on ZooKeeper today, how do you plan the migration to KRaft?Kafka provides a migration tool (kafka-metadata.sh) that performs the transition online. The process: first, deploy KRaft controller nodes alongside the existing ZooKeeper-based cluster. Second, use the migration tool to replicate metadata from ZooKeeper to the KRaft controllers. Third, reconfigure brokers to use KRaft controllers instead of ZooKeeper (rolling restart). Fourth, once all brokers are on KRaft, decommission the ZooKeeper ensemble. The migration is designed to be zero-downtime, but I would schedule it during a low-traffic window and have a rollback plan. The key risk is the “point of no return” — once brokers are on KRaft and you decommission ZooKeeper, you cannot easily go back.

Next: Kafka Producers & Consumers →