# Kafka Crash Course

> Master event streaming - the backbone of real-time data pipelines

> **"Kafka is the central nervous system for real-time data."** - Jay Kreps, Co-creator of Kafka

When LinkedIn needed to handle billions of events per day, traditional message queues collapsed under the load. So they built Kafka: a distributed commit log designed from the ground up for high-throughput, fault-tolerant event streaming. Today, it powers the real-time pipelines at Netflix, Uber, LinkedIn, and thousands of other companies.

This course takes you from event-curious to production-ready, covering everything from the fundamentals of topics and partitions to the operational reality of running Kafka clusters at scale.

***

## Why Kafka Matters

* Millions of messages per second
* Horizontal scaling, distributed by design
* Messages persist on disk, replicated
* Process streams in real-time

***

## The Story Behind Kafka

**2011**: LinkedIn created Kafka to handle its massive data pipeline needs.

**The Problem**:

* Traditional messaging couldn't handle LinkedIn's scale
* Needed to process billions of events daily
* Real-time analytics requirements
* Data integration across systems

**The Solution**: Apache Kafka

* Distributed, partitioned, replicated log
* High throughput (millions of messages/sec)
* Horizontal scalability
* Fault-tolerant and durable

**Today**: Kafka powers:

* **LinkedIn**: 7+ trillion messages/day
* **Netflix**: Real-time recommendations
* **Uber**: Trip data and analytics
* **Airbnb**: Payment processing
* **Twitter**: Real-time analytics

**Open Sourced**: 2011; became a top-level Apache project in 2012

***

## What You'll Learn

Topics, partitions, brokers, producers, consumers. The distributed log abstraction that makes Kafka special. [Start Here](/courses/devops-tools/kafka-fundamentals)

Log segments, ISR mechanics, leader election, consumer rebalancing. If you love understanding how things actually work, this one is for you.
[Explore Internals](/courses/devops-tools/kafka-internals)

Publishing messages, consuming streams, serialization, idempotency. The APIs you will use every day. [Learn APIs](/courses/devops-tools/kafka-producers-consumers)

Kafka Streams API, transformations, aggregations, joins. Real-time processing without Spark or Flink. [Process Streams](/courses/devops-tools/kafka-streams)

Clustering, replication tuning, monitoring, capacity planning. Running Kafka in production. [Run in Production](/courses/devops-tools/kafka-operations)

Kafka Connect, Schema Registry, ksqlDB. The tools that make Kafka a complete platform. [Explore Ecosystem](/courses/devops-tools/kafka-ecosystem)

***

## Kafka vs RabbitMQ

A common question in system design interviews: "When would you choose Kafka over RabbitMQ?" The short answer: Kafka is for events (things that happened), RabbitMQ is for commands (things you want to happen). If you need to replay last week's events for a new consumer, choose Kafka. If you need to distribute tasks to workers so that each task goes to one worker and is redelivered until it is acknowledged, choose RabbitMQ. With acknowledgements, RabbitMQ delivers at least once, not exactly once, so task handlers should be idempotent.

| Feature               | Kafka                               | RabbitMQ                                              |
| --------------------- | ----------------------------------- | ----------------------------------------------------- |
| **Use Case**          | Event streaming, logs, CDC          | Task queues, RPC, complex routing                     |
| **Throughput**        | Very high (millions/sec)            | High (thousands/sec)                                  |
| **Message Retention** | Configurable (days/weeks/forever)   | Until consumed and acknowledged                       |
| **Ordering**          | Per partition                       | Per queue                                             |
| **Consumers**         | Pull model (consumer controls pace) | Push model (broker delivers, bounded by prefetch)     |

***

## Course Structure

### Module 1: Fundamentals (2-3 hours)

The distributed commit log, topics, partitions, brokers, offsets. Understanding why Kafka is different from traditional message queues.

### Module 2: Internals Deep Dive (2-3 hours)

Log segments and indexes, ISR and replication mechanics, leader election, consumer group coordination, ZooKeeper vs KRaft.
**If you love internals, continue. If not, skip to Module 3.**

### Module 3: Producers and Consumers (2-3 hours)

Producer batching and compression, consumer groups and rebalancing, exactly-once semantics, offset management.

### Module 4: Stream Processing (2 hours)

Kafka Streams API, stateless transformations, stateful processing, windowing, joins. Stream processing without the complexity of Spark.

### Module 5: Operations (2 hours)

Cluster sizing, replication factor tuning, monitoring with JMX, capacity planning, performance tuning.

### Module 6: Ecosystem (1-2 hours)

Kafka Connect for data integration, Schema Registry for schema evolution, ksqlDB for SQL-based stream processing.

***

Ready to master Kafka? Start with [Kafka Fundamentals](/courses/devops-tools/kafka-fundamentals) or jump to [Internals Deep Dive](/courses/devops-tools/kafka-internals) if you want to understand the distributed log that powers trillions of events per day.
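
For a first feel of the distributed log before you open Module 1, here is a toy sketch in Python. It is an in-memory model only, not the Kafka client API: `ToyLog`, `append`, `poll`, and `seek` are hypothetical names chosen for illustration, and CRC32 stands in for Kafka's real key partitioner. It tries to show three ideas from this page: records with the same key land in one partition and keep their order, consumers pull at their own pace by tracking an offset, and a new or rewound consumer group can replay records that others have already read.

```python
import zlib
from collections import defaultdict


class ToyLog:
    """Toy in-memory partitioned log; an illustration, not the Kafka API."""

    def __init__(self, num_partitions):
        # Each partition is an append-only list; a record's index is its offset.
        self.partitions = [[] for _ in range(num_partitions)]
        # Next offset to read, tracked separately per (group, partition).
        self.offsets = defaultdict(int)

    def append(self, key, value):
        # Assumption: CRC32 of the key as a stand-in partitioner.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append(value)
        return partition, len(self.partitions[partition]) - 1

    def poll(self, group, partition, max_records=10):
        # Pull model: the consumer asks for records from its own position.
        start = self.offsets[(group, partition)]
        records = self.partitions[partition][start:start + max_records]
        self.offsets[(group, partition)] = start + len(records)
        return records

    def seek(self, group, partition, offset):
        # Rewinding the position is what makes replay possible.
        self.offsets[(group, partition)] = offset


log = ToyLog(num_partitions=3)
for event in ["click-0", "click-1", "purchase"]:
    partition, offset = log.append("user-1", event)

analytics = log.poll("analytics", partition)   # first group reads everything
billing = log.poll("billing", partition)       # a second group sees the same records
caught_up = log.poll("analytics", partition)   # nothing new for the first group
log.seek("analytics", partition, 0)            # rewind to the start
replayed = log.poll("analytics", partition)    # the same records again, in order

print(analytics, billing, caught_up, replayed)
```

Reading a record does not remove it, which is the key contrast with a queue that deletes messages once they are acknowledged. Real Kafka adds replication, retention limits, and consumer group coordination on top of this picture.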