Key Concepts of Kafka
Understand the fundamental building blocks: Topics, Partitions, Producers, Consumers, and Brokers
Kafka Components Overview
Apache Kafka is built around a few core concepts that work together to provide a scalable, fault-tolerant messaging system.

Topics - The Heart of Kafka
A Topic is a category or feed name to which messages are published. Topics are similar to tables in a database or folders in a filesystem. Each topic has a unique name within the Kafka cluster.
Key Characteristics of Topics
- Multi-subscriber: A topic can have zero, one, or many consumers that subscribe to the data written to it
- Retention: Data in topics is retained for a configurable period (time-based or size-based)
- Immutable: Once data is written to a topic, it cannot be changed (immutable append-only log)
- Partitioned: Topics are split into partitions for parallelism and scalability
Naming Convention
Use descriptive names like order-events, user-activity, or payment-transactions. Avoid special characters; use hyphens or underscores.
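These rules can be checked programmatically. A sketch, modeled on Kafka's own topic-name validation (assumptions: `TopicNames` is an invented class; Kafka allows letters, digits, `.`, `_`, and `-`, up to 249 characters, and reserves the names `.` and `..`):

```java
import java.util.regex.Pattern;

// Illustrative topic-name validator (TopicNames is a made-up class;
// the rules are modeled on Kafka's internal topic validation).
public class TopicNames {
    private static final Pattern LEGAL =
            Pattern.compile("[a-zA-Z0-9._-]{1,249}");

    public static boolean isValid(String name) {
        // "." and ".." are reserved and cannot be topic names
        return !name.equals(".") && !name.equals("..")
                && LEGAL.matcher(name).matches();
    }
}
```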
Partitions - The Unit of Parallelism
Partitions are the unit of parallelism in Kafka. Each topic is divided into one or more partitions, and each partition is an ordered, immutable sequence of messages that is continually appended to.
```
Topic: orders (3 partitions)

Partition 0: [msg0] → [msg3] → [msg6] → [msg9]  → ...
               ↑                          ↑
            offset=0                   offset=3

Partition 1: [msg1] → [msg4] → [msg7] → [msg10] → ...
               ↑                          ↑
            offset=0                   offset=3

Partition 2: [msg2] → [msg5] → [msg8] → [msg11] → ...
               ↑                          ↑
            offset=0                   offset=3
```
Offset
Each message within a partition has a unique, sequential ID called an offset. Offsets never change and always increase; consumers track their position in each partition using offsets.
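The offset model fits in a few lines: a partition behaves like an append-only list whose indices are the offsets (a toy sketch; `Partition` is an invented class, not a Kafka API):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of one partition as an append-only log with
// per-partition offsets (illustrative only, not a Kafka API).
public class Partition {
    private final List<String> log = new ArrayList<>();

    // Appending assigns the next sequential offset and returns it.
    public long append(String message) {
        log.add(message);
        return log.size() - 1;
    }

    // Consumers read by offset; existing records are never modified.
    public String read(long offset) {
        return log.get((int) offset);
    }
}
```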
Ordering Guarantee
Kafka guarantees message ordering only within a partition, not across partitions. Use message keys to ensure related messages go to the same partition.
Partition Count is (Almost) Immutable
You can increase the number of partitions, but you cannot decrease them. Also, increasing partitions may break key-based ordering. Plan your partition count carefully!
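Why does adding partitions break key-based ordering? Because the key's partition is derived from `hash(key) % numPartitions`, so the same key can map to a different partition after the count changes (simplified sketch; real Kafka hashes the serialized key with murmur2, but the modulo effect is the same, and `Repartition` is an invented class):

```java
// Demonstrates that changing numPartitions re-routes existing keys.
// (Simplified: Kafka uses murmur2 on the serialized key bytes.)
public class Repartition {
    public static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result non-negative for negative hash codes
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        String key = "order-42";
        // After growing from 3 to 4 partitions, new messages for this
        // key may land on a different partition than its old messages.
        System.out.println(partitionFor(key, 3));
        System.out.println(partitionFor(key, 4));
    }
}
```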
How Many Partitions Should You Have?
- More partitions = More parallelism: Each partition can be consumed by only one consumer in a group
- Rule of thumb: Number of partitions ≥ Number of consumers you plan to have
- Don't over-partition: Each partition has overhead (file handles, memory, leader election time)
Producers - Writing Data to Kafka
Producers are applications that publish (write) messages to Kafka topics. They are responsible for choosing which partition within a topic to send the message to.
Message Structure
- Key (optional): Used for partition routing
- Value: The actual message content
- Timestamp: When the message was produced
- Headers (optional): Metadata key-value pairs
Partition Assignment
- With Key: hash(key) % numPartitions
- Without Key: Round-robin or sticky partitioning
- Custom: Implement a custom partitioner
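Both default strategies can be sketched in plain Java (simplified: the real producer hashes the serialized key bytes with murmur2 and uses sticky batching rather than strict round-robin for keyless records; `Partitioner` here is an invented class):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified sketch of the producer's default routing strategies.
public class Partitioner {
    private final int numPartitions;
    private final AtomicInteger counter = new AtomicInteger();

    public Partitioner(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    public int partition(String key) {
        if (key != null) {
            // Keyed: same key always maps to the same partition
            return Math.floorMod(key.hashCode(), numPartitions);
        }
        // Keyless: rotate across partitions
        return counter.getAndIncrement() % numPartitions;
    }
}
```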
```java
@Autowired
private KafkaTemplate<String, Order> kafkaTemplate;

public void sendOrder(Order order) {
    // Using orderId as the key ensures all events for the same order
    // go to the same partition (maintaining order)
    String key = order.getOrderId();

    kafkaTemplate.send("orders", key, order)
        .whenComplete((result, ex) -> {
            if (ex == null) {
                RecordMetadata metadata = result.getRecordMetadata();
                log.info("Sent to partition {} with offset {}",
                        metadata.partition(), metadata.offset());
            }
        });
}
```
Producer Acknowledgments (acks)
| acks | Description | Durability | Latency |
|---|---|---|---|
| 0 | Fire and forget | Low | Lowest |
| 1 | Leader acknowledgment | Medium | Medium |
| all / -1 | All in-sync replicas ack | Highest | Highest |
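A minimal producer configuration for the highest-durability row might look like this (these are real Kafka producer configuration keys; the values shown are illustrative):

```properties
# Producer durability settings
acks=all                  # wait for all in-sync replicas
enable.idempotence=true   # avoid duplicates on retry (requires acks=all)
retries=2147483647        # retry transient failures
```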
Consumers & Consumer Groups
Consumers read messages from topics. They are organized into Consumer Groups for load balancing and fault tolerance. This is one of Kafka's most powerful features!
Consumer Group Rules
- Each partition is consumed by exactly one consumer within a group
- A consumer can consume from multiple partitions
- If consumers > partitions, some consumers will be idle
- Multiple consumer groups can read from the same topic independently
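The rules above can be sketched as a range-style assignor (illustrative only; `GroupAssignment` is an invented class, while Kafka's real assignors, such as range, round-robin, and sticky, run inside the group coordination protocol):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Range-style sketch of spreading partitions over a consumer group.
// Consumers beyond the partition count receive an empty list (idle).
public class GroupAssignment {
    public static Map<String, List<Integer>> assign(List<String> consumers, int partitions) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        int per = partitions / consumers.size();
        int extra = partitions % consumers.size();
        int next = 0;
        for (int i = 0; i < consumers.size(); i++) {
            int count = per + (i < extra ? 1 : 0);
            List<Integer> mine = new ArrayList<>();
            for (int j = 0; j < count; j++) mine.add(next++);
            out.put(consumers.get(i), mine);
        }
        return out;
    }
}
```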
```
Topic: orders (4 partitions: P0, P1, P2, P3)

Consumer Group A (2 consumers):
  Consumer-1 → P0, P1
  Consumer-2 → P2, P3

Consumer Group B (4 consumers):
  Consumer-1 → P0
  Consumer-2 → P1
  Consumer-3 → P2
  Consumer-4 → P3

Consumer Group C (6 consumers):
  Consumer-1 → P0
  Consumer-2 → P1
  Consumer-3 → P2
  Consumer-4 → P3
  Consumer-5 → IDLE (no partition)
  Consumer-6 → IDLE (no partition)
```
```java
@KafkaListener(
    topics = "orders",
    groupId = "order-processors",
    concurrency = "3"  // 3 consumer threads
)
public void consume(@Payload Order order,
                    @Header(KafkaHeaders.RECEIVED_PARTITION) int partition,
                    @Header(KafkaHeaders.OFFSET) long offset,
                    @Header(KafkaHeaders.RECEIVED_TIMESTAMP) long timestamp) {
    log.info("Received order {} from partition {} at offset {}",
            order.getOrderId(), partition, offset);
    processOrder(order);
}
```
Offset Management
Auto Commit
Kafka automatically commits offsets periodically. Simple but may cause duplicates or data loss on failures.
Manual Commit
Application controls when to commit, giving at-least-once delivery; approaching exactly-once semantics additionally requires idempotent processing or Kafka transactions.
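The two modes are selected through consumer configuration (these are real Kafka consumer configuration keys; the interval value is the client default):

```properties
# Auto commit: offsets committed periodically in the background
enable.auto.commit=true
auto.commit.interval.ms=5000

# Manual commit: disable auto-commit, then call commitSync() or
# commitAsync() from the application after processing each batch
# enable.auto.commit=false
```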
Brokers & Clusters
A Broker is a Kafka server that stores data and serves clients. Multiple brokers form a Cluster. Each broker is identified by a unique numeric ID.
Broker Duties
- Receive messages from producers
- Store messages on disk
- Serve consumer fetch requests
- Replicate data to followers
Partition Leader
- Each partition has one leader
- All reads/writes go to the leader
- Leader replicates to followers
- Auto-failover if the leader dies
Replication
- Data replicated across brokers
- Replication factor configurable
- ISR: In-Sync Replicas
- Ensures fault tolerance
```
Kafka Cluster (3 Brokers, Replication Factor = 3)
Topic: orders (3 Partitions)

Broker 0:
  - Partition 0 (LEADER)
  - Partition 1 (Follower)
  - Partition 2 (Follower)

Broker 1:
  - Partition 0 (Follower)
  - Partition 1 (LEADER)
  - Partition 2 (Follower)

Broker 2:
  - Partition 0 (Follower)
  - Partition 1 (Follower)
  - Partition 2 (LEADER)

If Broker 1 goes down:
  → Partition 1 leadership moves to Broker 0 or Broker 2
```
Bootstrap Servers
Clients only need to connect to one broker initially (bootstrap server). That broker provides metadata about all brokers in the cluster.
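In practice, clients list a few brokers so the first connection survives a single broker being down (`bootstrap.servers` is the real client configuration key; the hostnames here are placeholders):

```properties
# Any one of these is enough to bootstrap; full cluster membership
# is then discovered from the broker's metadata response.
bootstrap.servers=broker1:9092,broker2:9092,broker3:9092
```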
ZooKeeper and KRaft (Kafka Raft)
Kafka requires a coordination service to manage cluster metadata. Historically this was Apache ZooKeeper, but Kafka 3.x introduces KRaft (Kafka Raft) as a built-in replacement.
ZooKeeper (Legacy)
- Stores cluster metadata
- Tracks broker health
- Elects partition leaders
- Requires separate deployment
- Removed in Kafka 4.0
KRaft (Modern)
- Built into Kafka itself
- Simpler deployment
- Faster failover
- Better scalability
- Default in Kafka 3.3+