Kafka Architecture and End-to-End Semantics
You are asked to explain Kafka's core architecture and how to design for reliability and throughput in a production system.
1) Core Architecture Concepts
- Topics and partitions
- Leaders and followers
- Replication factor (RF)
- In-sync replicas (ISR)
2) Producers
- How producers choose partitions (keyed vs unkeyed; sticky partitioner)
- Acknowledgment (acks) options and when to use them
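A possible way to illustrate partition selection in an answer is the minimal sketch below. It is a simplified model, not the Kafka client: `choose_partition` and `sticky_partition` are hypothetical names, and CRC32 is a stand-in hash (the Java client's default partitioner hashes key bytes with murmur2).

```python
import zlib


def choose_partition(key, num_partitions, sticky_partition=0):
    """Pick a partition for one record (illustrative model only).

    key: bytes, or None for an unkeyed record.
    sticky_partition: partition currently used for unkeyed records; the
    caller is assumed to rotate it when a batch is sent.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    if key is None:
        # Unkeyed: keep filling the same partition so batches grow larger.
        return sticky_partition % num_partitions
    # Keyed: the same key always maps to the same partition, which is what
    # preserves per-key ordering. CRC32 here is a stand-in for murmur2.
    return zlib.crc32(key) % num_partitions
```

The point worth drawing out is that the keyed mapping depends on `num_partitions`, so adding partitions to a topic can move existing keys to a different partition.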
3) Consumers and Consumer Groups
- How consumer groups commit offsets
- Rebalance triggers, protocols, and mitigation strategies
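For the offset-commit topic, the following sketch may help show why committing after processing leads to redelivery on failure. It is an in-memory simulation, not client code: `run_consumer`, `log`, and `crash_after` are hypothetical names, and the committed value follows the Kafka convention of being the offset of the next record to read.

```python
def run_consumer(log, committed, process, crash_after=None):
    """Process records from `committed` onward, committing after each one.

    Returns the committed offset, i.e. the offset of the NEXT record to read.
    If `crash_after` is set, the consumer "crashes" after processing that
    many records, before committing the last of them.
    """
    handled = 0
    for offset in range(committed, len(log)):
        process(log[offset])
        handled += 1
        if crash_after is not None and handled == crash_after:
            # Crash between processing and commit: progress is lost.
            return committed
        committed = offset + 1  # commit only after processing succeeded
    return committed


if __name__ == "__main__":
    seen = []
    log = ["a", "b", "c"]
    resume_from = run_consumer(log, 0, seen.append, crash_after=2)
    run_consumer(log, resume_from, seen.append)
    print(seen)  # "b" appears twice: it was processed but never committed
```

A rebalance in the middle of processing has the same effect as the simulated crash: the next owner of the partition resumes from the last committed offset.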
4) Delivery Guarantees and Ordering
- Ordering scope
- Idempotent producer
- Exactly-once processing (transactions)
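The idea behind the idempotent producer can be sketched as broker-side deduplication by producer ID and sequence number. The model below is deliberately simplified and all names (`PartitionLog`, `append`) are hypothetical; the real broker tracks sequence numbers per batch and per producer epoch, which is omitted here.

```python
class PartitionLog:
    """Toy model of one partition that drops retried (duplicate) writes."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer_id -> last accepted sequence number

    def append(self, producer_id, seq, value):
        """Append a record; return True if stored, False if a duplicate."""
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            # Retry of a write that already succeeded: acknowledge, do not store.
            return False
        if seq != last + 1:
            # A gap means an earlier write is missing; the real client surfaces
            # this as an out-of-order sequence error.
            raise ValueError("out-of-order sequence number")
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return True
```

Deduplication of this kind covers producer retries within one partition only; atomic writes across partitions and offset commits are what transactions add.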
5) Performance and Stability
- Backpressure strategies
- Batch sizing and linger
- Compression choices
- Throughput tuning knobs
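The interaction between batch size and linger can be sketched with a small in-memory batcher. This is an illustrative model under stated assumptions, not the producer's accumulator: `Batcher`, `add`, and `poll` are hypothetical names, time is passed in explicitly, and a batch is allowed to overshoot its size limit, which the real client avoids by starting a new batch.

```python
class Batcher:
    """Flush when the buffer reaches batch_size bytes or linger_ms elapses."""

    def __init__(self, batch_size, linger_ms):
        self.batch_size = batch_size  # analogous to the batch.size setting
        self.linger_ms = linger_ms    # analogous to the linger.ms setting
        self.buffer = []
        self.buffered_bytes = 0
        self.first_ts = None          # arrival time of the oldest buffered record

    def add(self, record, now_ms):
        """Buffer a bytes record; return a ready batch or None."""
        if not self.buffer:
            self.first_ts = now_ms
        self.buffer.append(record)
        self.buffered_bytes += len(record)
        return self.poll(now_ms)

    def poll(self, now_ms):
        """Return the buffered batch if it is full or has lingered long enough."""
        if not self.buffer:
            return None
        full = self.buffered_bytes >= self.batch_size
        expired = now_ms - self.first_ts >= self.linger_ms
        if not (full or expired):
            return None
        batch, self.buffer = self.buffer, []
        self.buffered_bytes = 0
        self.first_ts = None
        return batch
```

Raising the linger trades a bounded amount of latency for larger batches, which also tends to improve compression ratios because compression is applied per batch.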
6) Designing End-to-End Semantics
- At-least-once pipeline design
- Exactly-once pipeline design (Kafka-to-Kafka, and with external sinks)
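For external sinks, one common approach is to make the write idempotent so that at-least-once delivery produces an effectively-once result. The sketch below assumes a sink that can upsert by a unique key; `IdempotentSink` and `deliver` are hypothetical names, and a dictionary stands in for the external store.

```python
class IdempotentSink:
    """Toy external store keyed by (topic, partition, offset)."""

    def __init__(self):
        self.rows = {}

    def upsert(self, topic, partition, offset, value):
        # Writing the same record again overwrites the identical row,
        # so redelivery does not create duplicates.
        self.rows[(topic, partition, offset)] = value


def deliver(sink, topic, partition, records):
    """Write (offset, value) pairs to the sink; safe to call again on retry."""
    for offset, value in records:
        sink.upsert(topic, partition, offset, value)


if __name__ == "__main__":
    sink = IdempotentSink()
    deliver(sink, "orders", 0, [(0, "a"), (1, "b")])
    # Simulated redelivery after a failure before the offset commit.
    deliver(sink, "orders", 0, [(1, "b"), (2, "c")])
    print(len(sink.rows))
```

An alternative with the same goal is to store the consumed offset in the sink in the same transaction as the data and resume from that stored offset on restart.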