Let’s describe each component of Kafka Architecture shown in the following diagram
1. Kafka Broker
A Kafka cluster is made up of multiple Kafka Brokers. A broker is a Kafka server. As the name suggests, the producer and consumer don’t interact directly but use the Kafka server as an agent or broker to exchange message services. Kafka cluster typically consists of multiple brokers to maintain load balance. Unlike other message systems, Kafka brokers are stateless. So they use ZooKeeper for maintaining their cluster state. One Kafka broker instance can handle hundreds of thousands of reads and writes per second and each broker can handle TB of messages without performance impact. Kafka broker leader election can be done by ZooKeeper.
For the purpose of managing and coordinating, Kafka broker uses ZooKeeper. It is primarily used to notify producers and consumers about the presence of a new broker or about the failure of a broker in the Kafka system. As soon as Zookeeper send the notification regarding presence or failure of the broker then producer and consumer, take the decision and starts coordinating their task with some other broker.
Apache Zookeeper is a distributed configuration and synchronization service, which serves as the coordination interface between the Kafka brokers and consumers. Since all the critical information is stored in Zookeeper and it normally replicates this data across its ensemble, failure of a Kafka broker or Zookeeper does not affect the state of the Kafka cluster.
3. Kafka Producer
Kafka Producers push data to brokers. When the new broker is started, all the producers search it and automatically sends a message to that new broker. Kafka producer doesn’t wait for acknowledgements from the broker and sends messages as fast as the broker can handle.
Basically, by using partition offset the Kafka Consumer maintains that how many messages have been consumed because Kafka brokers are stateless. If the consumer acknowledges a particular message offset, it implies that the consumer has consumed all prior messages. The consumer issues an asynchronous pull request to the broker to have a buffer of bytes ready to consume. The consumers can rewind or skip to any point in a partition simply by supplying an offset value. Consumer offset value is notified by ZooKeeper.