A distributed system is a model in which components located on networked computers communicate and coordinate their actions by passing messages.

Some important points:

  1. A distributed system is designed to spread load across the system and process that load simultaneously.
  2. To achieve simultaneous processing, the load needs to be distributed across the cluster, and there needs to be a coordination mechanism, meaning every system in the cluster needs to be able to talk to the others.
  3. In the world of distributed systems, this coordination is often achieved using a protocol called the gossip protocol.
  4. There needs to be a system in place to monitor the health of the brokers and keep metadata about them. That is where Zookeeper comes into the picture.
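
To make the gossip idea in the list above more concrete, here is a minimal sketch in Python. It is a toy in-memory simulation, not the implementation used by any particular system: the function names `merge` and `gossip_until_converged`, the eight-node cluster, and the `(version, value)` entries are all assumptions made for illustration. Each node starts out knowing only its own status, and in every round each node exchanges its view with one randomly chosen peer until all nodes hold the same information.

```python
import random


def merge(local, remote):
    """Copy into `local` every entry of `remote` that is new or has a higher version."""
    for key, (version, value) in remote.items():
        if key not in local or local[key][0] < version:
            local[key] = (version, value)


def gossip_until_converged(states, rng, max_rounds=1000):
    """Run push-pull gossip rounds and return how many rounds were needed."""
    for round_no in range(1, max_rounds + 1):
        for i in range(len(states)):
            # Pick a random peer other than ourselves.
            j = rng.choice([k for k in range(len(states)) if k != i])
            merge(states[j], states[i])  # push our view to the peer
            merge(states[i], states[j])  # pull the peer's view back
        # Stop as soon as every node holds an identical view.
        if all(state == states[0] for state in states):
            return round_no
    raise RuntimeError("did not converge within max_rounds")


# Hypothetical cluster of eight nodes; each one initially knows only itself.
nodes = [{f"node-{i}": (1, "alive")} for i in range(8)]
rounds = gossip_until_converged(nodes, random.Random(42))
print(f"converged after {rounds} round(s); each node knows {len(nodes[0])} entries")
```

The point of the sketch is that no node needs a full list of connections or a central coordinator: information spreads through repeated pairwise exchanges, and the version number decides which copy of an entry wins.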

Role of Zookeeper

  1. Zookeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization and providing group services.
  2. Zookeeper is responsible for maintaining cluster metadata, tracking the overall health of the Kafka cluster, and balancing broker assignments and reassignments. For example:
  • If a new broker is added to the cluster, how can this new broker start taking up its share of the load?
  • If an existing broker in the cluster goes down, load should no longer be distributed to that broker.
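
The two cases above can be sketched with a small in-memory model. This is not the Zookeeper or Kafka API: `BrokerRegistry`, `assign`, the heartbeat timestamps, and the round-robin policy are hypothetical names and choices made for illustration. In a real deployment, liveness is tracked through Zookeeper sessions and ephemeral znodes rather than a Python dictionary. The sketch only shows the idea that load is assigned to whichever brokers are currently known to be alive.

```python
class BrokerRegistry:
    """Hypothetical in-memory stand-in for the coordination service."""

    def __init__(self, session_timeout):
        self.session_timeout = session_timeout
        self.last_heartbeat = {}  # broker id -> time of last heartbeat

    def heartbeat(self, broker_id, now):
        # The first heartbeat registers the broker; later ones keep it alive.
        self.last_heartbeat[broker_id] = now

    def live_brokers(self, now):
        # A broker counts as live only if it was heard from within the timeout.
        return sorted(
            broker
            for broker, seen in self.last_heartbeat.items()
            if now - seen <= self.session_timeout
        )


def assign(partitions, brokers):
    """Spread partitions round-robin over the given live brokers."""
    if not brokers:
        raise ValueError("no live brokers available")
    return {p: brokers[i % len(brokers)] for i, p in enumerate(partitions)}


registry = BrokerRegistry(session_timeout=10)
partitions = list(range(6))

# Two brokers are up and share the load.
registry.heartbeat("broker-1", now=0)
registry.heartbeat("broker-2", now=0)
before = assign(partitions, registry.live_brokers(now=5))

# A new broker joins and takes up part of the load.
for broker in ("broker-1", "broker-2", "broker-3"):
    registry.heartbeat(broker, now=6)
after_join = assign(partitions, registry.live_brokers(now=8))

# broker-2 stops sending heartbeats, so it is no longer given any load.
registry.heartbeat("broker-1", now=15)
registry.heartbeat("broker-3", now=15)
after_failure = assign(partitions, registry.live_brokers(now=20))

print(before)
print(after_join)
print(after_failure)
```

Recomputing the assignment from the current list of live brokers keeps the example short. A production system would typically try to move as few assignments as possible when membership changes.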