The formal definition of Apache Zookeeper says that it is a distributed, open-source configuration and synchronization service, along with a naming registry, for distributed applications. Apache Zookeeper is used to manage and coordinate large clusters of machines.
Distributed applications are difficult to coordinate and work with because they are much more error-prone, owing to the large number of machines attached to the network. With many machines involved, race conditions and deadlocks are common problems when implementing distributed applications.
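A race condition occurs when two or more machines try to perform an operation on the same shared resource at the same time, so that the outcome depends on the order in which the operations happen to run. Zookeeper takes care of this through its serialization property, which applies updates in a single, well-defined order.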
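To make the race condition concrete, here is a minimal sketch of a lost update and one way to guard against it with a versioned, conditional write. The `VersionedRegister` class and the `increment` helper are hypothetical, in-memory stand-ins written for this illustration; they are not part of any Zookeeper client library, and they only mimic the general idea of rejecting a write that is based on stale state.

```python
class VersionedRegister:
    """Toy in-memory value with a version counter (not a real Zookeeper client)."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0

    def read(self):
        # Return the value together with the version it was read at.
        return self.value, self.version

    def write(self, new_value, expected_version):
        # Accept the write only if nobody else has written since the read.
        if expected_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True


def increment(register):
    # Retry the read-modify-write until the conditional write succeeds.
    while True:
        value, version = register.read()
        if register.write(value + 1, version):
            return


register = VersionedRegister()

# Two clients read the same state before either one writes.
value_a, version_a = register.read()
value_b, version_b = register.read()

first_ok = register.write(value_a + 1, version_a)   # accepted
second_ok = register.write(value_b + 1, version_b)  # rejected: stale version

# The rejected client retries against the current state.
if not second_ok:
    increment(register)

print(first_ok, second_ok, register.value)
```

Without the version check, both writes would store 1 and one increment would silently disappear. With it, the second client is forced to re-read, so the updates take effect one after the other.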
Deadlocks arise when two or more machines try to access the same shared resources at the same time. More precisely, each machine tries to access a resource held by another, which locks up the system: no machine releases its own resource, and each waits for the other to release one. Synchronization in Zookeeper helps to resolve deadlocks.
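The deadlock scenario can be sketched with two local threads standing in for two machines. This example uses only Python's standard `threading` module and illustrates a general avoidance technique (acquiring locks in one agreed order); it is an assumption made for illustration and does not show how Zookeeper itself implements synchronization. The names `transfer`, `lock_a`, and `lock_b` are hypothetical.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
completed = []


def transfer(name, first, second):
    # Sort the two locks by a fixed key so that every worker acquires them
    # in the same order, whatever order the caller asked for.
    ordered = sorted((first, second), key=id)
    with ordered[0]:
        with ordered[1]:
            completed.append(name)


# The workers request the locks in opposite orders, which is the classic
# setup for a deadlock when each one grabs its "first" lock and waits.
workers = [
    threading.Thread(target=transfer, args=("worker-1", lock_a, lock_b)),
    threading.Thread(target=transfer, args=("worker-2", lock_b, lock_a)),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join(timeout=5)

print(sorted(completed))
```

Because both workers end up taking the locks in the same order, neither can hold one lock while waiting for the other, and both finish.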
Another major issue with distributed applications is the partial failure of a process, which can lead to inconsistent data. Zookeeper handles this through atomicity: either the whole operation completes, or nothing persists after the failure.
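The all-or-nothing behaviour can be illustrated with a small in-memory sketch. The `apply_atomically`, `set_key`, and `require_key` helpers below are hypothetical names invented for this example, and a plain dictionary stands in for the shared data; this is not Zookeeper's own implementation.

```python
def apply_atomically(store, operations):
    """Apply every operation or none of them; return True on commit."""
    staged = dict(store)  # work on a copy so a failure leaves `store` untouched
    try:
        for operation in operations:
            operation(staged)
    except Exception:
        return False  # discard the staged copy: nothing persists
    store.clear()
    store.update(staged)
    return True


def set_key(key, value):
    def operation(data):
        data[key] = value
    return operation


def require_key(key):
    def operation(data):
        if key not in data:
            raise KeyError(key)
    return operation


config = {"mode": "standby"}

# Both steps succeed, so both changes are committed together.
ok = apply_atomically(config, [set_key("mode", "active"), set_key("owner", "node-1")])

# The second step fails, so the first step's change is not kept either.
failed = apply_atomically(config, [set_key("mode", "broken"), require_key("missing")])

print(ok, failed, config)
```

After the failed batch, `config` still holds the values from the last successful commit, which is the consistency property the paragraph above describes.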
The Zookeeper framework was originally built at Yahoo! to make their applications accessible in an easy and robust manner. Later, Apache Zookeeper became a standard coordination service used by Hadoop, HBase, and other distributed frameworks.
A distributed application can run on multiple systems in a network at the same time, coordinating among them to complete a particular task quickly and efficiently. Complex, time-consuming tasks that would take hours in a non-distributed application (running on a single system) can often be done in minutes by a distributed application that uses the computing capabilities of all the systems involved.
The time to complete the task can be further reduced by configuring the distributed application to run on more systems. A group of systems on which a distributed application is running is called a Cluster, and each machine running in a cluster is called a Node.
A distributed application has two parts: a server application and a client application. Server applications are the part that is actually distributed, and they share a common interface so that clients can connect to any server in the cluster and get the same result. Client applications are the tools used to interact with a distributed application.
Apache Zookeeper is a service used by a cluster (a group of nodes) to coordinate among themselves and maintain shared data with robust synchronization techniques. Zookeeper is itself a distributed application that provides services for writing distributed applications. The common services provided by Zookeeper are as follows:
Naming service: Identifying the nodes in a cluster by name. It is similar to DNS, but for nodes.
Configuration management: Up-to-date configuration information of the system for a joining node.
Cluster management: Joining or leaving of a node in a cluster, and node status in real time.
Leader election: Electing a node as leader for coordination purposes.
Locking and synchronization service: Locking the data while modifying it. This mechanism helps with automatic failure recovery when connecting to other distributed applications such as Apache HBase.
Highly reliable data registry: Availability of data even when one or a few nodes are down.
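As an illustration of how cluster management and leader election fit together, the sketch below simulates a common election approach in which every joining node receives an increasing sequence number and the member with the lowest number acts as leader. The `ToyElection` class is a hypothetical in-memory model written for this example; it does not talk to a Zookeeper server and does not use any real client API.

```python
class ToyElection:
    """In-memory model of sequence-based leader election (illustration only)."""

    def __init__(self):
        self._next_sequence = 0
        self._members = {}  # sequence number -> node name

    def join(self, name):
        # Each joining node gets the next sequence number.
        sequence = self._next_sequence
        self._next_sequence += 1
        self._members[sequence] = name
        return sequence

    def leave(self, sequence):
        # Models a node leaving the cluster or going down.
        self._members.pop(sequence, None)

    def leader(self):
        # The member holding the lowest sequence number is the leader.
        if not self._members:
            return None
        return self._members[min(self._members)]


election = ToyElection()
seq_a = election.join("node-a")
seq_b = election.join("node-b")
seq_c = election.join("node-c")

first_leader = election.leader()

# When the current leader leaves, the next-lowest member takes over.
election.leave(seq_a)
second_leader = election.leader()

print(first_leader, second_leader)
```

The same membership table also answers the cluster management question of which nodes are currently present, which is why the two services are often described together.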