Understanding ResourceManager's High Availability
The ResourceManager is a per-cluster service in YARN. It manages the cluster's resources and schedules applications based on resource availability. What if this one service goes down, or the node running it is cut off from the network? The whole cluster would become unusable, because the clients' only point of contact would be unavailable. Running applications would also be unable to acquire cluster resources for task execution or to send status updates.
The ResourceManager service is therefore a single point of failure in the cluster. Hadoop 2.4 addresses this issue by introducing the High Availability feature for the ResourceManager service in YARN.
Architecture
A cluster configured with High Availability of ResourceManager has multiple ResourceManager services running; only one of them is active at a time and the rest are in standby state. Clients always connect to the active ResourceManager service...
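The active/standby arrangement described above can be illustrated with a small, self-contained Python model. This is only a sketch under stated assumptions: the class ResourceManager, the functions find_active and fail_over, and the IDs rm1 to rm3 are hypothetical names chosen for illustration and are not part of the YARN API. The sketch does not model how the real service detects a failure or decides which standby to promote; it simply promotes the first standby in the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResourceManager:
    """Toy stand-in for one ResourceManager instance (hypothetical model)."""
    rm_id: str               # illustrative ID such as "rm1"
    state: str = "standby"   # one of "active", "standby", "down"


def find_active(rms: List[ResourceManager]) -> Optional[ResourceManager]:
    """Return the single active instance, or None if there is none."""
    active = [rm for rm in rms if rm.state == "active"]
    if len(active) > 1:
        # The HA design allows only one active instance at a time.
        raise RuntimeError("more than one active ResourceManager")
    return active[0] if active else None


def fail_over(rms: List[ResourceManager]) -> Optional[ResourceManager]:
    """Mark the current active instance as down and promote a standby.

    Assumption: the first standby in list order is promoted. Returns the
    newly active instance, or None if no standby is left.
    """
    current = find_active(rms)
    if current is not None:
        current.state = "down"
    for rm in rms:
        if rm.state == "standby":
            rm.state = "active"
            return rm
    return None


if __name__ == "__main__":
    cluster = [
        ResourceManager("rm1", "active"),
        ResourceManager("rm2"),
        ResourceManager("rm3"),
    ]
    # Clients always talk to whichever instance is currently active.
    print("clients connect to:", find_active(cluster).rm_id)
    fail_over(cluster)  # simulate losing the active instance
    print("clients connect to:", find_active(cluster).rm_id)
```

In this model a client never picks an instance directly; it asks for the active one, so a failover changes the answer without changing the client logic.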