JobTracker Failover

If the node running the JobTracker fails and the Warden on the JobTracker node is unable to restart it, Warden starts a new instance of the JobTracker. The Warden on every JobTracker node watches the JobTracker’s znode for changes. When the active JobTracker’s znode is deleted, the Warden daemons on other JobTracker nodes attempt to launch the JobTracker. The Warden service on each JobTracker node works with the Zookeeper to ensure that only one JobTracker is running in the cluster.

In order for failover to occur, at least two nodes in the cluster should include the JobTracker role. No further configuration is required.