Cluster-Wide Blacklisting

A TaskTracker can be blacklisted cluster-wide for any of the following reasons:

  • The number of times successful jobs have blacklisted the TaskTracker (the fault count) exceeds the value of mapred.max.tracker.blacklists.
    NOTE: The parameter mapred.job.impact.blacklisting in the mapred-site.xml file lets you specify whether job failures should count toward the threshold set with mapred.max.tracker.blacklists. This parameter can be helpful when you are testing and know that jobs are likely to fail.

  • The TaskTracker has been manually blacklisted with the hadoop job -blacklist-tracker <host> command.
  • The status of the TaskTracker (as reported by a user-provided health-check script) is not healthy.
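
The three conditions above can be read as a simple "any of" check. The following Python sketch only models that reading and is not Hadoop code: the TrackerStatus class, the blacklist_reasons function, and the threshold value 4 used in the usage lines are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrackerStatus:
    """Hypothetical snapshot of one TaskTracker; not a Hadoop data structure."""
    fault_count: int            # blacklists received from successful jobs
    manually_blacklisted: bool  # set with `hadoop job -blacklist-tracker <host>`
    healthy: bool               # result reported by the health-check script


def blacklist_reasons(status: TrackerStatus, max_tracker_blacklists: int) -> List[str]:
    """Return every cluster-wide blacklisting reason that applies to the tracker."""
    reasons = []
    # The text says the fault count must exceed the threshold, so use a strict comparison.
    if status.fault_count > max_tracker_blacklists:
        reasons.append("fault count exceeds mapred.max.tracker.blacklists")
    if status.manually_blacklisted:
        reasons.append("manually blacklisted")
    if not status.healthy:
        reasons.append("health-check script reports unhealthy")
    return reasons


if __name__ == "__main__":
    # Illustrative values only; 4 stands in for mapred.max.tracker.blacklists.
    status = TrackerStatus(fault_count=5, manually_blacklisted=False, healthy=True)
    print(blacklist_reasons(status, max_tracker_blacklists=4))
```

An empty list means that none of the listed reasons applies. Returning all matching reasons, rather than a single boolean, makes it easier to see whether a restart would be needed to un-blacklist the TaskTracker.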

When a TaskTracker is blacklisted, tasks that are already running on it are allowed to finish, but no further tasks are scheduled on it. If the TaskTracker was blacklisted because it exceeded mapred.max.tracker.blacklists or through the hadoop job -blacklist-tracker <host> command, un-blacklisting it requires a TaskTracker restart.

At most 50% of the TaskTrackers in a cluster can be blacklisted at any one time.

After 24 hours, a blacklisted TaskTracker is automatically removed from the blacklist and can accept tasks again.
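
The 50% cap and the 24-hour expiry can be illustrated with a little arithmetic. The Python sketch below is a model of the two rules as stated here, not the scheduler's implementation. The function names are hypothetical, and two details are assumptions: that the cap is inclusive (exactly half of the cluster may be blacklisted) and that the 24 hours are counted from the moment of blacklisting.

```python
from datetime import datetime, timedelta

# "After 24 hours" from the text; the starting point of the interval is an assumption.
BLACKLIST_DURATION = timedelta(hours=24)


def can_blacklist_another(total_trackers: int, currently_blacklisted: int) -> bool:
    """True if blacklisting one more tracker keeps the cluster at or below 50%."""
    # Integer arithmetic avoids floating-point rounding: (n + 1) / total <= 1/2.
    return (currently_blacklisted + 1) * 2 <= total_trackers


def blacklist_expired(blacklisted_at: datetime, now: datetime) -> bool:
    """True once at least 24 hours have passed since the tracker was blacklisted."""
    return now - blacklisted_at >= BLACKLIST_DURATION


if __name__ == "__main__":
    # With 10 TaskTrackers, a fifth blacklisting is allowed but a sixth is not.
    print(can_blacklist_another(total_trackers=10, currently_blacklisted=4))
    print(can_blacklist_another(total_trackers=10, currently_blacklisted=5))

    start = datetime(2024, 1, 1, 8, 0)
    print(blacklist_expired(start, start + timedelta(hours=23)))
    print(blacklist_expired(start, start + timedelta(hours=24)))
```

For a cluster with an odd number of TaskTrackers, this model rounds the cap down, so a 5-node cluster could have at most 2 blacklisted nodes. The sketch does not model the restart requirement for TaskTrackers that were blacklisted manually or through the fault count.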