Service Layout Guidelines for Large Clusters

Follow these guidelines when deciding which services to keep on separate nodes in large clusters:

  • JobTracker and ResourceManager on ZooKeeper nodes: Avoid running the JobTracker and ResourceManager services on nodes that are running the ZooKeeper service. On large clusters, the JobTracker and ResourceManager services can consume significant resources.
  • MySQL on CLDB nodes: Avoid running the MySQL server that supports the MapR Metrics service on a CLDB node. Consider running the MySQL server on a machine external to the cluster to prevent the MySQL server’s resource needs from affecting services on the cluster.
  • TaskTracker on CLDB or ZooKeeper nodes: When the TaskTracker service is running on a node that is also running the CLDB or ZooKeeper services, consider reducing the number of task slots that this node's instance of the TaskTracker service provides.
  • Webserver on CLDB nodes: Avoid running the webserver on CLDB nodes. Queries to the MapR Metrics service can impose a bandwidth load that reduces CLDB performance.
  • JobTracker: On clusters with more than 250 nodes, run the JobTracker services on dedicated nodes.
  • ResourceManager: On clusters with more than 250 nodes, run the ResourceManager services on dedicated nodes.
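
The guidelines above can be expressed as a small check over a planned layout. The following Python sketch is illustrative only and is not a MapR tool: the function name check_layout, the lowercase service labels, the node names, and the layout format (a mapping from node name to the set of services planned for it) are all assumptions made for this example. It also assumes that "dedicated" means the node runs no other service listed in the plan.

```python
from typing import Dict, List, Optional, Set

# Threshold taken from the guidelines: clusters with over 250 nodes.
LARGE_CLUSTER_NODES = 250

# (service, services it should not share a node with, advice).
# The lowercase service labels are hypothetical names chosen for this sketch.
COLOCATION_RULES = [
    ("jobtracker", {"zookeeper"}, "avoid running these on the same node"),
    ("resourcemanager", {"zookeeper"}, "avoid running these on the same node"),
    ("mysql", {"cldb"}, "consider a MySQL server external to the cluster"),
    ("tasktracker", {"cldb", "zookeeper"}, "consider reducing task slots on this node"),
    ("webserver", {"cldb"}, "avoid running these on the same node"),
]


def check_layout(layout: Dict[str, Set[str]],
                 cluster_size: Optional[int] = None) -> List[str]:
    """Return one advisory string per guideline that the planned layout violates.

    layout maps a node name to the set of services planned for that node.
    cluster_size defaults to the number of nodes in layout.
    """
    size = len(layout) if cluster_size is None else cluster_size
    findings: List[str] = []
    for node in sorted(layout):
        services = layout[node]
        # Co-location guidelines.
        for service, conflicts, advice in COLOCATION_RULES:
            if service not in services:
                continue
            clash = sorted(services & conflicts)
            if clash:
                findings.append(
                    f"{node}: {service} shares a node with {', '.join(clash)}: {advice}"
                )
        # Dedicated-node guideline for large clusters.
        # Assumption: "dedicated" means no other planned service on the node.
        if size > LARGE_CLUSTER_NODES:
            for service in ("jobtracker", "resourcemanager"):
                if service in services and len(services) > 1:
                    findings.append(
                        f"{node}: {service} should run on a dedicated node "
                        f"(cluster has over {LARGE_CLUSTER_NODES} nodes)"
                    )
    return findings


# Hypothetical three-node excerpt of a plan for a 300-node cluster.
example_layout = {
    "node1": {"cldb", "zookeeper", "tasktracker"},
    "node2": {"jobtracker"},
    "node3": {"cldb", "webserver", "mysql"},
}

for finding in check_layout(example_layout, cluster_size=300):
    print(finding)
```

Keeping the rules in a table rather than in nested conditionals makes it easy to add or relax a guideline for a particular site. The TaskTracker rule is reported as advice rather than as an error because the guideline recommends reducing task slots, not moving the service.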