Monitoring Cluster Health

You can check general cluster health by issuing the node heatmap command from the CLI, or by logging in to the MapR Control System (MCS). To check general cluster health from the MCS, expand the Cluster view and click Dashboard to display a summary of information about the cluster.
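For scripted checks, the CLI route can be wrapped in a small helper. The following is a minimal sketch, not an official client. It assumes the command is invoked as maprcli node heatmap, that the general -json output flag and the -cluster option are accepted, and that maprcli is on the PATH of the machine running the script. The function names heatmap_command and run_heatmap are hypothetical, and the structure of the returned JSON is deliberately left uninterpreted.

```python
import json
import shutil
import subprocess


def heatmap_command(cluster=None):
    """Build the argument list for the node heatmap command.

    Assumption: `-json` and `-cluster` are accepted by `maprcli node heatmap`.
    """
    cmd = ["maprcli", "node", "heatmap", "-json"]
    if cluster is not None:
        cmd += ["-cluster", cluster]
    return cmd


def run_heatmap(cluster=None):
    """Run the command and return the parsed JSON output.

    Returns None when maprcli is not installed on this machine.
    """
    if shutil.which("maprcli") is None:
        return None
    result = subprocess.run(
        heatmap_command(cluster),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Calling run_heatmap() on a cluster node would return whatever JSON document the command prints; inspect that output on your own cluster before relying on any particular field.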

The dashboard shows the Cluster Heatmap, which presents the health of the nodes in the cluster, organized by rack. To see the meaning of each heatmap symbol, click the wrench symbol to open a display that explains the symbols.

As an example, consider a cluster heatmap whose first three racks show the following: three nodes on rack00 are undergoing maintenance, six nodes on rack01 are degraded, and all 20 nodes on rack02 are critical. Each state is indicated by its own heatmap symbol.
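The per-rack reading of the heatmap described above can be sketched as a small tally. This is an illustration only: the dictionary layout, the state name "healthy", the rack size of 20 for rack00 and rack01, and the function names are assumptions made for the example, not the format MCS or the CLI actually uses.

```python
from collections import Counter

# Hypothetical heatmap data: rack name -> list of per-node states.
# Only the maintenance, degraded, and critical counts come from the text;
# the "healthy" filler nodes and 20-node rack sizes are assumed.
EXAMPLE_HEATMAP = {
    "rack00": ["maintenance"] * 3 + ["healthy"] * 17,
    "rack01": ["degraded"] * 6 + ["healthy"] * 14,
    "rack02": ["critical"] * 20,
}


def summarize(heatmap):
    """Count how many nodes are in each state, per rack."""
    return {rack: Counter(states) for rack, states in heatmap.items()}


def unhealthy_counts(heatmap, healthy_state="healthy"):
    """Return, per rack, the number of nodes that are not healthy."""
    return {
        rack: sum(1 for state in states if state != healthy_state)
        for rack, states in heatmap.items()
    }


if __name__ == "__main__":
    for rack, counts in summarize(EXAMPLE_HEATMAP).items():
        print(rack, dict(counts))
```

A tally like this makes it easy to flag racks where every node is affected, as with rack02 in the example.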

On the right side of this display, you can also see a Cluster Utilization summary, MapReduce statistics, the names of services and their status, and the number of mounted and unmounted volumes. In the lower-left corner of the display, you can see a list of alarms. For a full explanation of the alarms that can be raised on a cluster, see the Alarms Reference.