Cluster Heatmap Pane
The Cluster Heatmap pane displays the health of the nodes in the cluster, by rack. Each node appears as a colored square to show its health at a glance.
Click the small wrench icon at the upper right of the Cluster Heatmap pane to slide a key to the color-coded heatmap display into view. At the top of this key, you can set the refresh rate for the display (in seconds) and the number of columns to display. For example, with a 10-column display, 20 nodes are shown across two rows. Click the wrench icon again to slide the key back out of view.
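The column setting determines how many rows the heatmap needs. The following is a minimal sketch of that arithmetic, not the pane's actual layout code. The function names `heatmap_rows` and `grid_position` are hypothetical, and the left-to-right, row-by-row fill order is an assumption.

```python
import math


def heatmap_rows(node_count, columns):
    """Return how many rows are needed to show node_count squares."""
    if columns <= 0:
        raise ValueError("columns must be a positive integer")
    return math.ceil(node_count / columns)


def grid_position(index, columns):
    """Return (row, column) for a zero-based node index.

    Assumes squares fill each row left to right before starting
    the next row; the real pane may order nodes differently.
    """
    return divmod(index, columns)


# The example from the text: 20 nodes in a 10-column display.
rows = heatmap_rows(20, 10)
last_square = grid_position(19, 10)
```

With these inputs, `rows` is 2 and the twentieth node lands in the last column of the second row.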
The left drop-down menu at the top of the pane lets you choose which data is displayed. Some of the choices are shown below.
Heatmap legend by category
The heatmap legend changes depending on the criteria you select from the drop-down menu. All the criteria and their corresponding legends are shown here.
Health
- Healthy - all services up, MapR-FS and all disks OK, and normal heartbeat
- Upgrading - upgrade in process
- Degraded - one or more services down, or no heartbeat for over 1 minute
- Maintenance - routine maintenance in process
- Critical - MapR-FS Inactive/Dead/Replicate, or no heartbeat for over 5 minutes
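The health categories above can be read as a set of rules applied to each node. The sketch below is one possible reading, not the product's implementation. The `NodeStatus` fields and the `classify_health` function are hypothetical, and the order in which the rules are checked (Maintenance, Upgrading, Critical, Degraded, Healthy) is an assumption, since the legend does not state a precedence. Disk state is left out because the legend does not say which category a disk problem maps to.

```python
from dataclasses import dataclass

# MapR-FS states the legend lists under Critical.
CRITICAL_FS_STATES = {"Inactive", "Dead", "Replicate"}


@dataclass
class NodeStatus:
    """Hypothetical snapshot of the inputs the legend mentions."""
    services_down: int = 0
    mapr_fs_state: str = "OK"  # assumed label for a normal MapR-FS
    seconds_since_heartbeat: float = 0.0
    upgrading: bool = False
    in_maintenance: bool = False


def classify_health(node):
    """Map a NodeStatus to a legend category (assumed precedence)."""
    if node.in_maintenance:
        return "Maintenance"
    if node.upgrading:
        return "Upgrading"
    # Critical: MapR-FS problem, or no heartbeat for over 5 minutes.
    if (node.mapr_fs_state in CRITICAL_FS_STATES
            or node.seconds_since_heartbeat > 5 * 60):
        return "Critical"
    # Degraded: a service is down, or no heartbeat for over 1 minute.
    if node.services_down > 0 or node.seconds_since_heartbeat > 60:
        return "Degraded"
    return "Healthy"
```

For example, a node whose last heartbeat was 90 seconds ago would be classified as Degraded, and the same node would become Critical once the gap passes 5 minutes.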
The following table shows the legend for the utilization heatmap displays: CPU, memory, and disk space.
CPU Utilization | Memory Utilization | Disk Space Utilization |
---|---|---|
CPU < 50% | Memory < 50% | Used < 50% |
CPU < 80% | Memory < 80% | Used < 80% |
CPU >= 80% | Memory >= 80% | Used >= 80% |
Unknown | Unknown | Unknown |
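The three utilization legends share the same thresholds, so a single function can sketch how a percentage maps to a legend bucket. This is an illustration only. The function name `utilization_bucket` is hypothetical, and using `None` to represent a missing measurement is an assumption.

```python
def utilization_bucket(percent):
    """Return the legend bucket for a CPU, memory, or disk percentage.

    percent is a number from 0 to 100, or None when no value is
    available (assumed to correspond to the "Unknown" legend entry).
    """
    if percent is None:
        return "Unknown"
    if percent < 50:
        return "< 50%"
    if percent < 80:
        return "< 80%"
    return ">= 80%"
```

Because the checks run in order, a value of exactly 50 falls in the "< 80%" bucket and a value of exactly 80 falls in the ">= 80%" bucket.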
Alarms
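The following table shows the legend for each alarm, both when the alarm is clear and when it is raised.
Alarm | Legend when clear | Legend when raised |
---|---|---|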
Too Many Containers Alarm | Containers within limit | Containers exceeded limit |
Duplicate HostId Alarm | No duplicate host id detected | Duplicate host id detected |
UID Mismatch Alarm | No UID mismatch detected | UID mismatch detected |
No Heartbeat Detected Alarm | Node heartbeat detected | Node heartbeat not detected |
TaskTracker Local Dir Full Alarm | TaskTracker local directory is not full | TaskTracker local directory full |
PAM Misconfigured Alarm | PAM configured | PAM misconfigured |
High FileServer Memory Alarm | Fileserver memory OK | Fileserver memory high |
Cores Present Alarm | No core files | Core files present |
Installation Directory Full Alarm | Installation Directory free | Installation Directory full |
Metrics Write Problem Alarm | Metrics writing to Database | Metrics unable to write to Database |
Root Partition Full Alarm | Root partition free | Root partition full |
HostStats Down Alarm | HostStats running | HostStats down |
Webserver Down Alarm | Webserver running | Webserver down |
NFS Gateway Down Alarm | NFS Gateway running | NFS Gateway down |
HBase RegionServer Down Alarm | HBase RegionServer running | HBase RegionServer down |
HBase Master Down Alarm | HBase Master running | HBase Master down |
TaskTracker Down Alarm | TaskTracker running | TaskTracker down |
JobTracker Down Alarm | JobTracker running | JobTracker down |
FileServer Down Alarm | FileServer running | FileServer down |
CLDB Down Alarm | CLDB running | CLDB down |
Time Skew Alarm | Time OK | Time skew alarm(s) |
Software Installation & Upgrades Alarm | Version OK | Version alarm(s) |
Disk Failure(s) Alarm | Disks OK | Disk alarm(s) |
Excessive Logging Alarm | No debug | Debugging |
Zoomed view
You can see a zoomed view of all the nodes in the cluster by moving the zoom slide bar. The zoomed display reveals more details about each node, based on the criteria you chose from the drop-down menu. In this example, CPU Utilization is displayed for each node.
Clicking a rack name navigates to the Nodes view, which provides more detailed information about the nodes in the rack.
Clicking a colored square navigates to the Node Properties View, which provides detailed information about the node.