Alarm Status View

The Alarm Status view displays the following information about nodes in the cluster:

  • Hlth - each node's health: healthy, degraded, critical, or maintenance
  • Hostname - DNS hostname for nodes in this cluster
  • Version Alarm - one or more services on the node are running an unexpected version
  • No Heartbeat Alarm - the node is not undergoing maintenance, and no heartbeat has been detected for more than 5 minutes
  • UID Mismatch Alarm - services in the cluster are running under different user accounts (UIDs)
  • Duplicate HostId Alarm - two or more nodes in the cluster have the same host id
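
The No Heartbeat and Duplicate HostId conditions in the list above can be sketched as simple checks. This is only an illustration of the stated rules, not the product's implementation; the function names, the arguments, and the node-to-host-id mapping are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Threshold taken from the list above: no heartbeat for over 5 minutes.
HEARTBEAT_TIMEOUT = timedelta(minutes=5)


def no_heartbeat_alarm(last_heartbeat, now, in_maintenance):
    """Return True when the No Heartbeat condition holds (hypothetical helper)."""
    if in_maintenance:
        # Nodes undergoing maintenance are excluded from this alarm.
        return False
    return now - last_heartbeat > HEARTBEAT_TIMEOUT


def duplicate_host_ids(host_id_by_node):
    """Return the host ids that are shared by two or more nodes."""
    counts = Counter(host_id_by_node.values())
    return {host_id for host_id, count in counts.items() if count >= 2}


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, 0)
    print(no_heartbeat_alarm(now - timedelta(minutes=6), now, in_maintenance=False))
    print(duplicate_host_ids({"node1": "abc", "node2": "abc", "node3": "def"}))
```

The maintenance check comes first so that a node taken down on purpose never raises the alarm, matching the wording of the list entry.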

Alarm Status Alerts

  • Too Many Containers Alarm - the number of containers on the node has reached the maximum limit
  • Excess Logs Alarm - debug logging is enabled on the node (debug logging generates enormous amounts of data and can fill up disk space)
  • Disk Failure Alarm - a disk has failed on the node
  • Time Skew Alarm - the clock on the node is out of sync with the master CLDB by more than 20 seconds
  • Root Partition Full Alarm - the root partition ("/") on the node is running out of space (99% full)
  • Installation Directory Full Alarm - the partition /opt/mapr on the node is running out of space (95% full)
  • Core Present Alarm - a service on the node has crashed and created a core dump file
  • High FileServer Memory Alarm - the memory consumed by the FileServer service on the node is high
  • Pam Misconfigured Alarm - PAM authentication on the node is configured incorrectly
  • TaskTracker Local Directory Full Alarm - the local directory used by the TaskTracker on the node is full, so the TaskTracker cannot operate
  • CLDB Alarm - the CLDB service on the node has stopped running
  • FileServer Alarm - the FileServer service on the node has stopped running
  • JobTracker Alarm - the JobTracker service on the node has stopped running
  • TaskTracker Alarm - the TaskTracker service on the node has stopped running
  • HBase Master Alarm - the HBase Master service on the node has stopped running
  • HBase RegionServer Alarm - the HBase RegionServer service on the node has stopped running
  • NFS Gateway Alarm - the NFS service on the node has stopped running
  • WebServer Alarm - the WebServer service on the node has stopped running
  • HostStats Alarm - the HostStats service on the node has stopped running
  • Metrics Write Problem Alarm - the metric data was not written to the database
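
The numeric thresholds given for the Time Skew, Root Partition Full, and Installation Directory Full alarms can be illustrated with a short sketch. The function names are hypothetical, and the exact comparison is an assumption (strictly greater than for skew, at-or-above for disk usage). How the product itself measures skew and disk usage is not described here.

```python
import shutil

# Thresholds as stated in the alarm descriptions above.
TIME_SKEW_LIMIT_SECONDS = 20
ROOT_FULL_PERCENT = 99          # root partition ("/")
INSTALL_DIR_FULL_PERCENT = 95   # /opt/mapr partition


def time_skew_alarm(node_epoch_seconds, cldb_epoch_seconds):
    """True when the node clock differs from the master CLDB clock by more than 20 s."""
    return abs(node_epoch_seconds - cldb_epoch_seconds) > TIME_SKEW_LIMIT_SECONDS


def percent_full(path):
    """Percentage of the partition containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def partition_full_alarm(percent_used, threshold_percent):
    """True when usage is at or above the threshold (assumed comparison)."""
    return percent_used >= threshold_percent


if __name__ == "__main__":
    print(time_skew_alarm(1_700_000_025, 1_700_000_000))
    print(partition_full_alarm(percent_full("/"), ROOT_FULL_PERCENT))
```

Note that `shutil.disk_usage` reports raw used and total bytes, so the result can differ slightly from the percentage shown by `df`, which accounts for reserved blocks.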