Using HPE Ezmeral Data Fabric Monitoring (Spyglass Initiative)

HPE Ezmeral Data Fabric Monitoring (part of the Spyglass initiative) provides the ability to collect, store, and view metrics and logs for nodes, services, and jobs/applications.

Metric Monitoring

Administrators can monitor the current status of the cluster and anticipate future cluster requirements with dashboards. For example, you can use metrics dashboards to visualize the following:
Storage Utilization
Use metrics dashboards to monitor storage trends. For example, you can compare the volume of file system usage at different times to the file system capacity and then allocate resources to the file system accordingly.
Node Utilization
Use metrics dashboards to check for node overload. For example, if the CPU usage is high on a few nodes, you may want to distribute the load across more nodes for better performance and efficiency.
HPE Ezmeral Data Fabric Database Operational Trends
Use metrics dashboards to display historical trends for HPE Ezmeral Data Fabric Database operations. For example, if a user reports that HPE Ezmeral Data Fabric Database is slow, the historical trends for row scan, get, and put operations can help you identify the nodes on which performance degrades.
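
The storage and node utilization checks described above come down to simple arithmetic on collected samples. The following is a minimal sketch of that reasoning, not part of the monitoring product: the function names, the 85 percent threshold, and the input shapes (a mapping of node name to CPU percentages, and byte counts for usage and capacity) are assumptions made for illustration.

```python
from statistics import mean


def storage_utilization(used_bytes, capacity_bytes):
    """Return file system usage as a percentage of capacity."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return 100.0 * used_bytes / capacity_bytes


def overloaded_nodes(cpu_samples, threshold=85.0):
    """Return the sorted names of nodes whose mean CPU usage exceeds threshold.

    cpu_samples is a hypothetical mapping of node name to a list of
    CPU usage percentages; nodes with no samples are skipped.
    """
    return sorted(
        node
        for node, samples in cpu_samples.items()
        if samples and mean(samples) > threshold
    )
```

Comparing `storage_utilization` values computed at different times gives the storage trend, and a short list from `overloaded_nodes` suggests that load could be spread across more nodes.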

Log Monitoring

Administrators can use dashboards to visualize, search, and review logs when troubleshooting issues. For example, you can use log dashboards to troubleshoot the following issues:
Service Failures
When metrics indicate that one or more services are down, use log dashboards to check the logs for each failed service and drill down to each associated node.
Application Failures
When an application or job fails, use log dashboards to identify possible bottlenecks. For example, you can search the logs for a given application ID across all the nodes in the cluster.
File System Performance
When users experience slowness with the file system or with NFS for the HPE Ezmeral Data Fabric, use log dashboards to search the HPE Ezmeral Data Fabric file system logs for service errors or application issues.
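
Searching the logs for an application ID across all nodes, as described above, amounts to filtering log records and grouping the matches by node. The sketch below illustrates that idea on in-memory data only. The record layout (dictionaries with `node` and `message` keys), the function name, and the sample application ID are hypothetical and do not reflect the product's log schema or API.

```python
from collections import defaultdict


def find_application_logs(records, application_id):
    """Group log messages that mention application_id by node name.

    records is a hypothetical iterable of dicts with "node" and
    "message" keys; the real log schema may differ.
    """
    matches = defaultdict(list)
    for record in records:
        if application_id in record["message"]:
            matches[record["node"]].append(record["message"])
    return dict(matches)


# Hypothetical sample records for illustration only.
sample_records = [
    {"node": "node1", "message": "application_0001 container launched"},
    {"node": "node2", "message": "heartbeat ok"},
    {"node": "node2", "message": "application_0001 failed: out of memory"},
]
per_node = find_application_logs(sample_records, "application_0001")
```

Here `per_node` maps each node that logged the application ID to its matching messages, which mirrors drilling down from a failed application to the nodes involved.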
