Analyzing Job Metrics

The MapR Metrics service collects and displays analytics information about the Hadoop jobs, tasks, and task attempts that run on the nodes in your cluster. You can use this information to examine specific aspects of your cluster's performance at a very granular level, enabling you to monitor how your cluster responds to changing workloads and optimize your Hadoop jobs or cluster configuration. The analytics information collected by the MapR Metrics service is stored in a MySQL database. The server running MySQL does not have to be a node in the cluster, but the nodes in your cluster must have access to the server.

The MapR Control System presents the jobs running on your cluster and the tasks that make up a specific job as a sortable list, along with histograms and line charts that represent the distribution of a particular metric. You can sort the list by the metric you're interested in to quickly find any outliers, then display specific detailed information about a job or task attempt that you want to learn more about. The filtering capabilities of the MapR Control System enable you to narrow down the display of data to the ranges you're interested in.

The MapR Control System displays data using histograms (for jobs) and line charts (for jobs and task attempts). All histograms and charts are implemented in HTML5, CSS, and JavaScript, so they display in your browser or on your mobile device without requiring plug-ins. The histograms presented by MapR Metrics divide continuous data, such as a range of job durations, into a sequence of discrete bins. For example, a range of durations from 0 to 10000 seconds could be presented as 20 bins, each covering a 500-second band. The height of the histogram's bar for each bin represents the number of jobs with a duration in that bin's range. The line charts in MapR Metrics display the trend over time for the value of a specific metric.
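The binning described above can be illustrated with a minimal Python sketch. This is not MapR's implementation: the function name bin_durations, the sample durations, and the way boundary values are assigned to bins are all assumptions made for illustration.

```python
def bin_durations(durations, lo=0, hi=10000, bins=20):
    """Count how many durations fall into each equal-width bin over [lo, hi]."""
    width = (hi - lo) / bins  # 500 seconds per bin with the defaults
    counts = [0] * bins
    for d in durations:
        if d < lo or d > hi:
            continue  # ignore values outside the displayed range
        # Assumption: a value equal to the upper bound goes in the last bin.
        idx = min(int((d - lo) // width), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical job durations in seconds (illustrative data only).
sample = [12, 480, 499, 500, 1250, 9999, 10000]
counts = bin_durations(sample)

# Print the non-empty bins with their ranges and bar heights.
for i, n in enumerate(counts):
    if n:
        print(f"{i * 500:>5}-{(i + 1) * 500:<5} seconds: {n} job(s)")
```

Each entry in counts corresponds to the height of one histogram bar. Here a bin includes its lower bound and excludes its upper bound, except for the last bin, which includes both.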

A Community Edition license for MapR displays basic information. The Enterprise Edition license adds sophisticated graphs and histograms that give you access to trends and detailed statistics. Either license provides access to MapR Metrics from the MapR Control System and from the job table and task table REST API.

The following section provides information about the job metrics database, results filtering, metrics protocol buffers, and an example: