MapR Metrics and Job Performance

The MapR Metrics service collects and displays detailed analytics about the tasks and task attempts that comprise your Hadoop job. You can use the MapR Control System to display charts based on those analytics and diagnose performance issues with a particular job.

For example, if a job reports 100% map task completion but only 99% reduce task completion, you can filter the views in the MapR Control System to list only reduce tasks. You can then sort that list by duration to see whether any reduce task attempts are taking abnormally long to execute, and display detailed information about those attempts, including their log files.
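The same filter-and-sort reasoning can be sketched offline in Python. This is only an illustration of the idea, not the MapR Metrics API: the record layout, the field names (`type`, `duration_s`), the sample values, and the "three times the median" threshold are all assumptions made for this example.

```python
from statistics import median

# Hypothetical task-attempt records. The field names and values are
# illustrative only and are not the MapR Metrics schema.
attempts = [
    {"id": "attempt_r_000000_0", "type": "REDUCE", "duration_s": 118},
    {"id": "attempt_r_000001_0", "type": "REDUCE", "duration_s": 124},
    {"id": "attempt_r_000002_0", "type": "REDUCE", "duration_s": 131},
    {"id": "attempt_r_000003_0", "type": "REDUCE", "duration_s": 1460},
    {"id": "attempt_m_000000_0", "type": "MAP", "duration_s": 45},
]


def slow_reduce_attempts(attempts, factor=3.0):
    """Return reduce attempts, longest first, that ran longer than
    `factor` times the median reduce duration (an assumed heuristic)."""
    # Step 1: filter the list down to reduce task attempts.
    reduces = [a for a in attempts if a["type"] == "REDUCE"]
    if not reduces:
        return []
    # Step 2: sort by duration, longest first.
    ranked = sorted(reduces, key=lambda a: a["duration_s"], reverse=True)
    # Step 3: keep only the attempts that are far above the typical duration.
    typical = median(a["duration_s"] for a in reduces)
    return [a for a in ranked if a["duration_s"] > factor * typical]


for attempt in slow_reduce_attempts(attempts):
    print(attempt["id"], attempt["duration_s"])
```

With the sample records above, only the attempt that ran for 1460 seconds is flagged. In practice you would then open that attempt's details and log files in the MapR Control System.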

You can also use the Metrics displays to compare the performance of different jobs. Consider two jobs that perform the same function: one written in Python using pydoop, and the other written in C++ using Pipes. To evaluate how these jobs perform on the cluster, open two browser windows, log in to the MapR Control System in each, and filter each display down to the metrics you're most interested in while the jobs are running.
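If you note down the figures you see for each job, a side-by-side comparison can be reduced to a few ratios. The sketch below assumes you have copied a handful of numbers by hand into two dictionaries. The metric names and the values are placeholders invented for this example. They are not MapR metric identifiers, and they say nothing about how pydoop or Pipes actually perform.

```python
# Placeholder per-job summaries; keys and values are invented for
# illustration and do not come from a real cluster.
job_a = {"map_avg_s": 40.0, "reduce_avg_s": 120.0, "cpu_s": 900.0}
job_b = {"map_avg_s": 50.0, "reduce_avg_s": 100.0, "cpu_s": 1200.0}


def compare_jobs(first, second):
    """Return {metric: first / second} for metrics present in both
    summaries, skipping metrics whose value in `second` is zero."""
    shared = sorted(set(first) & set(second))
    return {m: first[m] / second[m] for m in shared if second[m] != 0}


for metric, ratio in compare_jobs(job_a, job_b).items():
    # A ratio below 1.0 means the first job used less of that metric.
    print(f"{metric}: {ratio:.2f}")
```

A ratio below 1.0 means the first job spent less on that metric than the second. Comparing several metrics at once helps avoid judging a job by a single number.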

For more information, see Analyzing Job Metrics.