TaskTracker Configuration
When changing any parameters in this section, a TaskTracker restart is required.
If mapreduce.tasktracker.prefetch.maptasks is greater than 0, you must disable Fair Scheduler with preemption and label-based job placement.
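The table below defines mapreduce.tasktracker.prefetch.maptasks as a ratio of prefetched tasks to the total number of map slots. The following is a minimal sketch of that arithmetic, not the TaskTracker's actual implementation. The function name `prefetched_map_tasks` is hypothetical, and rounding down is an assumption, because this page does not say how fractional results are handled.

```python
import math


def prefetched_map_tasks(map_slots: int, prefetch_ratio: float) -> int:
    """Estimate how many map tasks could be prefetched on one TaskTracker.

    Assumption: fractional results are rounded down. The documentation only
    states the ratio of prefetched tasks to total map slots.
    """
    if map_slots < 0:
        raise ValueError("map_slots must be non-negative")
    if prefetch_ratio < 0:
        raise ValueError("prefetch_ratio must be non-negative")
    return math.floor(map_slots * prefetch_ratio)


# A ratio of 0.25 on a node with 8 map slots allows 2 prefetched tasks.
print(prefetched_map_tasks(8, 0.25))  # 2
# A ratio of 0.0 means no tasks are prefetched.
print(prefetched_map_tasks(8, 0.0))   # 0
```

Any result greater than 0 means the scheduler restrictions described above apply.
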
Parameter |
Description |
---|---|
mapred.tasktracker.map.tasks.maximum |
The maximum number of map task slots to run simultaneously. The default value of -1 specifies that the number of map task slots is based on the total amount of memory reserved for MapReduce by the Warden. For more information, see Resource Allocation for Jobs and Applications. Default value: |
mapreduce.tasktracker.prefetch.maptasks |
The proportion of map tasks that can be scheduled in advance (prefetched) on a TaskTracker. The number is given as a ratio of prefetched tasks to the total number of map slots. For example, 0.25 means the number of prefetched tasks = 25% of the total number of map slots. The default is 0.0, which means no prefetched tasks can be scheduled. Default value: |
mapreduce.tasktracker.reserved.physicalmemory.mb.low |
This property's value sets the target memory usage level when the
TaskTracker kills tasks to reduce total memory usage. This property's value
represents a percentage of the amount in the
Default value: |
mapreduce.tasktracker.task.slowlaunch |
Set this property's value to True to wait after each task launch on nodes running critical services such as CLDB, JobTracker, and ZooKeeper. Default value: |
mapreduce.tasktracker.volume.healthcheck.interval |
This property's value defines the interval, in milliseconds, at which the TaskTracker checks the MapReduce volume located at ${mapr.localvolumes.path}/mapred/. Default value: |
mapreduce.use.maprfs |
Use MapR-FS for shuffle and sort/merge. Default value: |
mapred.userlog.retain.hours |
This property's value specifies the maximum time, in hours, to retain the user-logs after job completion. Default value: |
mapred.userlog.retain.hours.max |
This property's value specifies the highest legal value for the
Default value: |
mapred.user.jobconf.limit |
The maximum allowed size of the user jobconf. The default is set to 5 MB. Default value: |
mapred.userlog.limit.kb |
Deprecated: The maximum size of user-logs of each task in KB. 0 disables the cap. Default value: 0 |
mapreduce.use.fastreduce |
Expert: Merge map outputs without copying. Default value: |
mapred.tasktracker.reduce.tasks.maximum |
The maximum number of reduce task slots to run simultaneously. The default value of -1 specifies that the number of reduce task slots is based on the total amount of memory reserved for MapReduce by the Warden. For more information, see Resource Allocation for Jobs and Applications. Default value: |
mapred.tasktracker.ephemeral.tasks.maximum |
Reserved slot for small job scheduling. Default value: |
mapred.tasktracker.ephemeral.tasks.timeout |
Maximum time, in milliseconds, that a task is allowed to occupy an ephemeral slot. Default value: |
mapred.tasktracker.ephemeral.tasks.ulimit |
Ulimit, in bytes, on all tasks scheduled on an ephemeral slot. Default value: |
mapreduce.tasktracker.reserved.physicalmemory.mb |
Maximum physical memory that the TaskTracker should reserve for MapReduce tasks. If tasks use more than this limit, the task using the most memory is killed. Expert only: Set this value only if the TaskTracker should use a specific amount of memory for MapReduce tasks. In the MapR distribution, the Warden calculates this number based on the services configured on a node. Setting mapreduce.tasktracker.reserved.physicalmemory.mb to -1 disables physical memory accounting and task management. |
mapred.tasktracker.expiry.interval |
Expert: This property's value specifies a time interval in milliseconds. After this interval expires without any heartbeats sent, a TaskTracker is marked lost. Default value: |
mapreduce.tasktracker.heapbased.memory.management |
Expert only: Use this option to prevent swapping by not launching too many tasks. A task's memory usage is based on the maximum Java heap size (-Xmx). By default, the TaskTracker computes -Xmx based on the number of slots and the memory reserved for MapReduce tasks. See mapred.map.child.java.opts/mapred.reduce.child.java.opts. Default value: |
mapreduce.tasktracker.jvm.idle.time |
If a JVM is idle for more than mapreduce.tasktracker.jvm.idle.time (milliseconds), the TaskTracker kills it. Default value: |
mapred.max.tracker.failures |
The number of task failures of a given job on a TaskTracker after which new tasks of that job are no longer assigned to it. Default value: |
mapred.max.tracker.blacklists |
The number of blacklists for a TaskTracker by various jobs after which the TaskTracker could be blacklisted across all jobs. The TaskTracker will be given tasks later (after a day). The TaskTracker will become healthy after a restart. Default value: |
mapred.task.tracker.http.address |
This property's value specifies the HTTP server address and port for the TaskTracker. Specify 0 as the port to make the server start on a free port. Default value: |
mapred.task.tracker.report.address |
The IP address and port that the TaskTracker server listens on. Because only the tasks connect to it, it uses the local interface. Expert only: Change this value only if your host does not have a loopback interface. Default value: |
mapreduce.tasktracker.group |
Expert: Group to which TaskTracker belongs. If LinuxTaskController is
configured via the Default value: |
mapred.tasktracker.task-controller.config.overwrite |
The
Default value: |
mapred.tasktracker.indexcache.mb |
This property's value specifies the maximum amount of memory allocated by the TaskTracker for the index cache. The index cache is used when the TaskTracker serves map outputs to reducers. Default value: |
mapred.tasktracker.instrumentation |
Expert: The instrumentation class to associate with each TaskTracker. Default value:
|
mapred.task.tracker.task-controller |
This property's value specifies the TaskController that launches and manages task execution. Default value:
|
mapred.tasktracker.taskmemorymanager.killtask.maxRSS |
Set this property's value to True to kill tasks that are using maximum
memory when the total number of MapReduce tasks exceeds the limit specified
in the TaskTracker's
Default value: |
mapred.tasktracker.taskmemorymanager.monitoring-interval |
This property's value specifies an interval in milliseconds that
TaskTracker waits between monitoring the memory usage of tasks. This
property is only used when tasks memory management is enabled by setting the
property Default value: |
mapred.tasktracker.tasks.sleeptime-before-sigkill |
This property's value sets the time in milliseconds that the TaskTracker waits before sending a SIGKILL to a process after it has been sent a SIGTERM. Default value: |
mapred.temp.dir |
A shared directory for temporary files. Default value: |
mapreduce.cluster.map.userlog.retain-size |
This property's value specifies the number of bytes to retain from map task logs. The default value of -1 disables this feature. |
mapreduce.cluster.reduce.userlog.retain-size |
This property's value specifies the number of bytes to retain from reduce task logs. The default value of -1 disables this feature. |
mapreduce.heartbeat.10000 |
This property's value specifies a heartbeat time in milliseconds for a cluster of 1001 to 10000 nodes. Scales linearly between 10s and 100s. Default value: |
mapreduce.heartbeat.1000 |
This property's value specifies a heartbeat time in milliseconds for a cluster of 101 to 1000 nodes. Scales linearly between 1s and 10s. Default value: |
mapreduce.heartbeat.100 |
This property's value specifies a heartbeat time in milliseconds for a cluster of 11 to 100 nodes. Scales linearly between 300ms and 1s. Default value: |
mapreduce.heartbeat.10 |
This property's value specifies a heartbeat time in milliseconds for a cluster of 1 to 10 nodes. Default value: |
mapreduce.job.complete.cancel.delegation.tokens |
Set this property's value to False to keep delegation tokens from being unregistered or canceled from renewal when a job completes. Default value: True |
mapreduce.jobtracker.inline.setup.cleanup |
Set this property's value to True to make the JobTracker set up and clean up the job itself instead of running separate setup/cleanup tasks. Default value: False |
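
The mapreduce.heartbeat.* rows above describe heartbeat times that scale linearly with cluster size between tier boundaries. The following is a rough sketch of one way such scaling could be computed. The function `heartbeat_ms`, the `TIERS` table, and the interpolation formula are illustrative assumptions, not the JobTracker's actual code. The millisecond values are taken from the ranges quoted in the descriptions (300ms, 1s, 10s, 100s), not from confirmed defaults.

```python
# Each entry is (upper node count of the tier, heartbeat in milliseconds at
# that node count). Values mirror the ranges quoted in the table and are
# illustrative only.
TIERS = [(10, 300), (100, 1000), (1000, 10000), (10000, 100000)]


def heartbeat_ms(nodes: int, tiers=TIERS) -> float:
    """Linearly interpolate a heartbeat time for a cluster of `nodes` nodes.

    Assumption: within a tier, the heartbeat grows linearly from the previous
    tier's value to this tier's value as the node count increases.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    first_nodes, first_ms = tiers[0]
    if nodes <= first_nodes:
        # The smallest tier uses a single fixed value.
        return float(first_ms)
    for (lo_nodes, lo_ms), (hi_nodes, hi_ms) in zip(tiers, tiers[1:]):
        if nodes <= hi_nodes:
            fraction = (nodes - lo_nodes) / (hi_nodes - lo_nodes)
            return lo_ms + fraction * (hi_ms - lo_ms)
    # Beyond the largest tier, clamp to the largest configured value.
    return float(tiers[-1][1])


print(heartbeat_ms(5))    # 300.0
print(heartbeat_ms(55))   # 650.0
print(heartbeat_ms(100))  # 1000.0
```

Under these assumptions, a 55-node cluster falls halfway through the 11 to 100 node tier, so its heartbeat lands halfway between 300ms and 1s.
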