Best Practices
Disk Setup
It is not necessary to set up RAID (Redundant Array of Independent Disks) on disks used by MapR-FS. MapR uses a script called disksetup to set up storage pools. In most cases, you should let MapR calculate storage pools using the default stripe width of two or three disks. If you anticipate a high volume of random-access I/O, you can use the -W option with disksetup to specify larger storage pools of up to 8 disks each.
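As a sketch of the wider-stripe setup described above (the device names and disk-list path are illustrative; confirm disksetup's options for your MapR version before running it on a node):

```shell
# Hypothetical disk list: one raw device per line, disks not used by the OS.
cat > /tmp/disks.txt <<'EOF'
/dev/sdb
/dev/sdc
/dev/sdd
/dev/sde
/dev/sdf
EOF

# Format the listed disks for MapR-FS, grouping up to 5 disks per
# storage pool via -W (the default stripe width is 2-3 disks).
/opt/mapr/server/disksetup -W 5 -F /tmp/disks.txt
```

Note that disksetup destroys any existing data on the listed disks, so the disk list should contain only devices dedicated to MapR-FS.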
Setting Up MapR NFS
MapR uses version 3 of the NFS protocol. NFS version 4 bypasses the port mapper and attempts to connect only to the default port. If you are running NFS on a non-standard port, mounts from NFS version 4 clients time out. Use the -o nfsvers=3 mount option to specify NFS version 3.
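For example, mounting a MapR NFS export from a Linux client while forcing protocol version 3 might look like the following (the cluster hostname, export path, and mount point are illustrative):

```shell
# Force NFS version 3 so the client consults the port mapper
# instead of NFSv4's fixed default port.
sudo mkdir -p /mapr
sudo mount -o nfsvers=3 mapr-node1:/mapr /mapr
```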
NIC Configuration
For high performance clusters, use more than one network interface card (NIC) per node. MapR can detect multiple IP addresses on each node and load-balance throughput automatically.
Isolating CLDB Nodes
In a large cluster (100 nodes or more) create CLDB-only nodes to ensure high performance. This configuration also provides additional control over the placement of the CLDB data, for load balancing, fault tolerance, or high availability (HA). Setting up CLDB-only nodes involves restricting the CLDB volume to its own topology and making sure all other volumes are on a separate topology. Because both the CLDB-only path and the non-CLDB path are children of the root topology path, new non-CLDB volumes are not guaranteed to keep off the CLDB-only nodes. To avoid this problem, set a default volume topology. See Setting Default Volume Topology.
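A sketch of the restriction steps described above (the server ID and topology names are placeholders; verify the maprcli syntax and the CLDB volume name against your release's documentation):

```shell
# Move each CLDB node into its own topology (server ID is a placeholder).
maprcli node move -serverids 5478192499973130150 -topology /cldbonly

# Restrict the CLDB volume to the CLDB-only topology.
maprcli volume move -name mapr.cldb.internal -topology /cldbonly

# Set a default volume topology so new volumes stay off the CLDB-only nodes.
maprcli config save -values '{"cldb.default.volume.topology":"/data"}'
```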
Isolating ZooKeeper Nodes
For large clusters (100 nodes or more), isolate the ZooKeeper on nodes that do not perform any other function. Isolating the ZooKeeper node enables the node to perform its functions without competing for resources with other processes. Installing a ZooKeeper-only node is similar to any typical node installation, but with a specific subset of packages.
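On a Red Hat-style system, a ZooKeeper-only node installation might look like the following (package and script names follow standard MapR packaging; confirm them against your repository, and note the host lists are illustrative):

```shell
# Install only the ZooKeeper package; do not install mapr-fileserver,
# mapr-cldb, or other service packages on this node.
yum install -y mapr-zookeeper

# Run configure.sh as usual, listing this node among the -Z hosts.
/opt/mapr/server/configure.sh -C cldb1,cldb2 -Z zk1,zk2,zk3
```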
Setting Up RAID on the Operating System Partition
You can set up RAID on the operating system partition(s) or drive(s) at installation time to provide higher operating system performance (RAID 0), disk mirroring for failover (RAID 1), or both (RAID 10). See the RAID setup instructions on your operating system vendor's website.
ExpressLane
MapR provides an express path (called ExpressLane) that works in conjunction with the Fair Scheduler. ExpressLane lets small MapReduce jobs run when all slots are occupied by long-running tasks. Small jobs are given this special treatment only when the cluster is busy, and only if they meet the criteria specified by the following parameters in mapred-site.xml:
Parameter | Value | Description
---|---|---
mapred.fairscheduler.smalljob.schedule.enable | true | Enables small-job fast scheduling inside the Fair Scheduler. TaskTrackers reserve a slot, called an ephemeral slot, that is used for small jobs when the cluster is busy.
mapred.fairscheduler.smalljob.max.maps | 10 | Small job definition: maximum number of map tasks allowed in a small job.
mapred.fairscheduler.smalljob.max.reducers | 10 | Small job definition: maximum number of reduce tasks allowed in a small job.
mapred.fairscheduler.smalljob.max.inputsize | 10737418240 | Small job definition: maximum input size, in bytes, allowed for a small job. Default is 10 GB.
mapred.fairscheduler.smalljob.max.reducer.inputsize | 1073741824 | Small job definition: maximum estimated input size per reducer allowed in a small job. Default is 1 GB per reducer.
mapred.cluster.ephemeral.tasks.memory.limit.mb | 200 | Small job definition: maximum memory, in MB, reserved for an ephemeral slot. Default is 200 MB. This value must be the same on JobTracker and TaskTracker nodes.
MapReduce jobs that appear to fit the small job definition but are in fact larger than anticipated are killed and re-queued for normal execution.
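Taken together, the parameters above might appear in mapred-site.xml as follows (values shown are the defaults from the table):

```xml
<property>
  <name>mapred.fairscheduler.smalljob.schedule.enable</name>
  <value>true</value>
</property>
<property>
  <name>mapred.fairscheduler.smalljob.max.maps</name>
  <value>10</value>
</property>
<property>
  <name>mapred.fairscheduler.smalljob.max.inputsize</name>
  <value>10737418240</value>
</property>
```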
HBase
- The HBase write-ahead log (WAL) writes many tiny records, and compressing it would cause massive CPU load. Before using HBase, turn off MapR compression for directories in the HBase volume (normally mounted at /hbase). Example:

  hadoop mfs -setcompression off /hbase
- You can check whether compression is turned off in a directory or mounted volume by using hadoop mfs to list the file contents. Example:

  hadoop mfs -ls /hbase

  The letter Z in the output indicates compression is turned on; the letter U indicates compression is turned off. See hadoop mfs for more information.
- On any node where you plan to run both HBase and MapReduce, give more memory to the FileServer than to the RegionServer so that the node can handle high throughput. For example, on a node with 24 GB of physical memory, it might be desirable to limit the RegionServer to 4 GB, give 10 GB to MapR-FS, and give the remainder to TaskTracker. To change the memory allocated to each service, edit the /opt/mapr/conf/warden.conf file. See Resource Allocation for Jobs and Applications for more information.
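For the 24 GB example above, the relevant warden.conf entries might look like the following. The parameter names follow MapR's service.command.&lt;service&gt;.heapsize convention, but treat the exact keys and percentages as assumptions and check them against the warden.conf shipped with your installation:

```properties
# /opt/mapr/conf/warden.conf (excerpt) -- illustrative values only
service.command.mfs.heapsize.percent=40       # roughly 10 GB of 24 GB to MapR-FS
service.command.hbregion.heapsize.percent=17  # roughly 4 GB to the HBase RegionServer
```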