Best Practices

Disk Setup

It is not necessary to set up RAID (Redundant Array of Independent Disks) on disks used by MapR-FS. MapR uses a script called disksetup to set up storage pools. In most cases, you should let MapR calculate storage pools using the default stripe width of two or three disks. If you anticipate a high volume of random-access I/O, you can use the -W option with disksetup to specify larger storage pools of up to 8 disks each.
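As a minimal sketch of the -W option described above, the following snippet assembles a disksetup invocation with a stripe width of 8 and prints it for review instead of running it. The script location /opt/mapr/server/disksetup, the -F option for passing a disk list file, and the file name /tmp/disks.txt are assumptions that do not come from this page; check them against your installation.

```shell
# Hypothetical file listing the disks to hand over to MapR-FS, one per line.
DISK_LIST=/tmp/disks.txt
# Larger storage pools for heavy random-access I/O (up to 8 disks each).
STRIPE_WIDTH=8

# Assumed script path and -F option; only -W comes from the text above.
disksetup_cmd="/opt/mapr/server/disksetup -W ${STRIPE_WIDTH} -F ${DISK_LIST}"

# Print the command so it can be reviewed before it is run as root.
echo "${disksetup_cmd}"
```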

Setting Up MapR NFS

MapR uses version 3 of the NFS protocol. NFS version 4 bypasses the port mapper and attempts to connect to the default port only, so if you are running NFS on a non-standard port, mounts from NFS version 4 clients time out. To avoid this, use the -o nfsvers=3 mount option to specify NFS version 3.
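A mount command that forces NFS version 3 might look like the following sketch, which prints the command rather than running it. The host name nfs-node.example.com, the export path /mapr, and the mount point /mapr are placeholders chosen for illustration.

```shell
# Hypothetical NFS server and paths; substitute your own values.
NFS_SERVER=nfs-node.example.com
EXPORT_PATH=/mapr
MOUNT_POINT=/mapr

# nfsvers=3 keeps the client from attempting an NFS version 4 connection.
mount_cmd="mount -o nfsvers=3 ${NFS_SERVER}:${EXPORT_PATH} ${MOUNT_POINT}"

# Print the command so it can be reviewed before it is run as root.
echo "${mount_cmd}"
```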

NIC Configuration

For high-performance clusters, use more than one network interface card (NIC) per node. MapR detects multiple IP addresses on each node and load-balances throughput across them automatically.

Isolating CLDB Nodes

In a large cluster (100 nodes or more), create CLDB-only nodes to ensure high performance. This configuration also provides additional control over the placement of the CLDB data, for load balancing, fault tolerance, or high availability (HA). Setting up CLDB-only nodes involves restricting the CLDB volume to its own topology and making sure all other volumes are on a separate topology. Because both the CLDB-only path and the non-CLDB path are children of the root topology path, new non-CLDB volumes are not guaranteed to stay off the CLDB-only nodes. To avoid this problem, set a default volume topology. See Setting Default Volume Topology.
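The steps above can be sketched as a short command plan. The snippet below only prints the commands. The topology paths /cldbonly and /defaultRack and the server ID placeholder are hypothetical, and the maprcli subcommands, the volume name mapr.cldb.internal, and the cldb.default.volume.topology setting are assumptions that should be confirmed in Setting Default Volume Topology and the maprcli reference before use.

```shell
# Hypothetical topology paths and server IDs; replace with your own.
CLDB_TOPOLOGY=/cldbonly
DEFAULT_TOPOLOGY=/defaultRack
CLDB_SERVER_IDS="<CLDB server IDs>"

# 1. Move the CLDB nodes to their own topology.
# 2. Restrict the CLDB volume to that topology.
# 3. Set a default topology so new volumes stay off the CLDB-only nodes.
plan=$(cat <<EOF
maprcli node move -serverids ${CLDB_SERVER_IDS} -topology ${CLDB_TOPOLOGY}
maprcli volume move -name mapr.cldb.internal -topology ${CLDB_TOPOLOGY}
maprcli config save -values '{"cldb.default.volume.topology":"${DEFAULT_TOPOLOGY}"}'
EOF
)

# Print the plan for review; nothing is executed.
echo "${plan}"
```

Existing non-CLDB nodes and volumes would also need to be moved to the separate topology, which this sketch leaves out.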

Isolating ZooKeeper Nodes

For large clusters (100 nodes or more), isolate ZooKeeper on nodes that do not perform any other function. An isolated ZooKeeper node can perform its functions without competing with other processes for resources. Installing a ZooKeeper-only node is similar to a typical node installation, but with a specific subset of packages.

WARNING: To prevent MapR from using an isolated ZooKeeper node for data storage, do not install the FileServer package on that node.

Setting Up RAID on the Operating System Partition

You can set up RAID on the operating system partition(s) or drive(s) at installation time to provide, for example, higher operating system performance (RAID 0), disk mirroring for failover (RAID 1), or both (RAID 10). For setup instructions, see the documentation for your operating system.

ExpressLane

MapR provides an express path (called ExpressLane) that works in conjunction with the Fair Scheduler. ExpressLane lets small MapReduce jobs run even when all slots are occupied by long tasks. Small jobs receive this special treatment only when the cluster is busy, and only if they meet the criteria specified by the following parameters in mapred-site.xml:

| Parameter | Value | Description |
|---|---|---|
| mapred.fairscheduler.smalljob.schedule.enable | true | Enables small job fast scheduling inside the Fair Scheduler. TaskTrackers should reserve a slot, called the ephemeral slot, that is used for small jobs when the cluster is busy. |
| mapred.fairscheduler.smalljob.max.maps | 10 | Small job definition. Maximum number of maps allowed in a small job. |
| mapred.fairscheduler.smalljob.max.reducers | 10 | Small job definition. Maximum number of reducers allowed in a small job. |
| mapred.fairscheduler.smalljob.max.inputsize | 10737418240 | Small job definition. Maximum input size, in bytes, allowed for a small job. Default is 10 GB. |
| mapred.fairscheduler.smalljob.max.reducer.inputsize | 1073741824 | Small job definition. Maximum estimated input size, in bytes, for a reducer in a small job. Default is 1 GB per reducer. |
| mapred.cluster.ephemeral.tasks.memory.limit.mb | 200 | Small job definition. Maximum memory, in MB, reserved for an ephemeral slot. Default is 200 MB. This value must be the same on JobTracker and TaskTracker nodes. |

MapReduce jobs that appear to fit the small job definition but are in fact larger than anticipated are killed and re-queued for normal execution.
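As a rough illustration of the small job definition, the following shell sketch checks a job's figures against the default values listed above. The function name is_small_job and the sample figures are hypothetical, and treating each limit as an inclusive maximum is an assumption; the Fair Scheduler performs its own evaluation internally.

```shell
# Limits copied from the default parameter values listed above.
MAX_MAPS=10
MAX_REDUCERS=10
MAX_INPUT_BYTES=10737418240          # 10 GB
MAX_REDUCER_INPUT_BYTES=1073741824   # 1 GB per reducer

# is_small_job MAPS REDUCERS INPUT_BYTES REDUCER_INPUT_BYTES
# Hypothetical helper: returns 0 when every figure is within its limit.
is_small_job() {
  [ "$1" -le "$MAX_MAPS" ] &&
  [ "$2" -le "$MAX_REDUCERS" ] &&
  [ "$3" -le "$MAX_INPUT_BYTES" ] &&
  [ "$4" -le "$MAX_REDUCER_INPUT_BYTES" ]
}

# Sample job: 4 maps, 2 reducers, 512 MB of input, 256 MB per reducer.
if is_small_job 4 2 536870912 268435456; then
  echo "within the small job limits"
else
  echo "outside the small job limits"
fi
```

A job that passes this kind of check on its estimates can still turn out larger at run time, which is the case the re-queue behavior described above handles.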

HBase

  • The HBase write-ahead log (WAL) writes many tiny records, and compressing it would cause massive CPU load. Before using HBase, turn off MapR compression for directories in the HBase volume (normally mounted at /hbase). Example:

    hadoop mfs -setcompression off /hbase
  • You can check whether compression is turned off in a directory or mounted volume by using hadoop mfs to list its contents. Example:

    hadoop mfs -ls /hbase

    The letter Z in the output indicates compression is turned on; the letter U indicates compression is turned off. See hadoop mfs for more information.

  • On any node where you plan to run both HBase and MapReduce, give more memory to the FileServer than to the RegionServer so that the node can handle high throughput. For example, on a node with 24 GB of physical memory, it might be desirable to limit the RegionServer to 4 GB, give 10 GB to MapR-FS, and give the remainder to TaskTracker. To change the memory allocated to each service, edit the /opt/mapr/conf/warden.conf file. See Resource Allocation for Jobs and Applications for more information.
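The memory split in the 24 GB example above works out as follows. This is only the arithmetic from that example, expressed in MB; the variable names are illustrative and are not warden.conf settings, and the remainder is an upper bound because the operating system and any other services on the node also need memory.

```shell
# Figures from the 24 GB example above, converted to MB.
TOTAL_MB=$((24 * 1024))
REGIONSERVER_MB=$((4 * 1024))
MFS_MB=$((10 * 1024))

# Whatever is left after the RegionServer and MapR-FS is the TaskTracker ceiling.
TASKTRACKER_MB=$((TOTAL_MB - REGIONSERVER_MB - MFS_MB))

echo "RegionServer=${REGIONSERVER_MB} MB, MapR-FS=${MFS_MB} MB, TaskTracker=${TASKTRACKER_MB} MB"
```

With these figures the remainder is 10240 MB, so MapR-FS and TaskTracker each get at most 10 GB while the RegionServer stays at 4 GB.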