Example 1

In this example, each node of a 10-node Enterprise Edition cluster has the following characteristics:

  • 16 cores
  • 32 GB memory
  • 12 drives

In theory, 16 cores can run up to 28 map slots (2 on each of 14 cores, with 2 cores reserved for the operating system and MapR-FS). In practice, memory is the limit: after the operating system and MapR-FS take their share, approximately 24 GB is left for MapReduce, and 28 map slots would leave essentially none of it for reduce slots. The following settings balance the two:

  • Set the chunk size to 256 MB (unless you are using application-level compression).
  • For that chunk size, set io.sort.mb to 380 MB.
  • Set map task memory to 800 MB by adding -Xmx800m to mapred.map.child.java.opts.
  • Configure 14 map slots (14 × 800 MB = 11.2 GB of memory required).
  • Use the rest of the memory for reducers:
    • 24 GB - 11.2 GB = 12.8 GB
    • 12.8 GB / 3.5 GB per reducer = approximately 4 reducers

To improve the ratio of mappers to reducers, consider 10 mappers and 5 reducers instead.
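
The arithmetic in this example can be checked with a short script. This is only a sketch: the function and constant names are illustrative, not MapR settings, and it assumes the figures used above (24 GB available to MapReduce, 800 MB per map task, 3.5 GB per reducer, and 1 GB counted as 1000 MB).

```python
# Sketch of the slot arithmetic from Example 1.
# All names below are illustrative; they are not MapR or Hadoop parameters.

MAPREDUCE_MEMORY_GB = 24.0  # memory left after the OS and MapR-FS take their share
MAP_TASK_GB = 0.8           # one map task at -Xmx800m (1 GB counted as 1000 MB)
REDUCE_TASK_GB = 3.5        # per-reducer figure used in the example


def memory_left_for_reducers(map_slots: int) -> float:
    """Return the memory (GB) remaining after the map slots take their share."""
    return MAPREDUCE_MEMORY_GB - map_slots * MAP_TASK_GB


def reducer_estimate(map_slots: int) -> float:
    """Return the unrounded number of reducers that fit in the remaining memory."""
    return memory_left_for_reducers(map_slots) / REDUCE_TASK_GB


if __name__ == "__main__":
    # 28 = theoretical maximum, 14 = the example, 10 = the suggested alternative.
    for map_slots in (28, 14, 10):
        used = map_slots * MAP_TASK_GB
        left = memory_left_for_reducers(map_slots)
        estimate = reducer_estimate(map_slots)
        print(f"{map_slots} map slots: {used:.1f} GB for maps, "
              f"{left:.1f} GB left, about {estimate:.2f} reducers")
```

The unrounded estimates are about 3.66 reducers for 14 map slots and about 4.57 for 10 map slots, which the example rounds to 4 and 5.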