Example 1
On this example 10-node Enterprise Edition cluster, each node has the following characteristics:
- 16 cores
- 32 GB memory
- 12 drives
Although 16 cores can theoretically run up to 28 map slots (2 each on 14 cores, with 2 cores reserved for the operating system and MapR-FS), that would leave no memory for reduce slots. After the operating system and MapR-FS take their share of the memory, approximately 24 GB is left for MapReduce.
- Set the chunk size to 256 MB (unless you are using application-level compression).
- For that chunk size, io.sort.mb is 380 MB.
- Set map task memory to 800 MB by adding -Xmx800m to mapred.map.child.java.opts.
- 14 map slots (11.2 GB memory required)
- Use the rest of memory for reducers:
  - 24 GB - 11.2 GB = 12.8 GB
  - 12.8 GB / 3.5 GB = approximately 4 reducers
To improve the ratio of mappers to reducers, consider 10 mappers and 5 reducers instead.
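The arithmetic in this example can be reproduced with a short script. This is only a sketch: the constant and function names are hypothetical, the 3.5 GB per-reducer figure is taken from the division in the list above, and decimal units (1 GB = 1000 MB) are assumed so that 14 x 800 MB comes out to 11.2 GB as stated.

```python
# Sketch of the slot arithmetic from Example 1.
# Assumption: decimal units (1 GB = 1000 MB), matching 14 * 800 MB = 11.2 GB.
MAPREDUCE_MEMORY_MB = 24_000  # memory left after the OS and MapR-FS
MAP_TASK_MB = 800             # per map task, from -Xmx800m
REDUCE_TASK_MB = 3_500        # per reducer, as used in the example's division


def reducer_budget(map_slots):
    """Return (MB left for reducers, number of reducers that memory covers)."""
    map_memory_mb = map_slots * MAP_TASK_MB
    remaining_mb = MAPREDUCE_MEMORY_MB - map_memory_mb
    return remaining_mb, remaining_mb / REDUCE_TASK_MB


# Compare the theoretical maximum, the example's choice, and the alternative.
for slots in (28, 14, 10):
    remaining_mb, reducers = reducer_budget(slots)
    print(f"{slots} map slots: {remaining_mb / 1000:.1f} GB left, "
          f"about {reducers:.2f} reducers")
```

The reducer counts come out fractional (about 3.66 for 14 map slots and about 4.57 for 10), so the figures of 4 and 5 reducers in the text are rounded values rather than exact fits.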