Example 5
In this example, each node of a 1000-node cluster has the following characteristics:
- 4 cores
- 32 GB memory
- 2 drives
This hardware has too few cores to achieve much parallelism, but it has plenty of memory. Because the nodes are memory-heavy and core-light, give as much memory as possible to MapR-FS; the map output is then likely to fit in memory, which reduces disk I/O. Recommended settings:
- 4 map slots
- 1 reduce slot
- Chunk size: 256 MB
- Remaining 16 GB of memory given to MapR-FS
- Set mapred.reduce.slowstart.completed.maps to 0
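
The slot and slow-start settings above can be sketched as a mapred-site.xml fragment. The short Python script below generates one. It is only an illustration under stated assumptions: the two tasktracker property names are the standard Hadoop MRv1 per-node slot limits and are not named in this example, so check them against your MapR version. The helper names (SETTINGS, build_configuration, render) are hypothetical. The script also converts the 256 MB chunk size to bytes, which is just arithmetic.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Settings taken from the example. Only
# mapred.reduce.slowstart.completed.maps is named in the text; the two
# tasktracker property names are ASSUMED to be the standard MRv1 slot limits.
SETTINGS = {
    "mapred.tasktracker.map.tasks.maximum": "4",     # 4 map slots
    "mapred.tasktracker.reduce.tasks.maximum": "1",  # 1 reduce slot
    "mapred.reduce.slowstart.completed.maps": "0",
}

# 256 MB chunk size expressed in bytes.
CHUNK_SIZE_BYTES = 256 * 1024 * 1024


def build_configuration(settings):
    """Return a <configuration> element with one <property> per setting."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


def render(settings):
    """Serialize the settings as an indented XML string."""
    raw = ET.tostring(build_configuration(settings))
    return minidom.parseString(raw).toprettyxml(indent="  ")


if __name__ == "__main__":
    print(render(SETTINGS))
    print("chunk size in bytes:", CHUNK_SIZE_BYTES)
```

The memory given to MapR-FS is not part of this fragment; it is configured separately from the MapReduce properties.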