Disk Space Balancer

The disk space balancer evens out disk space usage across a cluster by moving containers between nodes in the same topology, so that the percentage of space used is similar on all the disks in the cluster. It moves containers to storage pools on other nodes that have lower utilization than the cluster average. The balancer checks every storage pool at regular intervals and moves containers off a storage pool when that pool's utilization meets the following conditions:

  • The storage pool is over 70% full.
  • The storage pool's utilization exceeds the average utilization on the cluster by a specified threshold:
    • When the average cluster storage utilization is below 80%, the threshold is 10%.
    • When the average cluster storage utilization is between 80% and 90%, the threshold is 3%.
    • When the average cluster storage utilization is between 90% and 94%, the threshold is 2%.
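
The conditions above can be sketched as a small Python function. This is an illustrative reading of the rules, not the balancer's actual implementation. The function names are hypothetical, the thresholds are treated as percentage points, both conditions are assumed to be required together, and the handling of the exact boundary values (80% and 90%) is a guess. The list gives no threshold for an average utilization of 94% or more, so the sketch returns no threshold in that case.

```python
from typing import Optional


def imbalance_threshold(avg_cluster_pct: float) -> Optional[float]:
    """Return the threshold (in percentage points) for a given cluster average.

    Returns None when the documented rules do not state a threshold.
    Boundary handling at exactly 80 and 90 is an assumption.
    """
    if avg_cluster_pct < 80:
        return 10.0
    if avg_cluster_pct < 90:
        return 3.0
    if avg_cluster_pct < 94:
        return 2.0
    return None  # not specified for 94% and above


def is_move_candidate(pool_pct: float, avg_cluster_pct: float) -> bool:
    """Check whether a storage pool meets both documented conditions."""
    threshold = imbalance_threshold(avg_cluster_pct)
    if threshold is None:
        return False  # assumption: no documented rule, so no move
    over_seventy = pool_pct > 70
    above_average = (pool_pct - avg_cluster_pct) > threshold
    return over_seventy and above_average
```

For example, a pool at 85% in a cluster averaging 60% qualifies under this reading, while a pool at 65% never does because it is not over 70% full.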

You can view the disk usage of all the nodes in a cluster from the Disks view in the MapR Control System (MCS). In the Navigation pane of the MCS, click Cluster > Nodes and then select Disks from the dropdown.

Disk Space Balancer Status

You can use the maprcli dump balancerinfo command to view detailed information about the storage pools on a cluster.

Example:
# maprcli dump balancerinfo
usedMB  fsid                 spid                              percentage  outTransitMB  inTransitMB  capacityMB
209     5567847133641152120  01f8625ba1d15db7004e52b9570a8ff3  1           0             0            15200
209     1009596296559861611  816709672a690c96004e52b95f09b58d  1           0             0            15200

If there are any active container moves when you run the command, maprcli dump balancerinfo returns information about the source and destination storage pools:
# maprcli dump balancerinfo -json
....
{
                       "containerid":7840,
                       "sizeMB":15634,
                       "From fsid":8081858704500413174,
                       "From IP:Port":"10.50.60.64:5660-",
                       "From SP":"9e649bf0ac6fb9f7004fa19d200abcde",
                       "To fsid":3770844641152008527,
                       "To IP:Port":"10.50.60.73:5660-",
                       "To SP":"fefcc342475f0286004fad963f0fghij"
               }
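
For scripting, the tabular output shown in the first example of this section can be turned into records. The following is a minimal Python sketch, assuming the output is whitespace-separated with a single header row, as in that example. parse_balancerinfo is a hypothetical helper, not part of maprcli, and the sample text is copied from the example rather than read from a live cluster.

```python
# Columns from the example output that hold integer values.
NUMERIC_COLUMNS = ("usedMB", "percentage", "outTransitMB",
                   "inTransitMB", "capacityMB")

# Sample text copied from the example above (not live output).
SAMPLE = """\
usedMB  fsid                 spid                              percentage  outTransitMB  inTransitMB  capacityMB
209     5567847133641152120  01f8625ba1d15db7004e52b9570a8ff3  1           0             0            15200
209     1009596296559861611  816709672a690c96004e52b95f09b58d  1           0             0            15200
"""


def parse_balancerinfo(text):
    """Parse whitespace-separated table text into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip lines that do not match the header row
        row = dict(zip(header, fields))
        for key in NUMERIC_COLUMNS:
            if key in row:
                row[key] = int(row[key])
        rows.append(row)
    return rows


if __name__ == "__main__":
    for pool in parse_balancerinfo(SAMPLE):
        print(pool["spid"], pool["percentage"], pool["capacityMB"])
```

The fsid and spid values are kept as strings because they are identifiers rather than quantities.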

Disk Space Balancer Metrics

You can use the maprcli dump balancermetrics command to see the cumulative number of container moves and the amount of data moved (in MB) between storage pools since the current CLDB became the master CLDB.

Example:
# maprcli dump balancermetrics -json
{
    "timestamp":1337770325979,
    "status":"OK",
    "total":1,
    "data":[
        {
            "numContainersMoved":10090,
            "numMBMoved":3147147,
            "timeOfLastMove": "Wed May 23 03:51:44 PDT 2012"
        }
    ]
}
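
The JSON output can be summarized in a script. The following is a minimal Python sketch, assuming the output has the structure shown in the example above. summarize_balancer_metrics is a hypothetical helper, the average size per move is a derived figure that the command itself does not report, and the sample text is copied from the example rather than read from a live cluster.

```python
import json

# Sample text copied from the example above (not live output).
SAMPLE = """
{
    "timestamp":1337770325979,
    "status":"OK",
    "total":1,
    "data":[
        {
            "numContainersMoved":10090,
            "numMBMoved":3147147,
            "timeOfLastMove": "Wed May 23 03:51:44 PDT 2012"
        }
    ]
}
"""


def summarize_balancer_metrics(text):
    """Extract move counts from balancermetrics JSON and derive an average."""
    doc = json.loads(text)
    if doc.get("status") != "OK" or not doc.get("data"):
        raise ValueError("unexpected balancermetrics output")
    entry = doc["data"][0]
    moved = entry["numContainersMoved"]
    mb_moved = entry["numMBMoved"]
    # Guard against division by zero when no moves have happened yet.
    average = mb_moved / moved if moved else 0.0
    return {
        "containers_moved": moved,
        "mb_moved": mb_moved,
        "avg_mb_per_move": average,
        "time_of_last_move": entry.get("timeOfLastMove"),
    }


if __name__ == "__main__":
    print(summarize_balancer_metrics(SAMPLE))
```

Because the counters restart when a different CLDB becomes the master, any figure derived this way covers only the period since that change.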