NFS on an Enterprise Edition Cluster

At cluster installation time, plan which nodes will provide NFS access based on your anticipated traffic. For example, if you need 5Gbps of write throughput and 5Gbps of read throughput, here are a few ways to set up NFS:

  • 12 NFS nodes, each with a single 1GbE connection
  • 6 NFS nodes, each with dual 1GbE connections
  • 4 NFS nodes, each with quad 1GbE connections
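
The arithmetic behind these layouts can be sketched as follows. This is an illustrative sizing calculation only; it assumes line-rate NICs and evenly balanced load, and the variable names are our own:

```shell
# Sizing sketch: minimum nodes = ceil(required aggregate Gbps / per-node Gbps).
# Assumes NICs run at line rate and load is balanced evenly; add headroom
# on top, as the example configurations above do.
required_gbps=10      # 5Gbps write + 5Gbps read
nic_gbps=1            # 1GbE links
nics_per_node=2       # e.g. the dual-1GbE layout
per_node_gbps=$((nic_gbps * nics_per_node))
# Integer ceiling division
min_nodes=$(( (required_gbps + per_node_gbps - 1) / per_node_gbps ))
echo "minimum nodes: $min_nodes"    # prints "minimum nodes: 5"; the
                                    # dual-NIC example uses 6 for headroom
```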

You can also set up NFS on all file server nodes so that each node self-mounts NFS. Self-mounted NFS on each node in a cluster enables you to run native applications as tasks. Alternatively, you can mount NFS on one or more dedicated gateways outside the cluster (using round-robin DNS or a hardware load balancer) to allow controlled access.
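
A self-mount is typically made persistent with an /etc/fstab entry on each node. This is a sketch only: the export path /mapr, the mount point, and the mount options below are assumptions, not taken from the text; adjust them to your cluster's layout:

```shell
# Hypothetical /etc/fstab line for a self-mount: each node mounts the
# cluster through its own local NFS gateway. Export path, mount point,
# and options (hard, nolock) are assumptions; adjust to your cluster.
localhost:/mapr  /mapr  nfs  hard,nolock  0  0
```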

NFS and Virtual IP addresses

You can set up virtual IP addresses (VIPs) for NFS nodes in an Enterprise Edition-licensed MapR cluster, for load balancing or failover. VIPs provide multiple addresses that can be used for round-robin DNS, allowing client connections to be distributed among a pool of NFS nodes. VIPs also enable high-availability (HA) NFS: in an HA NFS system, when an NFS node fails, data requests are satisfied by other NFS nodes in the pool. Use a minimum of one VIP per NFS node per NIC that clients will use to connect to the NFS server. For example, if you have four nodes with four NICs each, with each NIC connected to a separate IP subnet, use a minimum of 16 VIPs and direct clients to the VIPs in round-robin fashion. The VIPs should be in the same IP subnet as the interfaces to which they will be assigned. See Setting Up VIPs for NFS for details on enabling VIPs for your cluster.
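
The minimum VIP count from the rule above (one VIP per NFS node per client-facing NIC) is a simple product, sketched here with the four-node, four-NIC example from the text:

```shell
# Minimum VIPs = NFS nodes x client-facing NICs per node.
# Values match the worked example above: 4 nodes, 4 NICs each.
nfs_nodes=4
nics_per_node=4
min_vips=$((nfs_nodes * nics_per_node))
echo "minimum VIPs: $min_vips"    # prints "minimum VIPs: 16"
```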

Here are a few tips:

  • Set up NFS on at least three nodes if possible.
  • All NFS nodes must be accessible over the network from the machines where you want to mount them.
  • To serve a large number of clients, set up dedicated NFS nodes and load-balance between them. If the cluster is behind a firewall, you can provide access through the firewall via a load balancer instead of direct access to each NFS node. You can run NFS on all nodes in the cluster, if needed.
  • To provide maximum bandwidth to a specific client, install the NFS service directly on the client machine. The NFS gateway on the client manages how data is sent in or read back from the cluster, using all its network interfaces (that are on the same subnet as the cluster nodes) to transfer data via MapR APIs, balancing operations among nodes as needed.
  • Use VIPs to provide high availability (HA) and failover.
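
On the client side, load balancing across the NFS pool usually means mounting through a name that resolves to the VIPs. The hostname nfs.example.com, the export path /mapr, and the mount options below are hypothetical; point the name at your VIP pool so client connections spread across gateways:

```shell
# Client-side mount through a round-robin DNS name (sketch; hostname,
# export path, and options are assumptions, not from the text).
sudo mkdir -p /mapr
sudo mount -t nfs -o hard,nolock nfs.example.com:/mapr /mapr
```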

To add the NFS service to a running cluster, use the instructions in Managing Roles on a Node to install the mapr-nfs package on the nodes where you would like to run NFS.
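
Installing the package is a one-line step per node. The mapr-nfs package name comes from the text; the package-manager commands below assume a RHEL- or Debian-family node:

```shell
# Install the NFS service package on each node that should run NFS
# (see Managing Roles on a Node for the full procedure).
sudo yum install -y mapr-nfs        # RHEL/CentOS
# sudo apt-get install -y mapr-nfs  # Debian/Ubuntu
```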