Configuring MapR Clusters for Table Replication

Configuring clusters to participate in replication topologies involves setting up two or more gateways on each destination cluster and, if the clusters are secure, establishing secure communication between the clusters.

Prerequisites

  • Plan which replication topology you want to use.
  • Ensure that your user ID has the readAce permission on the volume where the source tables are located and the writeAce permission on the volumes where the replicas are located. For information about how to set permissions on volumes, see Setting/Modifying Whole Volume ACEs.
  • Ensure that you have administrative authority on the clusters that you plan to use.
  • In the mapr-clusters.conf file on every node in your source cluster, add an entry that lists the CLDB nodes in the destination cluster. This step is required for using maprcli commands, the MapR Control System (MCS), and the utilities, all of which bypass gateways and communicate directly with the destination cluster's CLDB nodes. See the topic "mapr-clusters.conf" for the format to use for the entries.
  • In the mapr-clusters.conf file on every node in your destination cluster, add an entry that lists the CLDB nodes in the source cluster. This step is required for running maprcli commands, such as when you add upstream sources to replicas or when the destination cluster also serves as a source cluster in bi-directional or multi-master replication. See the topic "mapr-clusters.conf" for the format to use for the entry.
  • If you upgraded your source cluster from a previous version of MapR, enable table replication by running this maprcli command: maprcli cluster feature enable -name mfs.feature.db.repl.support
  • If you plan to use MCS to set up and manage table replication, install a supported version of HBase from http://package.mapr.com/releases/ecosystem-5.x/ on the nodes running MCS in the source cluster. To see which versions are supported, see HBase Support Matrix. If the webserver is running on those nodes after you install HBase, restart it so that it adds the mapr-hbase jars to its classpath.
  • If you plan to use the maprcli table replica autosetup command as you set up your replication topology, install mapr-hbase-1.1.1.201602221251-1.noarch.rpm on the nodes in the source or destination cluster from which you plan to run the command. For information about the maprcli table replica autosetup command, see maprcli table replica autosetup and the procedures in the following topics.
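For reference, an entry in mapr-clusters.conf consists of the cluster name followed by the list of CLDB hosts. The following is an illustrative sketch only (the cluster name, hostnames, and the secure flag are assumptions; see the topic "mapr-clusters.conf" for the authoritative format):

```
destcluster.example.com secure=true cldb1.example.com:7222 cldb2.example.com:7222 cldb3.example.com:7222
```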

Procedure

  1. On the destination cluster, configure gateways through which the source cluster will send updates to the destination cluster. See Configuring a MapR Gateway Master-Slave Topology.
  2. Optional: If your clusters are secure, configure the source cluster so that you can locally run maprcli commands that are executed on the destination cluster. See Running Commands on Remote Secure Clusters.
  3. If your clusters are secure, add a cross-cluster ticket to the source cluster, so that it can replicate data to the destination cluster. See Adding Cross-Cluster Tickets to Secure Clusters.
  4. Optional: If your clusters are secure, configure your source cluster so that you can use the MapR Control System (MCS) to set up and administer table replication from the source to the destination cluster.
    These steps make it convenient to use MCS for setting up and managing replication involving two secure clusters. However, before following them, perform these prerequisite tasks.
    NOTE:
    • Ensure that both clusters are managed by the same team or group. The UIDs and GIDs of the users that are able to log in to MCS on the source cluster must exactly match their UIDs and GIDs on the destination cluster. This restriction applies only to access to both clusters through MCS, and does not apply to access to both clusters through the maprcli. If the clusters are managed by different teams or groups, use the maprcli instead of MCS to set up and manage table replication involving two secure clusters.
    • Ensure that the proper file-system and table permissions are in place on both clusters. Otherwise, any user who can log into MCS and has the same UID or GID on the destination cluster will be able to set up replication either from the source cluster to the destination cluster or vice versa. A user could create one or more tables on the destination cluster, enable replication to them from the source cluster, load the new tables with data from the source cluster, and start replication. A user could also create tables on the source cluster, enable replication to them from tables in the destination cluster, load the new tables with data from the destination cluster, and start replication.
    1. On the source cluster, generate a service ticket by using the maprlogin command:
      maprlogin generateticket -type service -cluster <destination cluster>
      -user mapr -duration <duration> -out <output folder>

      Where -duration is the length of time before the ticket expires. You can specify the value in either of these formats:

      • [Days:]Hours:Minutes
      • Seconds
    2. On every node of the source cluster, append the service ticket to the /opt/mapr/conf/mapruserticket file, which was created when you secured the source cluster:
      cat <path and filename of the service ticket> >> /opt/mapr/conf/mapruserticket
    3. Restart the webserver by running the maprcli node services command. For the syntax of this command, see node services.
    4. Add the following two properties to the core-site.xml file. For Hadoop 2.7.0, edit the file /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/core-site.xml.
      <property>
        <name>hadoop.proxyuser.mapr.hosts</name>
        <value>*</value>
      </property>
      <property>
        <name>hadoop.proxyuser.mapr.groups</name>
        <value>*</value>
      </property>
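The ticket handling in sub-steps 1 through 3 above can be sketched as follows. The paths under /tmp are placeholders so that the append in sub-step 2 can be demonstrated outside a cluster; on a real node the target file is /opt/mapr/conf/mapruserticket, and the maprlogin and maprcli commands (shown as comments, with an illustrative cluster name and a 30-day duration) must run on the source cluster:

```shell
# Sub-step 1 (runs on the source cluster; cluster name and duration are assumptions):
#   maprlogin generateticket -type service -cluster destcluster.example.com \
#       -user mapr -duration 30:0:0 -out /tmp/dest_service_ticket
#
# Sub-step 2: append the service ticket so existing tickets are preserved.
# Placeholder files stand in for the real ticket files here:
mkdir -p /tmp/mapr-demo
printf 'existing-mapruserticket-contents\n' > /tmp/mapr-demo/mapruserticket
printf 'dest-cluster-service-ticket\n' > /tmp/mapr-demo/dest_service_ticket
cat /tmp/mapr-demo/dest_service_ticket >> /tmp/mapr-demo/mapruserticket
cat /tmp/mapr-demo/mapruserticket
#
# Sub-step 3 (runs on the source cluster):
#   maprcli node services -name webserver -action restart -nodes <node names>
```

Appending with `>>` rather than redirecting with `>` is the important detail: the mapruserticket file already holds tickets created when the cluster was secured, and overwriting it would break existing services.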