Setting Up Multi-Master Replication Automatically
You can run a command to instruct MapR-DB to set up multi-master replication from an existing source table.
Prerequisites
- Configure one or more gateways in the destination cluster. See MapR Gateways.
- If the source and destination clusters are secured, set up security for replication between the clusters. See Configuring MapR Clusters for Replication Between Tables.
- Run the maprcli table info command on the source table to verify that you have the following permissions:
  - readperm, which is required for reading from the table.
  - replperm, which is required for replicating from the table.
- Run the maprcli table info command on the destination table (if it already exists) to verify that you have the following permissions:
  - bulkload, which is required for the initial copy of source data into the destination table.
  - replperm, which is required for receiving replicated updates from the source table.
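The permission check in the prerequisites can be narrowed to the relevant lines with a small filter. This is a minimal sketch: filter_perms is a hypothetical helper, the field names are taken from the list above, and the exact layout of the maprcli table info output is an assumption, so the filter matches on substrings only.

```shell
#!/bin/sh
# Sketch: keep only the output lines that mention the permissions named in
# the prerequisites. filter_perms is a hypothetical helper that reads command
# output on stdin; the output layout of "maprcli table info" is assumed.
filter_perms() {
    grep -E 'readperm|replperm|bulkload'
}

# Hypothetical usage on a cluster node (the table path is a placeholder):
#   maprcli table info -path /mapr/sanfrancisco/customers -json | filter_perms
```

Run the same filter against the destination table if it already exists.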
Procedure
- Log in to both the source and destination clusters.
- Run the maprcli table replica autosetup command, which performs these steps for you:
  - Create a table on the destination cluster. This table has the same column families as the source table.
  - Declare the new table to be a replica of the source table.
  - Declare the source table as an upstream source for the replica.
  - Load a copy of the source data into the replica.
  - Declare the source table to be a replica of the new table.
  - Declare the new table to be an upstream source for the source table.
  - Start replication.
Results
As an alternative to maprcli, you can set up this replication topology in the MapR Control System (MCS). Log in to MCS and select MapR Tables in the navigation menu. Select a table to be the source table and click the Replicas tab. The actions for setting up replication are in this location.
Example
maprcli table replica autosetup -path <path to source table> -replica <path to replica> -multimaster yes
The -multimaster parameter is an optional parameter that you use to set up multi-master replication.
For example, to set up replication between the customers table in the sanfrancisco cluster and a new customers table in the newyork cluster, you could use this command:
maprcli table replica autosetup -path /mapr/sanfrancisco/customers -replica /mapr/newyork/customers -multimaster yes
To set up replication between the customersA table in the sanfrancisco cluster and a new customersB table in the same cluster, you could use this command:
maprcli table replica autosetup -path /mapr/sanfrancisco/customersA -replica /mapr/sanfrancisco/customersB -multimaster yes
- -columns
  The value is a comma-separated list of items. Each item is either <column family> or <column family>:<column>.
  For example, to replicate only the column family purchases and the column stars in the reviews column family, the value would look like this:
  -columns purchases,reviews:stars
- -synchronous
  This parameter specifies whether replication is synchronous or asynchronous. Asynchronous is the default. The values are yes for synchronous and no for asynchronous.
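To see how the optional parameters combine with the basic command, the following sketch assembles an autosetup command line and prints it without running it. build_autosetup is a hypothetical helper, not part of maprcli; the flags it emits are the ones described above.

```shell
#!/bin/sh
# Sketch: assemble (but do not run) an autosetup command line.
# build_autosetup is a hypothetical helper that only prints the command.
# Arguments: source path, replica path, optional -columns value,
# optional -synchronous value (yes or no).
build_autosetup() {
    src=$1
    replica=$2
    columns=${3:-}
    synchronous=${4:-}
    cmd="maprcli table replica autosetup -path $src -replica $replica -multimaster yes"
    if [ -n "$columns" ]; then
        cmd="$cmd -columns $columns"
    fi
    if [ -n "$synchronous" ]; then
        cmd="$cmd -synchronous $synchronous"
    fi
    printf '%s\n' "$cmd"
}

# Replicate only the purchases column family and the reviews:stars column,
# using synchronous replication.
build_autosetup /mapr/sanfrancisco/customers /mapr/newyork/customers purchases,reviews:stars yes
```

Printing the command first makes it easy to review the paths and column list before running it on a cluster.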
What to do next
If one of the tables goes offline, direct client applications to the other table. When the offline table comes back online, replication between the two tables continues automatically. When both tables are back in synch, you can redirect client applications to the original table.
For example, suppose that client applications are using the customers table that is in the cluster sanfrancisco. The customers table in the sanfrancisco cluster becomes unavailable, so you redirect those client applications to the customers table in the newyork cluster.
After the customers table in the sanfrancisco cluster comes back online, replication back to it starts immediately. Because client applications are not yet using this table, there are no updates to replicate to the table in the newyork cluster.
When the customers table in the sanfrancisco cluster is in synch with the other table, you can redirect your client applications back to it.
Be aware that changes to the structure of a source table are not replicated automatically to replicas. For example, if a new column family is added to the source table and the entire table is being replicated (that is, the maprcli table replica add command did not specify column families or columns to replicate), the new column family is not automatically created at the replica.
If the entire source table is being replicated, you only need to add the new column family to the replica; updates to the new column family then start being replicated immediately, and you do not need to carry out the steps below. Continue only if you are replicating a subset of column families and columns.
If you are replicating a subset of column families and columns, follow these steps to add a new column family to the replica:
- Pause replication by running the maprcli table replica pause command.
- Add the new column family to the replica by running the maprcli table replica edit command.
- Copy the data for this column family from the source table into the replica by using the CopyTable utility. Use the -columns parameter to specify the name of the column family.
- Resume replication by running the maprcli table replica resume command.
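The four steps above can be sketched as a dry run that prints each command in order without executing anything. plan_add_family is a hypothetical helper, and the -path, -replica, and -columns arguments shown for pause, edit, and resume are assumptions; confirm them against the maprcli reference for each command. The CopyTable step is left as a reminder line because its invocation is not given here.

```shell
#!/bin/sh
# Sketch: print the steps for adding a column family to a replica that
# receives only a subset of the source table. Nothing is executed.
# plan_add_family is a hypothetical helper. The -path/-replica/-columns
# arguments to pause, edit, and resume are assumptions to verify.
plan_add_family() {
    src=$1        # path to the source table
    replica=$2    # path to the replica
    columns=$3    # value to pass to -columns on the edit command (assumed)
    family=$4     # name of the new column family
    echo "maprcli table replica pause -path $src -replica $replica"
    echo "maprcli table replica edit -path $src -replica $replica -columns $columns"
    echo "# run the CopyTable utility from $src to $replica with -columns $family"
    echo "maprcli table replica resume -path $src -replica $replica"
}

# Placeholder paths and column names, for illustration only.
plan_add_family /mapr/sanfrancisco/customers /mapr/newyork/customers purchases,reviews:stars,returns returns
```

Keeping the pause and resume commands in one printed plan makes it harder to forget to resume replication after the copy finishes.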
In MCS, check for alarms related to replication and check whether synchronous replication has switched temporarily to asynchronous replication. See Table-Replication Alarms.