Replicating to a MapR Cluster that is also Indexing Data in Elasticsearch

You can also replicate from a MapR cluster to a cluster that is indexing data in Elasticsearch. Use the maprcli cluster gateway set command if you want to use one subset of your gateways for table replication and the other subset for indexing.

For example, suppose that you want to replicate from the source MapR cluster sanfrancisco to the destination MapR cluster newyork. You also want to index table data in the Elasticsearch cluster newyork_es. You envision a configuration like this one:

You configure four gateways in the cluster newyork, planning to use two for table replication and two for indexing, as depicted in the diagram of your intended configuration. For indexing, you plan to place the two gateways physically in the Elasticsearch cluster, though logically they will be part of the cluster newyork.

However, as the next diagram shows, table replication will actually use all four of the gateways, and so will indexing.

This configuration will not necessarily degrade the performance of the Elasticsearch nodes where the gateways are located. However, if you do notice a slowdown as you test your configuration, you could try partitioning the gateways by using the maprcli cluster gateway set command. When you run this command on the sanfrancisco cluster, specify only gateways A and B. When you run this command on the newyork cluster, specify only gateways C and D. Because each command names only a subset of the gateways on the newyork cluster, each gateway handles either traffic for table replication or traffic for indexing, but not both.
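The partitioning described above might be sketched as follows. This is only an illustration under stated assumptions: the gateway hostnames are hypothetical placeholders for gateways A through D, and the sketch assumes that both commands name newyork as the -dstcluster value because all four gateways logically belong to that cluster. The script only prints the two commands so that you can review them before running each one on the appropriate cluster.

```shell
#!/bin/sh
# Hypothetical hostnames standing in for gateways A-D; substitute your own.
REPLICATION_GATEWAYS="gw-a.example.com gw-b.example.com"
INDEXING_GATEWAYS="gw-c.example.com gw-d.example.com"

# Print (do not run) a maprcli cluster gateway set command.
# $1 = value for -dstcluster (assumed here to be the cluster that owns the gateways)
# $2 = space-separated list of gateway hosts
gateway_set_cmd() {
    printf 'maprcli cluster gateway set -dstcluster %s -gateways "%s"\n' "$1" "$2"
}

# Review the output, then run it on a node of the sanfrancisco cluster
# so that table replication uses only gateways A and B.
gateway_set_cmd newyork "$REPLICATION_GATEWAYS"

# Review the output, then run it on a node of the newyork cluster
# so that indexing uses only gateways C and D.
gateway_set_cmd newyork "$INDEXING_GATEWAYS"
```

After running the commands, retest the configuration to check whether the slowdown on the Elasticsearch nodes has eased.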