Configuring a MapR Gateway on an Independent Node for Elasticsearch Indexing

This topic provides instructions for setting up a MapR gateway on a node for the purpose of replicating MapR-DB binary tables to Elasticsearch indexes. This configuration procedure applies to independent gateway nodes and to gateway nodes on an Elasticsearch cluster.

About this task

This procedure applies to a gateway installed on a node that was not previously part of the MapR cluster. Only the gateway services are installed on this new node. However, the node logically becomes part of the MapR cluster, as indicated in the diagram by the dotted line extending from this cluster. Because gateways consume CPU and network resources, placing gateways on dedicated nodes allows for higher throughput rates than placing them on nodes that also run other services.

The following diagram shows an independent gateway node.

Procedure

  1. Install the mapr-gateway package on a node.
    When you install the mapr-gateway package, the following dependent packages are also installed: mapr-core, mapr-hadoop-core, mapr-mapreduce1, mapr-mapreduce2, mapr-core-internal.
  2. Run configure.sh using the following options.
    /opt/mapr/server/configure.sh \
        -C <source cluster cldb list> \
        -Z <source cluster zk list> \
        -u <user> \
        -g <group> \
        -N <source cluster name>
    NOTE: The node is added as a gateway-only node. It is not a client node, so do not use the -c (lowercase c) option in this case.
  3. Start the warden service.
    service mapr-warden start
    NOTE: The gateway is managed by warden. The independent gateway node now behaves like a client node that can communicate with the source cluster. The node is logically part of the source cluster but does not contribute any data services except the gateway.
  4. Ensure that the Elasticsearch cluster is registered with the MapR source cluster. To register Elasticsearch, run the /opt/mapr/bin/register-elasticsearch script on a MapR cluster node. You can run the script on the independent gateway node or on any node in the source cluster. For more information, see Registering Elasticsearch Clusters with MapR Clusters.
    The script copies the Elasticsearch cluster’s configuration file (elasticsearch.yml), JAR files, and plugin JAR files into MapR-FS.
  5. Specify the cluster gateway IP addresses with the maprcli cluster gateway set command as follows, so that the source cluster knows about the gateway running on this node.
    • With the -dstcluster option, set the destination cluster as the source cluster. By doing so, the independent gateway becomes part of the source cluster for purposes of replicating table data.
    • With the -gateways option, the IP addresses that you provide depend on your configuration. If you have additional gateways on the source cluster that you want to use, provide all of the source cluster gateway IP addresses, including that of the independent gateway. If you want to use only the independent gateway, provide only the independent gateway IP address.
    maprcli cluster gateway set \
        -dstcluster <source cluster name> \
        -gateways <IP addresses for source cluster gateways>
    NOTE:
    • If all source cluster gateways are specified for the -gateways option, any (or all) of the gateways can be used to replicate tables to Elasticsearch indexes.
    • If only the independent gateway is specified for the -gateways option, only that gateway will be used to replicate tables to Elasticsearch indexes and to perform other table replication tasks.
    NOTE: This command can be run on any source cluster node, including the independent gateway node.
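
The commands in steps 2, 3, and 5 can be collected into a small dry-run script, which may help when reviewing the values before running anything on the node. The following is only a sketch: it prints the commands rather than executing them, and every host name, user, cluster name, and IP address in it is a hypothetical placeholder, not a value from this topic. The function names are also illustrative. How multiple addresses should be separated for -gateways is not stated here, so the example uses a single address.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from steps 2, 3, and 5 instead of running them.
# All values below are hypothetical placeholders; replace them with your own.
CLDB_LIST="cldb1.example.com"        # source cluster CLDB list (placeholder)
ZK_LIST="zk1.example.com"            # source cluster ZooKeeper list (placeholder)
MAPR_USER="mapr"                     # placeholder user
MAPR_GROUP="mapr"                    # placeholder group
CLUSTER_NAME="source.example.com"    # source cluster name (placeholder)
GATEWAY_IPS="10.10.1.21"             # independent gateway only (placeholder)

# Step 2: configure the node as a gateway-only node (note: no -c option).
build_configure_cmd() {
    printf '%s -C %s -Z %s -u %s -g %s -N %s\n' \
        /opt/mapr/server/configure.sh \
        "$CLDB_LIST" "$ZK_LIST" "$MAPR_USER" "$MAPR_GROUP" "$CLUSTER_NAME"
}

# Step 3: start warden, which manages the gateway.
build_warden_cmd() {
    printf 'service mapr-warden start\n'
}

# Step 5: tell the source cluster about the gateway; the destination
# cluster is set to the source cluster itself.
build_gateway_cmd() {
    printf 'maprcli cluster gateway set -dstcluster %s -gateways "%s"\n' \
        "$CLUSTER_NAME" "$GATEWAY_IPS"
}

build_configure_cmd
build_warden_cmd
build_gateway_cmd
```

Printing the commands first keeps the sketch safe to run anywhere; once the values look right, the printed lines can be run by hand in the order given in the procedure.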