Registering Elasticsearch Clusters with MapR Clusters
To register an Elasticsearch cluster with a MapR cluster, you run a script that
copies the Elasticsearch cluster’s configuration file (elasticsearch.yml),
JAR files, and plugin JAR files into MapR-FS on the MapR cluster from which you run
the script. When you run the script, you provide the IP address or hostname of the
Elasticsearch node to copy the files from.
Prerequisites
- Ensure that the Elasticsearch cluster is at version 1.4.
- Ensure that there are adequate resources in the Elasticsearch cluster to handle updates being sent from MapR-DB at high speed and in high volume. MapR-DB tables can ingest large numbers of puts very rapidly and forward them to gateways just as rapidly.
- Ensure that there is enough storage space in the Elasticsearch cluster. Because the binary data in MapR-DB is converted to less compact JSON documents, storage requirements in the Elasticsearch cluster will exceed the storage requirements in the MapR-DB cluster for the same number of columns.
- Plan which nodes will serve as MapR gateways for communication between MapR-DB and Elasticsearch, install the mapr-gateway package on those nodes, and notify the MapR cluster of the location of those nodes. See MapR Gateways.
- Find out whether Elasticsearch was installed on the Elasticsearch cluster by means of a ZIP or TAR file, or by means of a Debian or RPM package. The installation method determines where the registration script looks for the files that it needs to copy.
- Ensure that your user ID has read permission on the Elasticsearch installation directory. If you will be specifying a different user ID, ensure that ID has read permission on this directory.
- Decide whether to use the actual name of the Elasticsearch cluster during the registration process or a different name. The name under which you register the cluster does not have to match the actual name. You cannot change this name later.
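The version prerequisite above can be checked from any host that can reach the Elasticsearch HTTP port. The following is a minimal sketch, assuming that the cluster answers on the default HTTP port 9200 and that curl and sed are available. The host name es-node1.example.com and both function names are placeholders for illustration, not part of MapR or Elasticsearch.

```shell
# es_version_from_json: read the JSON returned by the Elasticsearch root
# endpoint on stdin and print the value of version.number.
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# is_supported_version VERSION: succeed only for 1.4 and 1.4.x releases.
is_supported_version() {
  case $1 in
    1.4|1.4.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (hypothetical host; not run here):
#   v=$(curl -s http://es-node1.example.com:9200/ | es_version_from_json)
#   is_supported_version "$v" || echo "unsupported Elasticsearch version: $v"
```

The same root response also reports the cluster's actual name, which can help when deciding on the name to register the cluster under.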
About this task
MapR-DB binary tables in MapR clusters can be indexed across multiple Elasticsearch clusters. For example, you could index one set of columns in one Elasticsearch cluster and another set of columns in another Elasticsearch cluster, supporting different applications by doing so.
Procedure
Run the script /opt/mapr/bin/register-elasticsearch with the following parameters:
- -c: Specify the name of the Elasticsearch cluster. This name is used only for registering the cluster with the MapR cluster and does not have to match the actual name.
- -r: Provide either of the following values:
  - If your MapR gateways are using node clients to communicate with the Elasticsearch cluster, specify the hostname or IP address of an Elasticsearch node from which to copy the elasticsearch.yml file, Elasticsearch JAR files, and plugin JAR files.
    IMPORTANT: Before you run this script, ensure both that multicast is disabled in the elasticsearch.yml file and that the hostname or IP address of each MapR gateway node is included in the unicast node list in the elasticsearch.yml file.
  - If your MapR gateways are using transport clients to communicate with the Elasticsearch cluster, specify the hostname or IP address of an Elasticsearch transport node from which to copy the elasticsearch.yml file, Elasticsearch JAR files, and plugin JAR files, followed by the hostnames or IP addresses of any Elasticsearch nodes to use as additional transport nodes.
- -t: If you specified the second value listed for the -r option, include this parameter to notify the cluster to use the nodes listed in -r as transport nodes.
- -u: Optional: Specify an alternative user ID for connecting to the Elasticsearch cluster by means of scp. The default is the current user ID.
- -e: If Elasticsearch was installed by means of a ZIP or TAR file, specify the path to Elasticsearch’s installation directory on the Elasticsearch cluster. If Elasticsearch was installed with a Debian or RPM package, omit this parameter.
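When gateways use node clients, the multicast and unicast requirement noted for the -r parameter can be checked against a copy of elasticsearch.yml before registration. The sketch below assumes the Elasticsearch 1.x zen discovery settings discovery.zen.ping.multicast.enabled and discovery.zen.ping.unicast.hosts, written as flat dotted keys with the host list on a single line. A file that nests these keys or spreads the list over several lines would need a real YAML parser. The function name and host names are hypothetical.

```shell
# check_discovery FILE GATEWAY...
# Report whether FILE (a copy of elasticsearch.yml) disables multicast and
# lists every GATEWAY in the unicast host list. Returns non-zero on any miss.
check_discovery() {
  file=$1
  shift
  status=0
  if ! grep -Eq '^[[:space:]]*discovery\.zen\.ping\.multicast\.enabled:[[:space:]]*false' "$file"; then
    echo "multicast is not explicitly disabled in $file"
    status=1
  fi
  # Keep going if the key is absent; the loop below then reports every gateway.
  hosts_line=$(grep -E '^[[:space:]]*discovery\.zen\.ping\.unicast\.hosts:' "$file" || true)
  for gw in "$@"; do
    case $hosts_line in
      *"$gw"*) ;;   # plain substring match, so gw1 also matches gw10
      *) echo "$gw is missing from the unicast host list"; status=1 ;;
    esac
  done
  return "$status"
}

# Usage (hypothetical file and gateway names; not run here):
#   check_discovery ./elasticsearch.yml gw-a.example.com gw-b.example.com
```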
If you plan to run the script via your own script, include the -y parameter, which omits interactive prompts.
To see the help for the script, run the script with the -h parameter only.
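The parameters above combine as in the following dry-run sketch, which only prints the commands rather than running them. The cluster name es-prod, the host es-node1.example.com, the user esadmin, and the path /opt/elasticsearch are placeholders, and build_register_cmd is a hypothetical helper. The sketch lists a single transport node because the separator for additional nodes in -r is not stated here; check the -h output for the exact syntax.

```shell
REGISTER=/opt/mapr/bin/register-elasticsearch

# build_register_cmd NAME HOST [OPTION...]
# Print a registration command for cluster NAME, copying files from HOST,
# with any optional parameters (-t, -u USER, -e PATH, -y) appended as given.
build_register_cmd() {
  name=$1
  host=$2
  shift 2
  cmd="$REGISTER -c $name -r $host"
  for opt in "$@"; do
    cmd="$cmd $opt"
  done
  printf '%s\n' "$cmd"
}

# Node clients, Elasticsearch installed from a Debian or RPM package (no -e).
build_register_cmd es-prod es-node1.example.com

# Transport clients, ZIP or TAR installation, alternative scp user, no prompts.
build_register_cmd es-prod es-node1.example.com -t -u esadmin -e /opt/elasticsearch -y
```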
What to do next
Set up replication from one or more MapR-DB source binary tables to Elasticsearch types. See Configuring Replication to Elasticsearch Types.
If you want to list the Elasticsearch clusters that are registered with the current MapR cluster, run the script with the -l parameter only.
If you change the elasticsearch.yml file for the cluster, the Elasticsearch JAR files for the cluster, or both, you must re-register the Elasticsearch cluster with the MapR source cluster and restart the MapR gateways that you are using for indexing. Follow these steps:
- Pause indexing of your MapR-DB source binary tables. To get a list of the Elasticsearch types that are used for each source table, use the maprcli table replica elasticsearch list command. For each Elasticsearch type, issue the maprcli table replica elasticsearch pause command to pause indexing.
- Re-register the Elasticsearch cluster by running the script /opt/mapr/bin/register-elasticsearch. Use the same parameters as you did when you first registered the Elasticsearch cluster. However, this time include the -f parameter to force the registration. This parameter is necessary because you are not unregistering the cluster before registering it again.
- Restart the MapR gateways that you are using for indexing. See the section "On clusters where gateways are running" in Configuring a MapR Gateway Master-Slave Topology.
- Resume indexing by issuing the command maprcli table replica elasticsearch resume for each Elasticsearch type that you are indexing your data in.
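The order of the re-registration steps can be sketched as a dry-run checklist. The helper reregister_plan is hypothetical and only prints text. The maprcli commands appear without parameters because their options are not covered in this topic, and the example arguments are placeholders.

```shell
# reregister_plan ARGS...
# ARGS are the parameters used when the cluster was first registered;
# -f is appended to force the registration.
reregister_plan() {
  echo "1. maprcli table replica elasticsearch pause (once per Elasticsearch type)"
  echo "2. /opt/mapr/bin/register-elasticsearch $* -f"
  echo "3. Restart the MapR gateways used for indexing"
  echo "4. maprcli table replica elasticsearch resume (once per Elasticsearch type)"
}

reregister_plan -c es-prod -r es-node1.example.com -y
```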
To unregister an Elasticsearch cluster, run the script with these parameters:
- -c: Specify the name that was used for the Elasticsearch cluster when it was registered.
- -d: Specifies to delete the registration.
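Deleting a registration with -c and -d can be sketched as follows, again as a dry run that only prints the commands. The name es-prod is a placeholder and unregister_cmd is a hypothetical helper. Listing the registered clusters first with -l is a suggested precaution for confirming the name, not a documented requirement.

```shell
REGISTER=/opt/mapr/bin/register-elasticsearch

# unregister_cmd NAME
# Print the command that deletes the registration made under NAME.
unregister_cmd() {
  printf '%s\n' "$REGISTER -c $1 -d"
}

# Confirm the registered name first, then delete the registration.
printf '%s\n' "$REGISTER -l"
unregister_cmd es-prod
```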