Converting Data to Elasticsearch Data Types That MapR-DB Does Not Support
If you want to convert your source data into an unsupported Elasticsearch data type, write Java routines to tell MapR-DB how to perform conversions.
About this task
You specify how to convert your data by writing a Java class that implements an interface provided by MapR. You then create a JAR file that contains this class and place the file on the source MapR cluster. When you set up replication from your source MapR-DB tables to destination indexes, you use a configuration file to tell MapR-DB where the JAR file is located.
Procedure
- Create a Java class to convert your data into the data types that you established in Elasticsearch. This class must implement the interface MapRESConverter.
- In the pom.xml file for your Java project, add this dependency:
  <dependency>
    <groupId>com.mapr.external</groupId>
    <artifactId>external</artifactId>
    <version><MapR version></version>
  </dependency>
- Create a JAR file that contains your class and the pom.xml file. You can give the JAR file any name that you prefer.
- Place the JAR file anywhere in the local Linux file system on the MapR node where you plan to run the maprcli table replica elasticsearch autosetup command to map the source MapR table to the Elasticsearch type. You will point to this file when you configure the source MapR cluster for table replication to your Elasticsearch types.
- Create the destination type for each source table for which you want to use the custom mapping. Use Elasticsearch’s create index API.
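As a rough illustration of the kind of conversion logic such a class might contain, the following sketch turns a 16-byte cell value (assumed here to hold a latitude and a longitude as two big-endian doubles) into the "lat,lon" string form that Elasticsearch accepts for geo_point fields. The method signatures of MapRESConverter are not shown in this topic, so the sketch uses a hypothetical stand-in interface named CellConverter; the class names, the cell layout, and the choice of geo_point are all assumptions made for illustration. A real converter must implement MapRESConverter as defined in the MapR library.

```java
import java.nio.ByteBuffer;

public class GeoPointConversionSketch {

    // Hypothetical stand-in for illustration only. This is NOT the
    // MapRESConverter interface; consult the MapR library for the real
    // method signatures that your class must implement.
    interface CellConverter {
        String convert(byte[] cellValue);
    }

    // Converts a cell assumed to hold two big-endian doubles (latitude,
    // then longitude) into the "lat,lon" string form of a geo_point.
    static final class GeoPointConverter implements CellConverter {
        @Override
        public String convert(byte[] cellValue) {
            if (cellValue == null || cellValue.length != 2 * Double.BYTES) {
                throw new IllegalArgumentException(
                        "expected 16 bytes (two doubles)");
            }
            // ByteBuffer reads in big-endian order by default.
            ByteBuffer buf = ByteBuffer.wrap(cellValue);
            double lat = buf.getDouble();
            double lon = buf.getDouble();
            return lat + "," + lon;
        }
    }

    public static void main(String[] args) {
        // Build a sample cell value with the assumed layout.
        byte[] cell = ByteBuffer.allocate(16)
                .putDouble(40.5)
                .putDouble(-74.25)
                .array();
        System.out.println(new GeoPointConverter().convert(cell));
        // prints "40.5,-74.25"
    }
}
```

Keeping the byte-decoding logic in a small, self-contained method like this makes it possible to unit test the conversion before packaging the class into the JAR file.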
What to do next
If you have not done so already, register your Elasticsearch cluster or clusters with your MapR source cluster.
If you have already registered your Elasticsearch cluster, configure replication to types in Elasticsearch.
- Pause indexing of your MapR-DB source tables. To get a list of the Elasticsearch types that are used for each source table, use the maprcli table replica elasticsearch list command. For each Elasticsearch type, issue the maprcli table replica elasticsearch pause command to pause indexing.
- Restart the MapR gateways that you are using for indexing. See the section "On clusters where gateways are running" in Configuring a MapR Gateway Master-Slave Topology.
- Resume indexing by issuing the maprcli table replica elasticsearch resume command for each Elasticsearch type into which you are indexing data.