Configuring the Hive Storage Plugin

About this task

MapR Drill supports all the Hive versions supported by MapR (Hive 0.13, 1.0, and 1.2). Drill can work with only one version of Hive on a given cluster. To access Hive tables using custom SerDes or InputFormat/OutputFormat, all nodes running Drillbits must have the SerDes or InputFormat/OutputFormat JAR files in the following location: <drill_installation_directory>/jars/3rdparty
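
The JAR placement described above can be sketched as a small shell helper. This is a minimal sketch, not a MapR-provided tool: the function name install_hive_jar and both arguments (the JAR path and the Drill installation directory) are hypothetical, and the helper only copies the file locally, so it would have to be run on every node that hosts a Drillbit.

```shell
# Sketch: copy a custom SerDe or InputFormat/OutputFormat JAR into
# <drill_installation_directory>/jars/3rdparty on the local node.
# install_hive_jar is a hypothetical helper name, not a Drill command.
install_hive_jar() {
  jar_path="$1"      # path to the custom JAR (assumption: supplied by you)
  drill_home="$2"    # the Drill installation directory on this node
  dest="$drill_home/jars/3rdparty"

  # Fail early instead of silently copying to the wrong place.
  [ -f "$jar_path" ] || { echo "JAR not found: $jar_path" >&2; return 1; }
  [ -d "$dest" ] || { echo "Missing directory: $dest" >&2; return 1; }

  cp "$jar_path" "$dest/"
}

# Example call (both paths are placeholders):
# install_hive_jar /path/to/my-serde.jar <drill_installation_directory>
```

Checking for the destination directory first keeps a mistyped installation path from creating a stray file that no Drillbit would ever load.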

To query across multiple versions of Hive from Drill, install each version of Hive on a separate cluster. For example, suppose Drill and Hive 0.13 are deployed in a production cluster while a customer tests Hive 1.0 on a test cluster. Drill can query Hive tables on the test cluster as well as Hive tables on the production cluster, provided that you define a separate storage plugin for each cluster, each pointing to the metastore of the corresponding Hive version.
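
One way to picture the two-plugin setup is a helper that prints one storage plugin configuration per metastore. This is a sketch under stated assumptions: the function name hive_plugin_json, the host names, and the file system values are hypothetical, and only properties that appear in the default configuration in this topic are emitted. Hive metastore URIs use the thrift:// scheme, with 9083 as the usual default port.

```shell
# Sketch: print a Hive storage plugin configuration for one cluster.
# hive_plugin_json is a hypothetical helper name, not a Drill command.
hive_plugin_json() {
  metastore_uri="$1"   # e.g. thrift://<host>:9083 for that cluster's metastore
  fs_default="$2"      # file system URI for that cluster (assumption: you know it)
  cat <<EOF
{
  "type": "hive",
  "enabled": false,
  "configProps": {
    "hive.metastore.uris": "$metastore_uri",
    "fs.default.name": "$fs_default",
    "hive.metastore.sasl.enabled": "false"
  }
}
EOF
}

# Example calls (host names are placeholders):
# hive_plugin_json "thrift://prod-metastore.example.com:9083" "maprfs:///"
# hive_plugin_json "thrift://test-metastore.example.com:9083" "<test cluster file system URI>"
```

Each printed configuration would be pasted into its own storage plugin in the Drill Web Console, under a distinct plugin name, so that queries can address the production and test metastores separately.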

Configuring the Hive Remote Metastore

In the remote configuration, the Hive metastore runs as a separate service outside of Hive. The metastore service communicates with the Hive database over JDBC. To connect Drill to Hive, point Drill to the Hive metastore service address and provide the connection parameters in a Hive storage plugin configuration. In this procedure, you change the default Hive storage plugin configuration to match your MapR-FS environment.

Procedure

  1. Verify that Hive is running.
  2. Issue the following command to start the Hive metastore service on the system specified in the hive.metastore.uris property: hive --service metastore
  3. Start the Drill Web Console.
  4. Select the Storage tab. If Web Console security is enabled, you must have administrator privileges to perform this step.
  5. In the list of disabled storage plugins in the Drill Web Console, click Update next to hive.
  6. Update the following Hive storage plugin parameters to match the metastore URI and the metastore database of the Hive installation you are using:
    • "hive.metastore.uris"
    • "javax.jdo.option.ConnectionURL", in the form "jdbc:<database>://<host:port>/<metastore database>"
    {
      "type": "hive",
      "enabled": false,
      "configProps": {
        "hive.metastore.uris": "",
        "javax.jdo.option.ConnectionURL": "jdbc:<database>://<host:port>/<metastore database>",
        "hive.metastore.warehouse.dir": "/tmp/drill_hive_wh",
        "fs.default.name": "file:///",
        "hive.metastore.sasl.enabled": "false"
      }
    }
  7. Change the default location of files to suit your environment. For example, change "fs.default.name": "file:///" to the MapR-FS location, "fs.default.name": "maprfs:///".
  8. To run Drill and Hive in a secure MapR cluster, complete the following tasks. Otherwise, just enable the storage plugin configuration.
    1. Remove the following line from the configuration: "hive.metastore.sasl.enabled" : "false"
    2. Click Enable in the Web Console to enable the Hive storage plugin configuration.
    3. Add the following line to <DRILL_HOME>/conf/drill-env.sh on each Drill node and then restart the Drillbit service:
      export DRILL_JAVA_OPTS="$DRILL_JAVA_OPTS -Dmapr_sec_enabled=true -Dhadoop.login=maprsasl -Dzookeeper.saslprovider=com.mapr.security.maprsasl.MaprSaslProvider -Dmapr.library.flatclass"
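
The drill-env.sh change in the last substep can be scripted so that running it twice on the same node does not duplicate the line. This is a minimal sketch: the function name add_secure_opts is hypothetical, the export line is copied from the step above, and restarting the Drillbit service is left to you because the restart command is not covered here.

```shell
# Sketch: append the secure-cluster JVM options to drill-env.sh exactly once.
# add_secure_opts is a hypothetical helper name, not a Drill command.
add_secure_opts() {
  env_file="$1"   # e.g. <DRILL_HOME>/conf/drill-env.sh

  [ -f "$env_file" ] || { echo "File not found: $env_file" >&2; return 1; }

  # Skip the append if the options are already present (idempotent re-runs).
  if grep -qF -e '-Dmapr_sec_enabled=true' "$env_file"; then
    return 0
  fi

  # Quoted heredoc delimiter: $DRILL_JAVA_OPTS is written literally, not expanded.
  cat >> "$env_file" <<'EOF'
export DRILL_JAVA_OPTS="$DRILL_JAVA_OPTS -Dmapr_sec_enabled=true -Dhadoop.login=maprsasl -Dzookeeper.saslprovider=com.mapr.security.maprsasl.MaprSaslProvider -Dmapr.library.flatclass"
EOF
}

# Example call (path is a placeholder):
# add_secure_opts <DRILL_HOME>/conf/drill-env.sh
```

The quoted heredoc matters: with an unquoted delimiter the shell would expand $DRILL_JAVA_OPTS at write time and bake the current value into the file.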