Integrate Hue with Drill

Starting in EEP 6.0, Drill is officially supported with Hue. When you integrate Drill with Hue, users can run Drill queries from the Hue interface and visualize data.

Drill integrates with Hue through configuration options in the /opt/mapr/hue/hue-<version>/conf/hue.ini file. A user can authenticate to Hue through Plain, Kerberos, or data-fabric-SASL authentication. The user that authenticates to Hue is the user that runs the Drill queries from Hue.

When connecting to Drill, Hue performs outbound impersonation to Drill as the user that authenticated to Hue. Drill accepts the outbound impersonation from Hue as an inbound impersonation. Drill then performs outbound impersonation to the filesystem or MapR Database.

NOTE In a secure cluster, you can only access the Hue interface through HTTPS.
NOTE SSL encryption is currently not supported.

Prerequisites

Note the following prerequisites before you integrate Hue with Drill:
  • The cluster must have the latest versions of Hue, Drill, and HTTPFS installed. Hue uses HTTPFS to communicate with the file system. You can see the latest component versions in the Component Versions for Released EEPs. If you install Hue and Drill on a secure cluster, Hue and Drill are installed with the default security configurations and outbound impersonation is enabled.
    • Installer

      When you install Hue and Drill using the Installer, HTTPFS is installed automatically, and Hue is automatically configured to integrate Drill without having to perform any manual configuration. The installer configures a Zookeeper connection to Drill in hue.ini, by default.

    • Manual Installation
      • Install HTTPFS and then configure Hue, as described in the following section, Configuring Hue
  • In a secure cluster, Drill must have user impersonation enabled.
    • Drill has an inbound impersonation policy option, exec.impersonation.inbound_policies, that allows the Hue process user (proxy user) to impersonate the Hue authenticated user as an outbound impersonation from Hue to Drill. This option is automatically configured when Drill and Hue are installed using the Installer with default security enabled, or when you run configure.sh on a secure cluster. If you do not run configure.sh, you must manually add this option to the impersonation configuration in the /opt/mapr/drill/drill-<version>/conf/drill-override.conf file, as shown:
      impersonation.enabled: true,
        impersonation.max_chained_user_hops: 3,
        exec.impersonation.inbound_policies: "[{proxy_principals:{users:[\"mapr\"]},target_principals:{users:[\"*\"]}}]",
      
  • If you plan to use Kerberos for authentication, you will need to include the Hue keytab file and Kerberos principal name for Hue in the hue.ini file. If needed, complete the steps listed in the Creating a Kerberos Principal and Extracting the Kerberos Ticket from the keytab File sections on the Configure Hue to use Kerberos page.

Configuring Hue

If you manually installed Hue, Drill, and HTTPFS, you must modify the hue.ini file to include the configuration information needed for Hue to connect with Drill and HTTPFS. The hue.ini file contains sections where you configure Hue to integrate with various components, like Drill and HTTPFS. You can access the hue.ini file in the /opt/mapr/hue/hue-<version>/desktop/conf directory. Start/Restart the services after you update the hue.ini file.
NOTE If you installed Hue and Drill using the Installer, these options are populated automatically in the hue.ini file; no configuration is required. In a secure cluster, the authentication mechanism defaults to MAPR-SECURITY.
Complete the following steps to integrate Hue with Drill:
  1. Edit the Drill configuration in hue.ini.

    The hue.ini file contains a [[[drill]]] section under which you can see configuration options needed for Hue to connect with Drill. You must uncomment an option (remove the # character) in the hue.ini file for the option to take effect.

    The following tables list and describe the Drill options with possible values and also provide examples:
    Options Descriptions Examples
    connection_type=

    Tells Hue how to connect to Drill. Enter one of the following values:

    • direct

    • zookeeper

    A direct connection is a connection in which Hue connects directly to a Drillbit. A ZooKeeper connection is a connection in which Hue communicates with ZooKeeper and ZooKeeper provides Hue with a Drillbit to connect with.

    connection_type=zookeeper
    drillbits= Enter the node IP address of the Drillbit that Hue connects with. Only enter a node address if using the “direct” connection_type.

    drillbits=10.10.100.2:31010

    To list multiple Drillbits, separate each IP address by a comma, as shown:

    drillbits=10.10.100.2:31010,10.10.100.3:31010
    NOTE Port 31010 is the user port between nodes in a Drill cluster. This port is needed for an external client to connect into the cluster nodes and for the Drill Web Console.
    zk_quorum= Enter the list of ZooKeeper node IP addresses in the ZooKeeper quorum. Only enter the IP addresses for the ZooKeeper quorum if using the “zookeeper” connection_type. zk_quorum=10.10.100.3:5181, 10.10.100.4:5181, 10.10.100.5:5181
    zk_cluster_id= Enter the name of the Drill cluster that you want Hue to connect to. zk_cluster_id=dev-drillbits
    mechanism=

    The type of authentication enabled. Enter one of the following values:

    • None

    • GSSAPI

    • Data Fabric-SECURITY

    Use None for Plain authentication. Use GSSAPI for Kerberos authentication. Use MAPR-SECURITY for maprsasl.

    If you set the mechanism to “none” and impersonation is enabled, you must set the username and password to the admin or proxy user that will impersonate Hue end users. You can set these with the user= and password= options.

    If you set the mechanism to GSSAPI, you must also include the ccache_path= option. For this option, enter the caching location for Kerberos credentials, for example: ccache_path=/tmp/hue_krb5_ccache

    You can set the Drill Kerberos principal and/or Hue impersonation using the option named “options=.” See “options=” below.

    mechanism=none
    user=

    If using Plain authentication, enter the username. If using another authentication mechanism, do not enter a value.

    Set the username to the admin or proxy user that will impersonate Hue end users.

    If impersonation is disabled, the you can set the user to any user. Hue will connect to Drill as the user specified.

    user=mapr
    password=

    If using Plain authentication, enter the password. If using another authentication mechanism, do not enter a value.

    Set the password for the admin or proxy user that will impersonate Hue end users.

    If impersonation is disabled, set the password for the use specified.

    password=mapr8
    password_script=

    Indicates which script to run for the database password when a password is required and the password= option is not set. Enter the location of the script.

    The following shell script is an example of a password script:

    #!/bin/bash

    case $1 in

    drill) echo "password_1";;

    some-output) echo "password_2";;

    *) echo "wrong argument" >&2 exit 1;;

    esac

    password_script='/root/hue_password_script/password_script.sh drill'
    options=

    Additional options related to impersonation and Kerberos authentication. This option takes the following values:

    • impersonation

    • principal

    Impersonation enables or disables outbound impersonation in Hue. Principal is the Drill service principal when Kerberos authentication is enabled.

    options='{"impersonation": true, "principal": "mapr/localhost@REALM"}'

    options='{"impersonation": true}'

    options='{"impersonation": false}'

  2. Add the HTTPFS URL in hue.ini.

    The hui.ini file contains a [[[default]]] section in the [hadoop] block under which you can see HDFS configuration options. You must uncomment an option (remove the # character) in the hue.ini file for the option to take effect.

    In the [[[default]]] section of the [hadoop] block, enter the IP address of the HTTPFS node as the value for the webhdfs_url= option, as shown:

    # Use WebHdfs/HttpFs as the communication mechanism.
    # Domain should be the NameNode or HttpFs host.
    # Default port is 14000 for HttpFs.
    webhdfs_url=https://<httpfs-node-ip-address>:14000/webhdfs/v1  
    
  3. Start the services.
    Start/Restart Hue, Drill, and HTTPS to apply the updated configurations, as shown in the following examples:
    maprcli node services -name hue -action start -nodes <hue-node-ip-address>
    
    maprcli node services -name drill-bits -action start -nodes <list-of-drill-node-ip-addresses>
    
    maprcli node services -name httpfs -action start -nodes <httpfs-node-ip-address>
    

Run Drill Queries in Hue

Once you have configured Hue and started the services, you can run Drill queries from Hue and visualize your data.

Complete the following steps to run Drill queries in Hue:
  1. In your web browser, enter the Hue URL to navigate to the Hue web interface, as shown:
    http://hue-node-ip-address:8888
  2. If prompted, enter your user credentials. The Hue interface opens.
  3. In the Query drop-down, select Editor > Drill. The left navigation panel displays the list of schemas available in Drill.
  4. Select a schema, for example dfs.default, and then enter a query in the text field.
  5. Click the blue play button to execute the query. Query results display.
  6. Optionally, you can use the buttons to the left of the query results to visualize the data.