Configuring Drill
Drill is highly configurable. This document focuses on MapR-related configurations and refers to the open source Apache Drill documentation for generic information. Key things to configure are:
- Drill memory: Determine the amount of direct memory allocated to a Drillbit for query processing in a Drill cluster.
- Resources for a shared Drillbit: Configure queues and parallelization to support multiple users sharing a Drillbit. Support separate Drillbits running on different nodes in the cluster.
- Multitenancy
- User impersonation
- User authentication
- Drill impersonation with Hive authorization
- Volumes to use for spooling: Use the drill.exec.sort.external.spill.directories option to set MapReduce volumes or local volumes for spooling. To improve performance, stripe the data across as many disks as possible.
- Persistent configuration storage
- Access rights
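As an illustration of the spooling item in the list above, the following is a minimal sketch of how the spill directories might be set in drill-override.conf. The directory paths are hypothetical placeholders, and the sibling fs setting is an assumption about which file system hosts those directories; check both against your own installation and Drill version before using them.

```
drill.exec: {
  sort: {
    external: {
      spill: {
        # Assumption: file system that hosts the spill directories
        # (local disks in this sketch).
        fs: "file:///",
        # Hypothetical paths, one per physical disk, so that spilled
        # data is striped across several disks.
        directories: [
          "/data1/drill/spill",
          "/data2/drill/spill",
          "/data3/drill/spill"
        ]
      }
    }
  }
}
```

Listing one directory per disk is what spreads the spill I/O. A single directory would put all spooled data on one disk.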
Drill typically runs alongside other workloads, including the following:
- MapReduce
- YARN
- HBase
- Hive and Pig
- Spark
You need to plan and configure these resources for use with Drill and other workloads:
- Memory
- CPU
- Disk
Configuring Access Rights
If security policies in your organization limit access to MapR-DB/HBase tables, you might have problems querying those tables. If you have 777 file-level permissions on a table, yet a query returns no results, you might need to use maprcli to add your user name to the table's access control list (ACL).