Mask Sensitive Data in Query Logs and Profiles

Starting in Drill 1.20.2 (EEP 9.0.0 installed on Core 7.1.0), you can define a set of rules in a JSON file to mask sensitive data in Drill query logs and query profiles.

Masking Data in Query Logs

Drill provides the following Logback encoder and layout classes, which mask data in the final log message:
  • org.apache.drill.logback.MaskingPatternEncoder
  • org.apache.drill.logback.MaskingPatternLayout
Apart from masking, the Drill encoder and layout provide the same functionality as the following standard Logback encoder and layout, respectively:
  • ch.qos.logback.classic.encoder.PatternLayoutEncoder
  • ch.qos.logback.classic.PatternLayout
The following examples demonstrate how to configure the Drill masking pattern encoder and layout in the /opt/mapr/drill/drill-<version>/conf/logback.xml file:
Masking Pattern Encoder Example
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="org.apache.drill.logback.MaskingPatternEncoder">
      <rulesConfig>${pathToJsonConfig}</rulesConfig>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <root>
    <level value="error" />
    <appender-ref ref="STDOUT" />
  </root>
</configuration>
Masking Pattern Layout Example
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
      <layout class="org.apache.drill.logback.MaskingPatternLayout">
        <rulesConfig>${pathToJsonConfig}</rulesConfig>
        <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
      </layout>
    </encoder>
  </appender>

  <root>
    <level value="error" />
    <appender-ref ref="STDOUT" />
  </root>
</configuration>

Both examples include the rulesConfig parameter, which specifies the path to the JSON file that defines the masking rules. Enter the absolute path to the JSON file; do not use a relative path.
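As a minimal sketch of one way to supply the path, the fragment below defines pathToJsonConfig with a standard Logback property element. It assumes that ${pathToJsonConfig} in the examples above is resolved through ordinary Logback variable substitution; the path /path/to/masking-rules.json is hypothetical and must be replaced with the absolute path to your own rules file.

```xml
<configuration>
  <!-- Hypothetical absolute path; replace with the location of your own rules file -->
  <property name="pathToJsonConfig" value="/path/to/masking-rules.json" />

  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="org.apache.drill.logback.MaskingPatternEncoder">
      <!-- Resolved from the property defined above (assumes Logback variable substitution) -->
      <rulesConfig>${pathToJsonConfig}</rulesConfig>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <root>
    <level value="error" />
    <appender-ref ref="STDOUT" />
  </root>
</configuration>
```

Alternatively, you can write the absolute path directly inside the rulesConfig element.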

For information about how to define masking rules in a JSON file, see Configuring Masking Rules in a JSON File.

Masking Data in Query Profiles

You can define rules that mask the following information in query profiles:

  • Query plan text
  • Queries
  • Errors
  • Verbose errors

To mask data in query profiles, define the masking rules in a JSON file. Then set the drill.exec.query_profile.masking_rules.config_path parameter to the path of that JSON file in either the /opt/mapr/drill/drill-<version>/conf/drill-override.conf file or the /opt/mapr/drill/drill-<version>/conf/drill-distrib.conf file.
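The following fragment is a sketch of how the parameter might appear in drill-override.conf. It assumes the usual drill.exec block layout of that file; the path /path/to/masking-rules.json is hypothetical, and any existing settings in the block (shown here only as a comment) stay as they are.

```hocon
drill.exec: {
  # ... existing settings such as cluster-id and zk.connect remain unchanged ...

  # Hypothetical path; point this at your own masking rules JSON file
  query_profile.masking_rules.config_path: "/path/to/masking-rules.json"
}
```

Because the key is nested inside the drill.exec block, it resolves to the full parameter name drill.exec.query_profile.masking_rules.config_path.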

For information about how to define masking rules in a JSON file, see Configuring Masking Rules in a JSON File. For more information about Drill configuration files, see Drill Configuration Files.

Configuring Masking Rules in a JSON File

The JSON file that defines the masking rules must contain an array of objects, one object per rule, with the following fields:
  • search
  • replace
  • description
Use these fields to define the rules, as shown in the following example, which defines two rules:
[
  {
    "search": "([\\w\\d]+\\.)+(com)",
    "replace": "secret.domain.com",
    "description": "Mask domain names"
  },
  {
    "search": "MagicCompany",
    "replace": "TopSecretCompany",
    "description": "Mask company name"
  }
]
The following table describes each of the fields that define the masking rules:
Field | Required | Description | Default
search | Yes | Defines the string or regex pattern to mask. If you enter a regex pattern, escape it correctly for JSON (for example, write each backslash as \\). Drill does not apply the rule when this field is empty, null, or omitted. | -
replace | No | Defines the string that replaces text matching the search string or regex pattern. To remove the matched text, use "". | ""
description | No | Describes the search and replace rule. The description is not returned in the logs or query profiles. | Empty
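
To illustrate the escaping and the empty replace value described in the table, here is a sketch with two hypothetical rules that are not part of the product documentation. It assumes that a rule replaces every match of its search pattern, as the field descriptions suggest. In JSON, each regex backslash is doubled, so the regex \d is written as \\d.

```json
[
  {
    "search": "\\b\\d{1,3}(\\.\\d{1,3}){3}\\b",
    "replace": "x.x.x.x",
    "description": "Hypothetical rule: mask IPv4-style addresses"
  },
  {
    "search": "password=\\S+",
    "replace": "",
    "description": "Hypothetical rule: remove password key-value pairs"
  }
]
```

Under that assumption, the first rule would turn text such as 10.20.30.40 into x.x.x.x, and the second would delete text such as password=abc123 because its replace value is empty.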