Hive 2.1-1703 Release Notes
Below are release notes for the Hive component included in the MapR Converged Data Platform. You may also be interested in the Apache Hive 2.1.1 Release Notes or the Apache Hive homepage.
Hive Version | 2.1 |
Release Date | April 2017 |
MapR Version Interoperability | See Hive and HCatalog Support Matrix and Ecosystem Support Matrix (Pre-5.2 releases) |
Source on GitHub | https://github.com/mapr/hive/tree/2.1.1-mapr-1703 |
Maven Artifacts | See Maven Artifacts for the HPE Ezmeral Data Fabric. |
Package Names | See Package Names for Ecosystem Packs (EEPs) |
API for this Version | See Hive 2.1 API |
New in This Release
This version of Hive includes the following:
- Hive Hybrid Procedural SQL On Hadoop (HPL/SQL)
HPL/SQL, which is available in Hive 2.1, is an open source tool that implements procedural SQL for Apache Hive, SparkSQL, Impala, and any other SQL-on-Hadoop implementation, NoSQL database, or RDBMS.
HPL/SQL is a hybrid and heterogeneous language that understands the syntax and semantics of almost any existing procedural SQL dialect, and you can use it with any database (for example, to run existing Oracle PL/SQL code on Apache Hive and Microsoft SQL Server, or to run Transact-SQL on Oracle, Cloudera Impala, or Amazon Redshift).
NOTE: Create the hplsql-site.xml file to configure the HPL/SQL feature. See http://www.hplsql.org/configuration for more information.
- Dynamically partitioned hash join for Tez.
- Support for aggregate push down through joins.
- DBTokenStore support to HS2 delegation token.
- Hive View Column Authorization.
- UDF substring_index
Returns the substring from string str before count occurrences of the delimiter.
- Quarter UDF
The QUARTER(date) function returns the quarter of the year (1 to 4) for a string, date, or timestamp, which is useful in domains such as retail and finance.
- Support for limited integer type promotion in ORC.
- ORC file dump in JSON format
The ORC file dump previously used a custom format. ORC metadata can now be dumped in JSON format so that other tools can be built on top of it.
- UDFs aes_encrypt and aes_decrypt, which use the AES (Advanced Encryption Standard) algorithm
Oracle JRE supports AES-128 out of the box. AES-192 and AES-256 are supported if the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files are installed.
- Hive parser support for multi-column IN clauses, for example (x, y, ...) IN ((...), ..., (...)).
- Support for special characters in quoted table names.
- Support for "show create database".
- Support for escaping carriage return and newline characters in LazySimpleSerDe.
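- Banker's rounding BROUND UDF
With banker's rounding, a value that lies exactly halfway between two candidates is rounded to the one whose last digit is even. For example, BROUND(2.5) returns 2 and BROUND(3.5) returns 4. Also known as "Gaussian rounding", and, in German, "mathematische Rundung".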
- Command to kill an ACID transaction.
The command cleans up all state related to the transaction. The initiator of the transaction (if still alive) gets an error when it next tries to heartbeat or commit, and so becomes aware that the transaction failed.
- Support for modifying the numRows and dataSize for a table/partition.
- Support for vectorization with the TEXTFILE input format and other formats, for better Map Vertex performance.
- Support for NULLS FIRST/NULLS LAST.
The NULLS FIRST and NULLS LAST options can be used to determine whether nulls appear before or after non-null data values when the ORDER BY clause is used.
- Support for aggregate functions in the OVER clause.
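The semantics of substring_index, QUARTER, BROUND, and NULLS FIRST/NULLS LAST described in the list above can be illustrated with a small model. The following is a minimal Python sketch, not Hive's implementation: the first three function names mirror the Hive UDFs, while the helper order_by_asc, the string-based input to bround, and the handling of count = 0 are assumptions made for illustration only.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_EVEN


def substring_index(s, delim, count):
    """Return the part of s before `count` occurrences of delim.

    A positive count keeps everything left of the count-th delimiter
    (counting from the left); a negative count keeps everything right of
    it (counting from the right). delim is assumed to be non-empty.
    """
    if count == 0:
        # Assumption for this sketch: a count of 0 yields an empty string.
        return ""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])


def quarter(d):
    """Return the quarter of the year (1 to 4) for a date."""
    return (d.month - 1) // 3 + 1


def bround(value, scale=0):
    """Banker's rounding: exact ties go to the neighbour with an even last digit.

    value is passed as a string so that no binary floating-point error
    is introduced before rounding.
    """
    exponent = Decimal(1).scaleb(-scale)  # scale=1 gives Decimal('0.1')
    return Decimal(str(value)).quantize(exponent, rounding=ROUND_HALF_EVEN)


def order_by_asc(values, *, nulls_first):
    """Hypothetical helper: ascending sort with nulls (None) placed first or last."""
    non_null = sorted(v for v in values if v is not None)
    nulls = [v for v in values if v is None]
    return nulls + non_null if nulls_first else non_null + nulls
```

In Hive itself, the corresponding calls would be, for example, substring_index('www.apache.org', '.', 2), quarter('2015-04-08'), and bround(2.5); the sketch only models what those calls return.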
Fixes
This release by MapR includes the following fixes on the base Apache release. For complete details, refer to the commit log for this project in GitHub.
Commit | Date (YYYY-MM-DD) | Comment |
---|---|---|
3b83fea | 2017-01-22 | MAPR-26541: The variable $BASEMAPR will now be initialized by $HOME_MAPR from parent pid and if it cannot be defined, will be set to /opt/mapr by default. |
6ff94bc | 2017-02-28 | MAPR-25720: When restarting HS2, the issue that caused the Session manager to delete the operation_logs folder a second time after a long delay is now fixed. |
e8a6f79 | 2017-02-23 | MAPR-26193: The issue that caused the "Permission Denied" message when launching a hive shell is now fixed. |
f69e9ee | 2017-02-17 | MAPR-25698: The missing log4j2.component.properties file is now included with Hive and the log4j2.disable.jmx property value is set to false by default to fix the AccessControlExceptionImport error when importing from MySQL to Hive. |
7d3b630 | 2017-02-07 | MAPR-26169: The issue that caused the FileNotFoundException when there was no file with localPath (for example, no reduce work) is now fixed. |
8d40378 | 2017-02-07 | MAPR-25952: When starting Hive, the issue that caused a message about the absence of HBase to appear is now fixed. |
7afb69c | 2017-01-30 | MAPR-25938: The conflicts in the versions of included Sentry libraries, which caused insert queries to fail with an exception, are now fixed. |
13f2e20 | 2017-01-25 | MAPR-25880: The missing HiveOperation field is now included in HiveSemanticAnalyzerHookContext to allow StateStore to access the current HiveOperation. |
e1f5878 | 2017-01-26 | MAPR-25822: The issue that caused INSERT INTO 'table' VALUES command to overwrite previously inserted data is now fixed. |
Known Issues and Limitations
Known Issues
- Sqoop import to Hive as a Parquet file fails when the entire cluster is configured to use Tez.
This is because of Sqoop's incompatibility with Tez.
Workaround: Do not configure the entire cluster to use Tez.
- Percentage sampling is not supported in org.apache.hadoop.hive.ql.io.HiveInputFormat. Hive uses org.apache.hadoop.hive.ql.io.HiveInputFormat by default, so queries like 'SELECT * FROM tablename TABLESAMPLE(20 percent);' will not work for Hive on Tez.
Workaround: Instead of org.apache.hadoop.hive.ql.io.HiveInputFormat, use org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.
To change the input format, do one of the following:
  - Set hive.tez.input.format in the Hive shell. For example:
    hive> set hive.tez.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
  - Add org.apache.hadoop.hive.ql.io.CombineHiveInputFormat to the hive-site.xml file. For example:
    <property>
      <name>hive.tez.input.format</name>
      <value>org.apache.hadoop.hive.ql.io.CombineHiveInputFormat</value>
    </property>
Limitations
- MapR does not support Hive on Spark. Therefore, you cannot use Spark as an execution engine for Hive. However, you can run Hive and Spark on the same cluster. You can also use Spark SQL and Drill to query Hive tables.
- MapR does not support HDFS encryption in Hive tables.
- MapR does not support HBase-0.9X with Hive-2.1.1. Only HBase-1.X is compatible with Hive-2.1.1.
- MapR does not support LLAP with Hive-2.1.1 since Apache Slider is not in the MapR ecosystem.
- MapR does not support Apache Knox and Apache Ranger. HiveServer2 HTTP mode is not available with X-Forwarded-Host header for authorization/audits.
- MapR does not support masking and filtering of rows/columns since Apache Ranger is not in the MapR ecosystem.
Resolved Issues
None.