Upgrading Hive

Complete the following steps to upgrade Hive manually. If you installed Hive with the MapR Installer, use the latest version of the MapR Installer to perform the upgrade instead.

Before you upgrade, make sure that the version of the MapR core software on your cluster supports the version of Hive you want to upgrade to. See the Hive Release Notes and the Interoperability Matrix.

1. Update Repository

MapR's rpm and deb repositories always contain the Hive version recommended for the latest release of the MapR core. You can connect to an internet repository or prepare a local repository with any version of Hive that you need. You can also manually download packages.

If you plan to install from a repository, complete the following steps on each node where Hive is installed:
  1. Verify that the repository is configured correctly. See Preparing Packages and Repositories for information about setting up your ecosystem repository.
  2. Update the repository cache.
    • On RedHat/CentOS:
      yum clean all  
    • On SUSE:
      zypper refresh 
    • On Ubuntu:
      apt-get update 

2. Back Up Configuration Files

Configuration files are located in /opt/mapr/hive/hive-<version>/conf/. If you have changed configuration properties, save configuration files to a backup location on all nodes where Hive is installed.
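The backup can be scripted so that it runs the same way on every node. The following is a minimal sketch, not part of the MapR tooling: the function name backup_hive_conf and the timestamped hive-conf-<date> directory naming are illustrative assumptions.

```shell
#!/bin/sh
# Copy a Hive conf directory into a new timestamped backup directory.
# Usage: backup_hive_conf <conf_dir> <backup_root>
# The function name and the backup directory naming are illustrative.
backup_hive_conf() {
    conf_dir=$1
    backup_root=$2
    if [ ! -d "$conf_dir" ]; then
        echo "conf directory not found: $conf_dir" >&2
        return 1
    fi
    dest="$backup_root/hive-conf-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    # -R copies recursively, -p preserves modes and timestamps;
    # the trailing "/." copies the directory contents, not the directory itself
    cp -Rp "$conf_dir/." "$dest/" || return 1
    # Print the backup location so the caller can record it
    echo "$dest"
}
```

For example, backup_hive_conf /opt/mapr/hive/hive-<version>/conf <backup_root> copies the current configuration and prints the directory it was saved to, where <backup_root> is any location you choose.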

3. Upgrade Hive Packages

Use one of the following methods to upgrade the Hive components on all nodes where Hive is installed.

NOTE: Back up your metastore database before upgrading Hive.
To upgrade with a package manager:
After configuring repositories so that the version you want to install is available, you can use a package manager to install new packages from the repository.
  • On RedHat and CentOS:
    yum upgrade mapr-hive mapr-hiveserver2 mapr-hivemetastore
  • On Ubuntu:
    apt-get install mapr-hive mapr-hiveserver2 mapr-hivemetastore
  • On SUSE:
    zypper update mapr-hive mapr-hiveserver2 mapr-hivemetastore
To manually remove a prior version and manually install the latest version in the repository:
Run the package manager twice, first to remove the old version, and again to install the new version.
  • On RedHat and CentOS:
    yum remove mapr-hive mapr-hiveserver2 mapr-hivemetastore 
    yum install mapr-hive mapr-hiveserver2 mapr-hivemetastore
  • On Ubuntu:
    apt-get remove mapr-hive mapr-hiveserver2 mapr-hivemetastore
    apt-get install mapr-hive mapr-hiveserver2 mapr-hivemetastore
  • On SUSE:
    zypper remove mapr-hive mapr-hiveserver2 mapr-hivemetastore
    zypper install mapr-hive mapr-hiveserver2 mapr-hivemetastore
To keep a prior version and install a newer version:
Hive installs into separate directories named after the version, such as /opt/mapr/hive/hive-<version>/, so the files for multiple versions can co-exist. To keep the prior version when installing a new version, you must manually install the package file for the new version.
  • On RedHat and CentOS:
    1. Download the RPM package file from http://package.mapr.com/releases/ecosystem-all/.
    2. Install the package with rpm.
      rpm -i --force mapr-hive-<version>.noarch.rpm
  • On Ubuntu:

    This process is not supported on Ubuntu, because apt-get and dpkg cannot manage multiple versions of a package with the same name.

  • On SUSE:
    1. Download the RPM package file from http://package.mapr.com/releases/ecosystem-all/.
    2. Install the package with rpm.
      rpm -i --force mapr-hive-<version>.noarch.rpm
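Because each version installs into its own hive-<version> directory, listing those directories is a quick way to confirm which versions co-exist on a node. The following is a minimal sketch: the function name list_hive_versions is illustrative, and it assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# List hive-<version> directories under a Hive root, lowest version first.
# Usage: list_hive_versions [hive_root]   (defaults to /opt/mapr/hive)
# Assumes "sort -V" (version sort) is available, as in GNU coreutils.
list_hive_versions() {
    hive_root=${1:-/opt/mapr/hive}
    for dir in "$hive_root"/hive-*/; do
        # Skip the unexpanded pattern when nothing matches
        [ -d "$dir" ] || continue
        basename "$dir"
    done | sort -V
}
```

Run list_hive_versions with no arguments to check the default /opt/mapr/hive location.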
NOTE: For upgrades from Hive 0.13-1504, Hive 1.0-1504, or earlier versions: If you want Warden to manage the WebHCat server, also install the mapr-hivewebhcat package on a node that already includes mapr-hive. As of the 1508 release of Hive 0.13 and Hive 1.x, the mapr-hivewebhcat package enables Warden to manage the WebHCat server.

4. Update the Hive Metastore

Before starting the new version of Hive, you must update the Hive Metastore to work with the new version. If you do not do this, the metastore may become corrupted.

  1. Refer to the README file in the /opt/mapr/hive/hive-<version>/scripts/metastore/upgrade/<metastore_database> directory for directions on updating your existing metastore_db schema to work with the new Hive version.
    TIP: When you complete the step to run the schema upgrade scripts, run the following scripts:
    For upgrades from Hive 0.13 to 1.0:
    1. upgrade-0.13.0-to-0.14.0.<metastore_database>.sql
    2. upgrade-0.14.0-to-1.1.0.<metastore_database>.sql
    For upgrades from Hive 0.13 to 1.2.1:
    1. upgrade-0.13.0-to-0.14.0.<metastore_database>.sql
    2. upgrade-0.14.0-to-1.1.0.<metastore_database>.sql
    3. upgrade-1.1.0-to-1.2.0.<metastore_database>.sql
    For upgrades from Hive 1.0 to 1.2.1:
    • upgrade-1.1.0-to-1.2.0.<metastore_database>.sql
    NOTE: Run the metastore upgrade scripts from the /opt/mapr/hive/hive-<version>/scripts/metastore/upgrade/<metastore_database> directory. The scripts source other files from this directory, so they fail if you run them from another location.
  2. Verify that the metastore database update completed successfully. For example, use these diagnostic tests:
    • Run the show tables command in Hive and make sure it returns a complete list of all your Hive tables.
    • Perform simple SELECT operations on Hive tables that existed before the upgrade.
    • Perform filtered SELECT operations on Hive tables that existed before the upgrade.
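The script order listed in the TIP above can be captured in a small lookup so that a scripted upgrade always applies the scripts in sequence. The following is a minimal sketch covering only the three upgrade paths named in this section. The function name metastore_upgrade_scripts is illustrative, and the value passed as <metastore_database> (for example, mysql) is an assumption about your environment.

```shell
#!/bin/sh
# Print the schema upgrade scripts, in run order, for a given upgrade path.
# Usage: metastore_upgrade_scripts <from> <to> <metastore_database>
# Only the paths listed in this section are known; the name is illustrative.
metastore_upgrade_scripts() {
    from=$1
    to=$2
    db=$3
    case "$from:$to" in
        0.13:1.0)
            echo "upgrade-0.13.0-to-0.14.0.$db.sql"
            echo "upgrade-0.14.0-to-1.1.0.$db.sql"
            ;;
        0.13:1.2.1)
            echo "upgrade-0.13.0-to-0.14.0.$db.sql"
            echo "upgrade-0.14.0-to-1.1.0.$db.sql"
            echo "upgrade-1.1.0-to-1.2.0.$db.sql"
            ;;
        1.0:1.2.1)
            echo "upgrade-1.1.0-to-1.2.0.$db.sql"
            ;;
        *)
            echo "no script list recorded for $from -> $to" >&2
            return 1
            ;;
    esac
}
```

The output is one script name per line. Change to the upgrade directory before feeding each script to your database client; how you invoke that client depends on the metastore database and is described in the README.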

5. Migrate Hive Configuration

When you upgrade to a newer Hive version, a new hive-<version> folder is created, and the old configuration files are not automatically migrated to it. If you saved configuration files to a backup location, migrate your custom configuration settings to the configuration files in the new conf directory (/opt/mapr/hive/hive-<version>/conf/).

NOTE: When you upgrade to a more recent MapR package of an existing Hive version, a new hive-<version> folder is not created and existing configuration files should remain untouched by the upgrade process.
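To see which settings need to be migrated, it can help to compare the backed-up configuration directory with the new one before editing anything. The following is a minimal sketch: the function name compare_hive_conf is illustrative, and it assumes a diff that supports the -q option, as GNU and BSD diff do.

```shell
#!/bin/sh
# Report files that differ between a backed-up conf directory and the new one.
# Usage: compare_hive_conf <backup_conf_dir> <new_conf_dir>
# Exit status follows diff: 0 if identical, 1 if differences were found.
# The function name is illustrative; "diff -q" is assumed to be available.
compare_hive_conf() {
    backup_dir=$1
    new_dir=$2
    for d in "$backup_dir" "$new_dir"; do
        if [ ! -d "$d" ]; then
            echo "not a directory: $d" >&2
            return 2
        fi
    done
    # -r recurses into subdirectories; -q names the files that differ
    # or exist on only one side, without printing their contents
    diff -rq "$backup_dir" "$new_dir"
}
```

Each file reported is a candidate for review. Copy individual custom properties into the new files rather than overwriting them, because the new version may ship different defaults.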

6. Start Hive Services

Restart the Hive Metastore and Hiveserver2.
To start Hive Metastore using the maprcli:
  1. Make a list of nodes on which Hive Metastore is configured.
  2. Issue the maprcli node services command:
    maprcli node services -name hivemeta -action start -nodes <space delimited list of nodes>
To start Hiveserver2 using the maprcli:
  1. Make a list of nodes on which Hiveserver2 is configured.
  2. Issue the maprcli node services command:
    maprcli node services -name hs2 -action start -nodes <space delimited list of nodes> 
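The two maprcli commands above can be wrapped so that the metastore is started before Hiveserver2 and a failure stops the sequence. The following is a minimal sketch: the function name start_hive_services and the MAPRCLI override variable are illustrative assumptions, and the node names in the example are placeholders.

```shell
#!/bin/sh
# Start the Hive Metastore, then Hiveserver2, on the given nodes.
# Usage: start_hive_services "<metastore nodes>" "<hiveserver2 nodes>"
# Each argument is a space-delimited node list.
# MAPRCLI is an illustrative override; set MAPRCLI=echo for a dry run.
start_hive_services() {
    meta_nodes=$1
    hs2_nodes=$2
    maprcli_cmd=${MAPRCLI:-maprcli}
    # Node lists are left unquoted on purpose so that each node
    # is passed to maprcli as a separate argument
    $maprcli_cmd node services -name hivemeta -action start -nodes $meta_nodes || return 1
    $maprcli_cmd node services -name hs2 -action start -nodes $hs2_nodes
}
```

With MAPRCLI=echo set, calling start_hive_services "node1 node2" "node3" prints the two commands it would run instead of executing them, which is useful for checking the node lists first.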

7. Run configure.sh

If you installed the mapr-hivewebhcat package, run configure.sh with the -R option on the node where you installed the mapr-hivewebhcat package.
/opt/mapr/server/configure.sh -R
This step enables Warden to recognize the newly installed service.