Install Oozie

About this task

The following procedures use the operating system package managers to download and install Oozie from the MapR Repository. To install the packages manually or for information about configuring the repository, refer to Preparing Packages and Repositories.

Oozie's client/server architecture requires two packages, mapr-oozie and mapr-oozie-internal, on both the client node and the server node. Because mapr-oozie depends on mapr-oozie-internal, the package manager installs mapr-oozie-internal automatically when you install mapr-oozie.

Execute the following commands as root or using sudo on a MapR cluster:

Procedure

  1. Update the list of available packages:

    Ubuntu:

    apt-get update

    SUSE:

    zypper refresh

    RedHat/CentOS:

    yum clean all
  2. Install mapr-oozie. The package manager installs mapr-oozie-internal automatically as a dependency:
    RedHat/CentOS:
    yum install mapr-oozie
    SUSE:
    zypper install mapr-oozie
    Ubuntu:
    apt-get install mapr-oozie
  3. For non-secure clusters, add the following two properties inside the <configuration> element of /opt/mapr/hadoop/hadoop-2.x.x/etc/hadoop/core-site.xml:
    <property>
      <name>hadoop.proxyuser.mapr.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.mapr.groups</name>
      <value>*</value>
    </property> 
  4. Restart the Warden daemon to reinitialize core cluster services:
    sudo service mapr-warden restart
  5. Export the Oozie URL to your environment with the following command:
    export OOZIE_URL='http://<Oozie_node>:11000/oozie'
  6. Check Oozie’s status with the following command, which reports "System mode: NORMAL" when the server is healthy:
    /opt/mapr/oozie/oozie-<version>/bin/oozie admin -status
  7. If you are running Oozie jobs on YARN, perform the following steps:
    1. If high availability for the Resource Manager is configured, edit the job.properties file for each workflow and insert the following statement:
      JobTracker=maprfs:///
    2. If high availability for the Resource Manager is not configured, provide the address of the node running the active ResourceManager and the port used for ResourceManager client RPCs (port 8032). For each workflow, edit the job.properties file and insert the following statement:
      JobTracker=<ResourceManager_address>:8032
    3. Restart Oozie:
      maprcli node services -name oozie -action restart -nodes <space delimited list of nodes>
      NOTE: If high availability for the Resource Manager is not configured and the active ResourceManager fails, you must update job.properties with the address of the new active ResourceManager.
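
Because the JobTracker statement must be set in the job.properties file of every workflow, it can be convenient to script the edit. The following is a minimal sketch, not part of Oozie or MapR: set_jobtracker is a hypothetical helper name, and the sketch assumes that each job.properties file holds at most one JobTracker entry, which starts at the beginning of a line.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Oozie or MapR): set the JobTracker
# property in one workflow's job.properties file.
# Usage: set_jobtracker <path to job.properties> <value>
set_jobtracker() {
    file=$1
    value=$2
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    tmp=$(mktemp) || return 1
    # Keep every line except an existing JobTracker entry.
    # grep exits non-zero when no lines remain, which is acceptable here.
    grep -v '^JobTracker=' "$file" > "$tmp" || true
    # Append the new entry, then write the result back in place so the
    # original file keeps its ownership and permissions.
    printf 'JobTracker=%s\n' "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

With the function loaded in the current shell, set_jobtracker ./job.properties 'maprfs:///' covers the case where high availability is configured; otherwise, pass <ResourceManager_address>:8032 as the value. The helper only edits the file, so Oozie must still be restarted afterward.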