OpenStack Sahara (Ocata)
OpenStack is open-source software for creating private and public clouds. It provides a complete technology stack equivalent to those offered by major public cloud services, including management of virtual machines, storage, and networking. Sahara is the Hadoop data processing module within OpenStack. It serves users who want to deploy Hadoop clusters or run big data applications in a cloud environment.
The MapR Plugin for OpenStack Sahara handles installation and deployment so that administrators of OpenStack environments and end users can use Sahara to set up clusters that run the MapR Distribution for Hadoop. The plugin addresses two main use cases:
- On-demand provisioning of cloud-based MapR clusters
- Running individual MapReduce jobs in a MapR cluster
The MapR plugin performs the following four primary functions during cluster creation:
- MapR component deployment - The plugin manages the deployment of the required software to the target VMs.
- Service installation - The plugin installs MapR services according to the provided list of roles.
- Service configuration - The plugin combines default settings with user-provided settings.
- Service start - The plugin starts the appropriate services according to the specified roles.
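The service configuration step can be pictured as a merge of two layers of settings. The following is a minimal sketch only, not the plugin's actual code. It assumes that user-provided values take precedence over defaults, which this document does not state explicitly, and the service and option names are hypothetical.

```python
from typing import Any, Dict

Settings = Dict[str, Dict[str, Any]]


def merge_settings(defaults: Settings, user: Settings) -> Settings:
    """Return per-service settings with user values layered over defaults.

    Assumption: a user-provided value replaces the default for the same
    option. Neither input dictionary is modified.
    """
    # Copy each service's options so the defaults stay untouched.
    merged = {service: dict(options) for service, options in defaults.items()}
    for service, options in user.items():
        merged.setdefault(service, {}).update(options)
    return merged


# Hypothetical service and option names, for illustration only.
DEFAULTS = {"example-service": {"heap_size_mb": 1024, "log_level": "INFO"}}
USER = {"example-service": {"heap_size_mb": 2048}}

effective = merge_settings(DEFAULTS, USER)
print(effective)
```

Options that the user does not set keep their default values, so a cluster definition only needs to list the settings that differ from the defaults.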
The OpenStack Swift component provides object storage and integrates with OpenStack Sahara so that big data applications can read and write large-scale input and output data.
See the Sahara Installation Guide for details about installing OpenStack and Sahara. Those installation procedures are outside the scope of this document, which covers only the installation and use of the MapR plugin within an existing operational OpenStack/Sahara environment.
The MapR Plugin for Sahara (Mitaka release) is supported in the following environments:
- MapR Distribution for Hadoop (Community Edition): 5.2.0, 5.1.0 (deprecated)
- Supported operating systems: Ubuntu 14.04, CentOS 6.x, and CentOS 7
See also the Ecosystem Support Matrix for details about specific versions of ecosystem components that are supported in an OpenStack deployment of MapR.
The MapR Plugin for Sahara is not aware of MapR editions and licensing options. Every cluster you deploy in an OpenStack environment runs the Community Edition (M3) of MapR unless you manually apply a license to the cluster once it is up and running.
Plugin Installation and Administration
Administrators are responsible for installing and configuring the MapR Plugin for Sahara in an operational OpenStack environment, as well as importing machine images for MapR instances. OpenStack users then set up node group and cluster definitions so they can deploy clusters in the cloud and run jobs.
- OpenStack Administrator
  - Download the plugin and add it to the Sahara system.
  - Enable the plugin by editing the Sahara configuration.
  - Start Sahara with the new configuration in effect.
  - Add prebuilt images to OpenStack for specific operating systems and MapR versions. You can also build your own image by following the instructions in the README file for the plugin.
- OpenStack User
  - Define node group and cluster templates (and register specific images with the MapR plugin).
  - Deploy clusters based on templates.
  - Run jobs on deployed clusters.
The following sections explain these steps in detail. Many of these tasks are done via the OpenStack dashboard (UI).
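As a rough illustration of the administrator step "Enable the plugin by editing the Sahara configuration", the sketch below adds a plugin name to a comma-separated `plugins` option. It is not taken from this document. It assumes that the option lives in the `[DEFAULT]` section of the Sahara configuration file and that the plugin is registered under the name `mapr`, so verify both against your installation. The sample configuration text is hypothetical.

```python
import configparser
import io


def enable_plugin(conf_text: str, plugin: str = "mapr") -> str:
    """Return conf_text with `plugin` present in [DEFAULT] plugins.

    Assumptions: the option is named `plugins`, sits in [DEFAULT], and
    holds a comma-separated list. Note that configparser drops comments
    and lowercases option names when it rewrites the file.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)

    current = parser.get("DEFAULT", "plugins", fallback="")
    names = [name.strip() for name in current.split(",") if name.strip()]
    if plugin not in names:
        names.append(plugin)
    parser["DEFAULT"]["plugins"] = ",".join(names)

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical starting configuration, for illustration only.
SAMPLE = "[DEFAULT]\nplugins = vanilla,spark\n"

updated = enable_plugin(SAMPLE)
print(updated)
```

The function is idempotent, so running it against a configuration that already lists the plugin leaves the list unchanged. Because the rewrite discards comments, treat this as a way to reason about the change rather than as a tool to run against a production configuration file. Sahara must still be restarted for the new setting to take effect.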