About Release 7.6.1
This site contains documentation for HPE Ezmeral Data Fabric release 7.6.1, including installation, configuration, administration, and reference content, as well as content for the associated ecosystem components and drivers.

Managing Third-Party Libraries

Any third-party library that is required by a MapReduce program must be accessible to the data node that processes the application.

A data node is a node in the cluster that includes the NodeManager role. You can provide the third-party libraries when you submit the program, or you can install the third-party libraries on each node that processes the application.

Include the third-party libraries with each program

Including the third-party libraries with each program is the preferred method.

Perform one of the following operations to include the third-party jars when you submit the program:

- Package the third-party libraries with the MapReduce jar file. The benefit of this method is that neither the node from which you submit the program nor the node that runs the program is required to have the library files installed.
- Use the -libjars parameter to specify the third-party libraries on the command line. With this option, the library files are submitted to the data node along with the program. The benefit of this method is that the node that runs the program does not need to have the library files installed. However, the node that submits the program must have the library files installed.
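
As an illustration of the -libjars option, the following Python sketch assembles a `hadoop jar` command line. It is a minimal example under stated assumptions, not a prescribed procedure: the helper name `build_libjars_command`, the jar file names, and the driver class are hypothetical, and the sketch assumes that the driver parses Hadoop generic options (for example, by running through `ToolRunner`), because -libjars is only honored in that case. The libraries are passed as one comma-separated value, placed before the application's own arguments.

```python
import shlex


def build_libjars_command(app_jar, main_class, lib_jars, app_args):
    """Return the argv list for a `hadoop jar` submission that ships extra jars.

    All names passed in are illustrative; nothing here contacts a cluster.
    """
    if not lib_jars:
        raise ValueError("lib_jars must contain at least one jar path")
    return [
        "hadoop", "jar", app_jar, main_class,
        # Generic options such as -libjars go before the application arguments.
        # Multiple jars are joined with commas into a single value.
        "-libjars", ",".join(lib_jars),
        *app_args,
    ]


if __name__ == "__main__":
    cmd = build_libjars_command(
        "wordcount.jar",                       # hypothetical application jar
        "com.example.WordCount",               # hypothetical driver class
        ["/home/user/lib/first-library.jar",   # hypothetical third-party jars
         "/home/user/lib/second-library.jar"],
        ["/input", "/output"],                 # hypothetical job arguments
    )
    # Print a shell-quoted command that could be pasted into a terminal.
    print(shlex.join(cmd))
```

The jar paths refer to files on the submitting node, which matches the requirement above that the node that submits the program must have the library files installed.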

Install the third-party libraries on each node that runs the program

You can also install the third-party libraries on each data node. However, this method is generally not preferred because it can lead to conflicts between library versions or library files.

To install the third-party libraries on each data node, perform one of the following operations:

- Install the third-party libraries in the following directory on each NodeManager node: /opt/mapr/hadoop/hadoop-2.x/share/hadoop/common
- On each node with the NodeManager role, install the required third-party libraries, and then specify the locations of the libraries with the HADOOP_CLASSPATH environment variable in the env_override.sh file. The env_override.sh file is located in the following directory: /opt/mapr/conf. For more information about the file, see About env_override.sh.
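
For the HADOOP_CLASSPATH option, the following Python sketch generates the line that might be added to env_override.sh. It is a hedged example only: the helper name `hadoop_classpath_export` and the library directory are hypothetical, and the sketch assumes that env_override.sh accepts ordinary shell `export` statements. Classpath entries are joined with colons, and any value that HADOOP_CLASSPATH already holds is preserved.

```python
def hadoop_classpath_export(library_paths):
    """Build a shell `export` line that appends paths to HADOOP_CLASSPATH.

    The ${VAR:+...} expansion keeps an existing value and avoids a leading
    colon (an empty classpath entry) when the variable is unset or empty.
    """
    if not library_paths:
        raise ValueError("library_paths must contain at least one entry")
    for path in library_paths:
        if ":" in path:
            # A colon inside an entry would be read as a separator.
            raise ValueError(f"classpath entry must not contain ':': {path!r}")
    joined = ":".join(library_paths)
    return (
        'export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}'
        + joined
        + '"'
    )


if __name__ == "__main__":
    # Hypothetical install location for the third-party jars on each node.
    print(hadoop_classpath_export(["/opt/thirdparty/lib/*"]))
```

Because the same line must be present on every node with the NodeManager role, generating it once and distributing it can help keep the nodes consistent.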