Including the MapR-FS JAR in Applications

In general, applications should not bundle the MapR-FS JAR file. However, you can bundle the MapR-FS JAR file when an application meets certain requirements.

In many cases, nodes running applications with a bundled MapR-FS JAR file may run out of memory or shut down unexpectedly. These errors generally occur when there is a binary mismatch between the bundled JAR file and the version that the cluster expects.

Requirements

You can bundle the MapR-FS JAR (maprfs-<version>-mapr.jar) with applications that meet all of the following requirements:
  • The application communicates directly with MapR-FS or MapR-DB.
  • The application does not run as a MapReduce or YARN job/application on the cluster.
  • The application does not include MapR-FS JARs on the local machine in its classpath.
  • The application accesses a cluster that is not secure.
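
The classpath requirement above can be checked before deployment. The following is a minimal sketch, not a MapR-provided tool: the class name MaprfsClasspathCheck and the file-name pattern are assumptions based on the maprfs-<version>-mapr.jar naming shown above. It lists every classpath entry that looks like a MapR-FS JAR, so that more than one match, or a match outside the bundled copy, can be investigated.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class MaprfsClasspathCheck {
    // Assumed pattern: file names such as maprfs-5.1.0-mapr.jar.
    private static final Pattern MAPRFS_JAR = Pattern.compile("maprfs-\\d.*-mapr\\.jar");

    /** Returns every classpath entry whose file name looks like a MapR-FS JAR. */
    public static List<String> findMaprfsJars(String classpath, String separator) {
        List<String> matches = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (entry.isEmpty()) {
                continue;
            }
            // Compare only the file name, not the directory part of the entry.
            String name = new File(entry).getName();
            if (MAPRFS_JAR.matcher(name).matches()) {
                matches.add(entry);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Inspect the classpath of the running JVM.
        List<String> jars = findMaprfsJars(
                System.getProperty("java.class.path"), File.pathSeparator);
        System.out.println("MapR-FS JARs on classpath: " + jars);
    }
}
```

This sketch only inspects the java.class.path property of the running JVM; JARs loaded through custom class loaders are not detected.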

Using Maven to Include MapR-FS JAR as a Dependency

If you use Maven to bundle the MapR-FS JAR file with an application and you plan to run the application on a MapR cluster where a patch has been applied, ensure that you specify both a system scope and a local system path to the file.

For example, to bundle the MapR-FS 5.1 JAR file, the pom.xml file may include the following:
...
<dependency>
        <groupId>com.mapr.hadoop</groupId>
        <artifactId>maprfs</artifactId>
        <version>${mapr.core.version}</version>
        <scope>system</scope>
        <systemPath>/opt/mapr/lib/maprfs-5.1.0-mapr.jar</systemPath>
</dependency>
...

By default, Maven retrieves MapR JAR files from MapR's Maven repository at http://repository.mapr.com/maven/. This repository contains the JAR files associated with the GA packages for each MapR release, not the patched versions. Therefore, when a patch has been applied to the cluster, failure to specify a system scope may result in errors due to a binary mismatch between the MapR-FS JAR file used by the application and the one used by the cluster.

Known Issues

Nodes running applications with a bundled MapR-FS JAR file may run out of memory or shut down unexpectedly in the following scenarios:
  • The version of the MapR-FS JAR included in the application differs from the version that is available on the cluster.
    This can occur when a patch was applied to some, but not all, of the nodes in the cluster. It can also occur when Maven bundles the GA version of the JAR file but the cluster expects a newer, patched version.
  • Two versions of the JAR are available on the node.
    For MapReduce V1 or YARN applications, the TaskTracker or NodeManager nodes that run the tasks or containers store local copies of the dependencies included with the application. In this scenario, because both the cluster’s MapR-FS JAR and the version included in the application are available on the node, it is unknown which JAR will be used when processing the application.
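
When it is unclear which copy of the JAR is in use, one way to investigate is to ask the JVM where a class was loaded from. The following is a minimal sketch using the standard java.security.CodeSource API. The class name JarOrigin is hypothetical, and the default class name com.mapr.fs.MapRFileSystem is an assumption; pass the fully qualified name of any class from the MapR-FS JAR as the first argument instead.

```java
import java.security.CodeSource;

public class JarOrigin {
    /** Returns the location a class was loaded from, or null if the JVM reports none. */
    public static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap class loader have no code source.
        if (source == null || source.getLocation() == null) {
            return null;
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumed default class name; override it with the first argument.
        String className = args.length > 0 ? args[0] : "com.mapr.fs.MapRFileSystem";
        Class<?> type = Class.forName(className);
        System.out.println(className + " loaded from " + locationOf(type));
    }
}
```

Running this with the same classpath as the application shows which JAR supplied the class in that JVM. It does not predict which JAR a different node or container will pick.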