MapR File System
The MapR Data Platform provides a unified data solution for structured data (tables) and unstructured data (files).
MapR File System (MapR-FS) is a random read-write distributed file system that allows applications to concurrently read and write directly to disk. The Hadoop Distributed File System (HDFS), by contrast, supports only append-only writes and can read only from closed files. Because HDFS is layered over the existing Linux file system, it performs a greater number of input/output (I/O) operations, which decreases the cluster’s performance. MapR-FS also eliminates the NameNode, whose failure can bring down the cluster in other Hadoop distributions, and enables special features for data management and high availability.
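The difference between random read-write access and append-only writes can be illustrated with ordinary POSIX file calls. The following is a minimal sketch, not MapR-specific code: it assumes the cluster is exposed as a regular directory (for example through an NFS or POSIX client mount; the mount point is not shown and would be site-specific), and it uses a temporary local directory so that it runs anywhere. The function names `overwrite_in_place` and `demo` are hypothetical.

```python
import os
import tempfile


def overwrite_in_place(path: str, offset: int, data: bytes) -> None:
    """Overwrite bytes at `offset` without rewriting the rest of the file.

    This is the kind of operation a random read-write file system allows
    and an append-only file system does not.
    """
    with open(path, "r+b") as f:  # open for update, do not truncate
        f.seek(offset)
        f.write(data)


def demo(directory: str) -> bytes:
    """Create a small file, patch its middle, and return the new contents."""
    path = os.path.join(directory, "records.bin")
    with open(path, "wb") as f:
        f.write(b"AAAABBBBCCCC")
    overwrite_in_place(path, 4, b"XXXX")  # replace only the middle 4 bytes
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # A temporary directory stands in for a mounted cluster path (assumption).
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(tmp))
```

On an append-only file system, the same update would require writing a new copy of the file or appending a correction record, because existing bytes cannot be modified in place.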
The MapR-FS storage system is written in C/C++, which eliminates the performance impact of Java garbage collection, and its architecture prevents locking contention.
The following table describes the main MapR-FS concepts:
|Storage pools|A group of disks to which MapR-FS writes data.|
|Containers|An abstract entity that stores files and directories in MapR-FS. A container always belongs to exactly one volume and can hold namespace information, file chunks, or table chunks for that volume.|
|CLDB|A service that tracks the location of every container.|
|Volumes|A management entity that stores and organizes containers. Used to distribute metadata, set permissions on data in the cluster, and back up data. A volume consists of a single name container and a number of data containers.|
|Direct Access NFS|Enables applications to read data from and write data directly into the cluster.|
|POSIX Clients|The loopbacknfs and FUSE-based POSIX clients connect to one or more MapR clusters and allow app servers, web servers, and applications to write data directly and securely to the MapR cluster.|
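The relationships among volumes, containers, and the CLDB can be sketched as a toy data model. This is only an illustration of the definitions in the table, not the MapR implementation or API: the class names, the `build_volume` helper, the container IDs, and the node names are all hypothetical, and mapping a container to a list of nodes is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Container:
    container_id: int
    volume: str  # a container belongs to exactly one volume
    kind: str    # "name" (namespace information) or "data" (file/table chunks)


@dataclass
class Volume:
    name: str
    containers: List[Container] = field(default_factory=list)


class ContainerLocationDB:
    """Toy stand-in for the CLDB: maps container IDs to node names."""

    def __init__(self) -> None:
        self._locations: Dict[int, List[str]] = {}

    def register(self, container: Container, nodes: List[str]) -> None:
        self._locations[container.container_id] = list(nodes)

    def locate(self, container_id: int) -> List[str]:
        return self._locations[container_id]


def build_volume(name: str, data_container_count: int, first_id: int = 1) -> Volume:
    """Build a volume with one name container and N data containers."""
    vol = Volume(name)
    vol.containers.append(Container(first_id, name, "name"))
    for i in range(data_container_count):
        vol.containers.append(Container(first_id + 1 + i, name, "data"))
    return vol


if __name__ == "__main__":
    cldb = ContainerLocationDB()
    vol = build_volume("projects", data_container_count=2)
    for c in vol.containers:
        cldb.register(c, ["node-a", "node-b"])  # hypothetical node names
    print(cldb.locate(vol.containers[0].container_id))
```

In this sketch, a client that needs data from a volume would first ask the location service where a container lives and then contact that node directly, which mirrors the CLDB's role of tracking the location of every container.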