Security Architecture
- Communication between the nodes in the cluster is encrypted:
- HBase traffic is secured with Kerberos.
- NFS traffic between the server and cluster, traffic within MapR-FS, and CLDB traffic are encrypted with secure MapR RPCs.
- Traffic between JobClients, TaskTrackers, and JobTrackers is secured with MAPRSASL, an implementation of the Simple Authentication and Security Layer framework.
- Support for Kerberos user authentication.
- Support for Kerberos encryption for secure communication to open source components that require it.
- Support for the Simple and Protected GSSAPI Negotiation Mechanism (SPNEGO) used with the web UI front ends of some cluster components.
Authentication Architecture: The maprlogin Utility
- Explicit User Authentication
- When you explicitly generate a ticket, you have the option to authenticate with your
username and password or authenticate with Kerberos:
- The user invokes the maprlogin utility, which connects to a CLDB node in the
cluster using HTTPS. The hostname for the CLDB node is specified in the
mapr-clusters.conf file.
- When using username/password authentication, the node authenticates using PAM modules with the Java Authentication and Authorization Service (JAAS). The JAAS configuration is specified in the mapr.login.conf file. The system can use any registry that has a PAM module available.
- When using Kerberos to authenticate, the CLDB node verifies the Kerberos principal with the keytab file.
- After authenticating, the CLDB node uses the standard UNIX APIs getpwnam_r and getgrouplist, which are controlled by the /etc/nsswitch.conf file, to determine the user's user ID and group ID.
- The CLDB node generates a ticket and returns it to the client machine.
- The server validates that the ticket is properly encrypted, to verify that the ticket was issued by the cluster's CLDB.
- The server also verifies that the ticket has not expired or been blacklisted.
- The server checks the ticket for the presence of a privileged identity such as the mapr user. Privileged identities have impersonation functionality enabled.
- The ticket's user and group information are used for authorization to the cluster, unless impersonation is in effect.
- Implicit Authentication with Kerberos
- On clusters that use Kerberos for authentication, a MapR ticket is implicitly obtained for a user who runs a MapR command without first using the maprlogin utility. The implicit authentication flow for the maprlogin utility first checks for a valid ticket for the user, and uses that ticket if it exists. If a ticket does not exist, the maprlogin utility checks whether Kerberos is enabled for the cluster, then checks for an existing valid Kerberos identity. When the maprlogin utility finds a valid Kerberos identity, it generates a ticket for that Kerberos identity.
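The implicit authentication flow described above can be sketched as a short decision function. This is a minimal illustration only, not the maprlogin implementation: the `Ticket` class, the `resolve_ticket` function, and the `lifetime` parameter are hypothetical names introduced here, and the Kerberos identity is passed in as a plain string rather than read from a credential cache.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ticket:
    """Hypothetical stand-in for a MapR ticket (not the real ticket format)."""
    user: str
    expires_at: float  # seconds since the epoch

    def is_valid(self, now: float) -> bool:
        return now < self.expires_at


def resolve_ticket(
    existing: Optional[Ticket],
    kerberos_enabled: bool,
    kerberos_principal: Optional[str],
    now: float,
    lifetime: float = 3600.0,  # assumed lifetime, for illustration only
) -> Optional[Ticket]:
    """Follow the implicit flow: reuse a valid ticket, else fall back to Kerberos."""
    # Step 1: an existing valid ticket is used as-is.
    if existing is not None and existing.is_valid(now):
        return existing
    # Step 2: without a ticket, the cluster must have Kerberos enabled.
    if not kerberos_enabled:
        return None
    # Step 3: a valid Kerberos identity must already exist.
    if kerberos_principal is None:
        return None
    # Step 4: generate a ticket for that Kerberos identity.
    return Ticket(user=kerberos_principal, expires_at=now + lifetime)
```

A return value of `None` stands for the case where no ticket can be obtained implicitly and the user would have to authenticate explicitly.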
Authorization Architecture: ACLs and ACEs
An Access Control List (ACL) is a list of users or groups. Each user or group in the list is paired with a defined set of permissions that limit the actions that the user or group can perform on the object secured by the ACL. In MapR, the objects secured by ACLs are the job queue, volumes, and the cluster itself.
A job queue ACL controls who can submit jobs to a queue, kill jobs, or modify their priority. A volume-level ACL controls which users and groups have access to that volume, and what actions they may perform, such as mirroring the volume, altering the volume properties, dumping or backing up the volume, or deleting the volume.
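As a rough illustration of how an ACL pairs users and groups with permitted actions, the sketch below models a job queue ACL in a few lines. The `Acl` class, the `user:`/`group:` key prefixes, and the permission names (`submit`, `kill`, `set_priority`) are assumptions made for this example and do not reflect MapR's actual ACL syntax or storage.

```python
from typing import Dict, Iterable, Set


class Acl:
    """Hypothetical ACL: maps each user or group to a set of allowed actions."""

    def __init__(self) -> None:
        self._entries: Dict[str, Set[str]] = {}

    def grant(self, principal: str, *permissions: str) -> None:
        # principal is "user:<name>" or "group:<name>" (illustrative convention)
        self._entries.setdefault(principal, set()).update(permissions)

    def is_allowed(self, user: str, groups: Iterable[str], permission: str) -> bool:
        # Access is granted if the user, or any of the user's groups, holds the permission.
        principals = [f"user:{user}"] + [f"group:{g}" for g in groups]
        return any(permission in self._entries.get(p, set()) for p in principals)


# Example job queue ACL with made-up users and groups.
queue_acl = Acl()
queue_acl.grant("user:alice", "submit", "kill")
queue_acl.grant("group:ops", "submit", "set_priority")
```

Here a user with no matching entry, directly or through a group, is denied every action, which mirrors the idea that the ACL limits what each listed user or group can do on the secured object.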
An Access Control Expression (ACE) is a combination of user, group, and role definitions. A role is a property of a user or group that defines a set of behaviors that the user or group performs regularly. You can use roles to implement your own custom authorization rules. ACEs are used to secure MapR tables that use native storage.
Encryption Architecture: Wire-Level Security
MapR uses a mix of approaches to secure the core work of the cluster and the Hadoop components installed on the cluster.
- The FileServer, JobTracker, and TaskTracker use MapR tickets to secure their remote procedure calls (RPCs) with the native MapR security layer. Clients can use the maprlogin utility to obtain MapR tickets. Web UI elements of these components use password security by default, but can also be configured to use SPNEGO.
- HiveServer2, Flume, and Oozie use MapR tickets by default, but can be configured to use Kerberos.
- HBase and the Hive metaserver require Kerberos for secure communications.
- The MCS Web UI is secured with passwords. The MCS Web UI does not support SPNEGO for users, but supports both password and SPNEGO security for REST calls.
Servers must use matching security approaches. When an Oozie server, which supports MapR Tickets and Kerberos, connects to HBase, which supports only Kerberos, Oozie must use Kerberos for outbound security. When servers have both MapR and Kerberos credentials, these credentials must map to the same User ID to prevent ambiguity problems.
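The matching rule can be pictured as picking the first mechanism in the client's preference order that the server also supports. The sketch below is an illustration under that assumption; `choose_mechanism` and the mechanism labels `maprticket` and `kerberos` are names chosen for this example, not configuration values from any component.

```python
from typing import Sequence


def choose_mechanism(client_prefs: Sequence[str], server_supported: Sequence[str]) -> str:
    """Return the first client-preferred mechanism that the server also supports."""
    for mechanism in client_prefs:
        if mechanism in server_supported:
            return mechanism
    raise ValueError("client and server share no security mechanism")


# Oozie prefers MapR tickets but can use Kerberos; HBase accepts only Kerberos.
oozie_prefs = ["maprticket", "kerberos"]
hbase_supported = ["kerberos"]
outbound = choose_mechanism(oozie_prefs, hbase_supported)
```

With these inputs the Oozie server ends up using Kerberos for its outbound connection to HBase, as described above.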
Security Protocols Used by MapR
Protocol | Encryption | Authentication |
---|---|---|
MapR RPC | AES/GCM | maprticket |
Hadoop RPC and MAPRSASL | MAPRSASL | maprticket |
Hadoop RPC and Kerberos | Kerberos | Kerberos ticket |
Generic HTTP Handler | HTTPS using SSL/TLS | maprticket, username and password, or Kerberos SPNEGO |
Security Protocols Listed by Component
Component | Protocols Used |
---|---|
CLDB | Outbound: MapR RPC. Inbound: Custom HTTP handler for the maprlogin utility, which supports authentication through username and password or Kerberos. |
MapR-FS | MapR RPC |
Task and Job Trackers | Hadoop RPC and MAPRSASL. Traffic to the MapR file system uses MapR RPC. |
HBase | Inbound: Hadoop RPC and Kerberos. Outbound: Hadoop RPC and Kerberos. Traffic to the MapR file system uses MapR RPC. |
Oozie | Inbound: Generic HTTP Handler by default, configurable for HTTPS using SSL/TLS. Outbound: Hadoop RPC and MAPRSASL by default, configurable to replace MAPRSASL with Kerberos. Traffic to the MapR file system uses MapR RPC. |
NFS | Inbound: Unencrypted NFS protocol. Outbound: MapR RPC. |
Flume | Inbound: None. Outbound: Hadoop RPC and MAPRSASL by default, configurable to replace MAPRSASL with Kerberos. Traffic to the MapR file system uses MapR RPC. |
HiveServer2 | Inbound: Thrift and Kerberos, or username/password over SSL. Outbound: Hadoop RPC and MAPRSASL by default, configurable to replace MAPRSASL with Kerberos. Traffic to the MapR file system uses MapR RPC. |
Hive Metaserver | Inbound: Hadoop RPC and Kerberos. Traffic to the MapR file system uses MapR RPC. |
MCS | Inbound: User traffic is secured with HTTPS using SSL/TLS and username/password. REST traffic is secured with HTTPS using SSL/TLS with username/password and SPNEGO. |
Web UIs | Generic HTTP handler. Single sign-on (SSO) is supported by shared cookies. |