Issues resolved in 5.2.2 release

The following customer-reported issues, observed in version 5.2.0 or 5.2.1, are resolved in version 5.2.2.

Product Number Description Resolution
CLDB 23685 When an SP went offline, the change was not immediately reflected in the "Number of Storage Pools Offline" metric because CLDB took about 60 minutes to declare an SP offline. With this fix, the "Number of Storage Pools Offline" metric is updated within 2 minutes of a storage pool stopping heartbeats. The newly introduced "Number of Storage Pools To Rereplicate" metric is updated only after an hour, reflecting the CLDB view of offline SPs.
CLDB 24885 In rare cases, querying for information about the nodes in the cluster, along with removing nodes in the cluster, caused the CLDB to shutdown or failover. Concurrent processing of the above operations is now handled correctly without causing the CLDB to fail.
CLDB 25800 Restoring the mapr.cldb.internal volume from a dump failed. With this fix, CLDB restore from a volume dump is successful.
CLDB 26335 When a volume was deleted, the associated snapshots were also deleted, but the snapcids in CLDB were not deleted immediately. With this fix, CLDB purges all snapshots when a volume is deleted.
CLDB 26705 If the dump replicationmanagerqueueinfo command was run while CLDB was checking for under-replicated containers, CLDB would sometimes crash (resulting in a failover) because of a race condition. With this fix, CLDB no longer crashes when the dump replicationmanagerqueueinfo command is run.
CLDB 27298 CLDB crashed with an NPE when the fileserver reported an empty feature list to CLDB. With this fix, CLDB no longer crashes when a fileserver reports an empty feature list.
CLDB 27475 Sometimes CLDB marked a container invalid immediately after asking it to become master. With this fix, CLDB no longer marks a container invalid after asking it to become master.
CLDB 28002 Sometimes, CLDB threw an exception and failed over during a schedule update because of a race condition. With this fix, CLDB no longer fails over during a schedule update operation.
Configuration/FS 26558 NodeManager failed to start correctly with the error "true: integer expression expected" due to an invalid string-comparison operator in createTTVolume.sh. With this fix, createTTVolume.sh uses the correct string-comparison operator and NodeManager starts correctly.
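The failure mode is the classic test(1) operator mix-up. The sketch below is purely illustrative (the function and variable names are hypothetical, not taken from createTTVolume.sh): testing a string with the integer operator -eq produces "integer expression expected", while = performs the intended string comparison.

```shell
# Illustrative sketch only; names are hypothetical, not from createTTVolume.sh.

# Broken form: '-eq' is an *integer* operator, so testing the string "true"
# fails with "integer expression expected":
#   [ "$enabled" -eq "true" ] && start_volume_setup

# Correct form: '=' performs a string comparison.
is_enabled() {
  [ "$1" = "true" ]
}

if is_enabled "true"; then
  echo "starting NodeManager volume setup"
fi
```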

DB 26714 MapR-DB returned expired rows when retrieving from the value cache. With this fix, expired rows are no longer returned.
DB 27018 Jobs performing put operations and locking keys could get stuck. With this fix, put operations work properly.
DB 27300 During a bulk load into a MapR-DB table, when at least one snapshot existed of the volume where the table resides, the bulk load could hang indefinitely. With this fix, the bulk load to the MapR-DB table completes. The load may be delayed up to two minutes, but it recovers automatically.
DB 28209 A GET call on a binary table copy returns rows that had been deleted from the copy of the table. GETs against binary table copies work correctly and do not return rows that were previously deleted.
DB 28511 The MFS log showed continuous high MFS memory usage alarms and MFS restarted. With this fix, the system runs normally and high MFS memory usage alarms are not raised.
DB-JSON 27705 Multi-threaded applications doing high-frequency inserts to JSON tables using the OJAI API could periodically hang or dump core in mapr::fs::DbClntPutBuffer::shouldFlush, and needed to be killed. With 5.2.2, inserts under this type of workload no longer cause client applications to crash or hang.
DB-JSON 27878 Performing an _id projection with a non-existing id could result in MapR-DB ceasing to work. Non-existing _id projection will no longer cause MapR-DB failures.
DB-JSON 28378 For tables created with a non-default columnFamily on an array field path, find and findById with a condition on an array element did not return the correct result. With this fix, find and findById work correctly when a non-default column family is on an array field path.
Fileclient 26781 The hadoop mfs -rmr command caused a crash if parallel delete/rmdir commands were being executed while the rmr command was running. With this fix, the command does not cause a crash.
FileClient 27757 YARN distributed shell applications failed when an RM failover happened. With this fix, RM failover no longer causes YARN distributed shell applications to fail.
FileClient 27940 User credentials were not set correctly in the read and adviseFile APIs from Java client resulting in Drill queries failing intermittently with permission denied errors in the logs. With this fix, user credentials will be set correctly in the read and adviseFile APIs from Java client.
Fileserver 24898 When bringing an SP online, MFS crashed because of a memory double-free issue. With this fix, MFS will not crash when bringing an SP online.
FileServer 25538 In case of simultaneous failures on multiple nodes, in some very rare circumstances, concurrent setattr operations resulted in inconsistent attributes being propagated. With this fix, the race condition is properly handled.
FileServer 27223 CLDB crashed every 4-6 hours with a core dump. With this fix, the system behaves correctly.
FileServer 27440 The chown command failed on immutable directories if groups were not specified with the command, even when the command was run by the owner of the directory. With this fix, the chown command completes successfully when run by the owner, with or without the groups information.
FS-Fuse 24595 The FUSE client stored libMapRClient in /tmp. With this fix, the location for MapR libraries is configurable in the fuse.conf file. See "Configuring the MapR FUSE-Based POSIX Client".
FS-Fuse 26570 The gethostbyname() call was returning h_addr_list as NULL causing MFS to crash. In this patch, the obsolete gethostbyname API has been replaced with the getaddrinfo API.
FS-Fuse 26757 If there was data mutation (but not the length) resulting in the file size being the same, but mtime was different, the FUSE kernel was consuming data from page cache instead of initiating a new READ operation. With this fix, on kernels greater than version 3.6, the metadata change will invalidate the kernel page cache and trigger the POSIX client to initiate a new READ operation.
FS-Fuse 26961 The FUSE-based POSIX client crashed when a file was renamed with the sticky bit set on the parent directory. With this fix, the FUSE-based POSIX client allows file renames when the sticky bit is set on the parent directory.
FS-Fuse 26991 Read requests beyond the file size from the POSIX client were being sent to an incorrect file descriptor. With this fix, read requests beyond the file size no longer go to an incorrect file descriptor.
FS:Mirror2RW 27605 A request to stop a mirror failed when the source cluster was down. With this fix, the request to stop a mirror works correctly.
FS:resync 26092 and 26096 The snapshots created for resync (or mirroring) operations were not getting deleted after resync. With this fix, snapshots created for a resync operation are deleted after the resync operation.
Installer:Stanzas IN-315 Ubuntu upgrade using Stanzas failed from 5.1 to 5.2.0 or 5.2.1. With this fix, Ubuntu upgrades from 5.1 to 5.2.x work correctly.
MapR Streams 26966 Unable to delete a connector when using Kafka Connect for MapR Streams in distributed mode. The Kafka Connect API indicated that the connector had been deleted, but the connector was still listed as an active connector. With this fix, connectors can be deleted: DELETE /connectors/(string:name)/ executes correctly.
MapR Streams 25747 Distributed mode was not available in version 2.0.1-1611 of Kafka Connect for MapR Streams. Distributed mode is available for MEP 2.0.1 and MEP 3.0.0.
Multi-MFS 22964 Multi-MFS allowed the disk balancer to place multiple replicas of one container on the same node, resulting in the loss of multiple copies during a node failure. Multiple replicas of one container are not allowed on the same node, providing better fault tolerance during node failures.
RPC 26718 On a multi-NIC cluster, resync operations sometimes failed with intermittent errors due to multiple RPC bindings from the same host. With this fix, resync no longer fails on a multi-NIC cluster.
Yarn 26839 Network issues could cause the ResourceManager (RM) web UI to hang until it was restarted. With this fix, network issues no longer cause the RM web UI to hang.
Yarn 26898 The yarn queue -status <queuename> command does not show information about the queue labels because certain Apache properties for retrieving the information are not supported. With this fix, the MapR "Label" and "Label Policy" properties are provided as outputs to the yarn queue -status <queuename> command in lieu of the unsupported Apache properties.

Issues resolved in 5.2.1 release

The following customer-reported issues, observed in version 5.2.0, are resolved in version 5.2.1.

Product Number Description Resolution
AWS SDK jar 24566 An older version of the aws-sdk jar was built with MapR. With this fix, MapR upgraded the aws-sdk jar from version 1.7.4 to 1.7.15.
Build 24992 Installing a MapR patch caused jar files to be removed from under the drill/drill-1.4.0/jars/ directory. With this fix, jar files are no longer incorrectly removed.
CLDB 14105 When nodes attempted to register with duplicate IDs, CLDB did not register the nodes or log meaningful error messages. With this fix, when nodes attempt to register with duplicate IDs, CLDB logs appropriate error messages.
CLDB 24413 CLDB crashed when the volume replication factor was greater than 3. With this fix, CLDB will not crash when the volume replication factor is greater than 3.
CLDB 24647 On a node with multiple host IDs, CLDB crashed and failed over to a new CLDB when a stale host ID was removed. With this fix, CLDB will not crash and fail over when a stale host ID is removed.
CLDB 24651 CLDB threw an exception and failed over when the snapshots list was iterated over while snapshots were being created. With this fix, CLDB will no longer fail over when snapshots list is iterated over while new snapshots are being created.
CLDB 24662 Intermittently, CLDB was shutting down because of race between initialization and use of license. With this fix, the license will be completely initialized before being used.
CLDB 24770 Under high load, sometimes CLDB would be caught up in a deadlock when updating volume info and volume snapshot count simultaneously. With this fix, there will no longer be a deadlock when updating volume info and volume snapshot count.
CLDB 25708 After a rolling upgrade of certain nodes to 4.1 or later, operations from these nodes to nodes running 4.0.2 and prior versions of MapR were getting stalled because MapR 4.0.2 and older versions did not process the new RPC introduced with MapR 4.1. With this fix, operations on nodes running MapR 4.0.2 will not be stalled.
CLDB 26214 During rolling upgrade, if slave CLDB is upgraded before master CLDB, the slave CLDB may crash when accessing new KvStore tables. With this fix, slave CLDB will not crash on reading new tables, even if they are non-existent.
CLDB 26335 When snapshots were deleted as part of volume remove, CLDB tables, which store snapshot info, were not purged at a fast rate. As a result of this, the cid 1 container grew in size gradually. With this fix, snapshot tables will now be purged properly when volumes are removed and the size of cid 1 container will not grow.
DB 24745 An assertion failure occurs in MapR-FS due to zero (0) length field names in OJAI documents. With this fix, the assert failure will no longer occur.
DB 24807 The run time of MapR tasks with counters is slow for file output commits. Because MapR-DB uses time-based trigger for bucket flushing, any unused buckets for 5 mins are flushed. These unused buckets are flushed every 2 sec in batches of 12. If there are a lot of buckets, regardless of size, that are unused for that period of time, the load caused by the flushing impacts performance. The time-based bucket flush was disabled, preventing slow performance.
DB 25241 In MapR-DB, when using the HBase Java FuzzyRowFilter filter, the wrong result was returned. This occurred because the mask preprocessing converted 1 to 2 and 0 to -1. With this fix, the correct results are returned.
DB 25333 An exception occurs when a JSON document is re-inserted into the same row after a table’s time-to-live has expired. With this fix, the exception will no longer occur and re-insertion will complete successfully.
DB 25401 The cumulative cost becomes a negative value when a MapR-DB table has more than 2147483647 rows. With this fix, the return type for the Java getNumRows() API is changed to long and the correct value is preserved.
DB-JSON 26338 When using an "In" condition for QueryCondition API on only the _id field, the last record is omitted. With this fix, all records are returned.
DB-Marlin 24408 When running multiple producers as separate threads within a process, with a very small value for buffer.memory (say 1KB), some producers can stall. This is due to a lack of buffer memory. With this fix, the default value for minimum buffer memory is increased to 10kB.
FileClient 24053 The client crashed if an error occurred during initialization. With this fix, the client will not crash if there is an error during initialization.
FileClient 25471 The readdir operation returned incorrect entries when the child entries were volumes, because of an issue with volume attributes on the client side. With this fix, volume attributes are set correctly for lookup and readdir operations.
MapR-FS 12856 When the hadoop fs -rmr command is run, it reads the entire directory contents into memory before starting to delete anything, resulting in an Out of Memory error. This fix includes a new hadoop mfs -rmr path command that:
  • Does not build the entire readdir file list in memory; once 1MB of readdir data is reached, the command unlinks and removes those directories.
  • Does not fetch the attributes of the entries in readdir.
MapR-FS 20644 Sometimes, when mirroring a large number of containers, the volume mirror thread crashed, resulting in a CLDB failover. With this fix, the mirroring process is resilient to a large number of containers.
MapR-FS 22044 The CLDB logs were growing to a large size with stdout and stderr messages when a user's ticket expired. With this fix, the CLDB logs will not grow to a large size with stdout and stderr messages when a user's ticket expires because the log level of messages related to ticket expiration has now been changed to Debug.
MapR-FS 23652 The POSIX loopbacknfs client did not automatically refresh renewed service tickets. With this fix, the POSIX loopbacknfs client will:
  • Automatically use the renewed service ticket without requiring a restart if the ticket is replaced before expiration (ticket expiry time + grace period of 55 minutes). If the ticket is replaced after expiration (which is ticket expiry time + grace period of 55 minutes), the POSIX loopbacknfs client will not refresh the ticket as the mount will become stale.
  • Allow impersonation if a service ticket is replaced before ticket expiration (which is ticket expiry time + grace period of 55 minutes) with a servicewithimpersonation ticket.
  • Honor all changes in user/group IDs of the renewed ticket.
MapR-FS 23975 In version 5.1, MFS failed to start on some Docker containers because it tried to determine the number of NUMA nodes from /sys/devices/system/node. With this fix, MFS works on Docker containers.
MapR-FS 24022 Mirroring of a volume on a container which does not have a master container caused the mirror thread to hang. With this fix, mirroring will not hang when the container associated with the volume has no master.
MapR-FS 24139 If limit spread was enabled and the nodes were more than 85% full, CLDB did not allocate containers for IOs on non-local volumes. With this fix, CLDB will now allocate new containers to ensure that the IO does not fail.
MapR-FS 24155 Disk setup was timing out if running trim on flash drives took some time. With this fix, disk setup will complete successfully and the warning message (“Starting Trim of SSD drives, it may take a long time to complete”) is entered in the log file.
MapR-FS 24159 The mtime was updated whenever a hard link was created. Also, when a hard link was created from the FUSE mount point, although the ctime was updated, the update timestamp only showed the minutes and seconds and not the nanoseconds. With this fix, mtime will not change on the hard link and when a hard link is created from the FUSE mount point, the timestamp for ctime will include nanoseconds.
MapR-FS 24249 When running map/reduce jobs with older versions of the MapR classes, a system hang or other issues occurred because the older classes linked to the native library installed on cluster nodes that were updated to a newer MapR version With this fix, the new fs.mapr.bailout.on.library.mismatch parameter detects mismatched libraries, fails the map/reduce job, and logs an error message. The parameter is enabled by default. You can disable the parameter on all the TaskTracker nodes and resubmit the job for the task to continue to run. To disable the parameter, you must set it to false in the core-site.xml file.
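As described above, disabling the mismatch check means setting the parameter to false in core-site.xml on the TaskTracker nodes. A minimal fragment might look like this (sketch only; it goes inside the existing <configuration> element):

```xml
<!-- core-site.xml (TaskTracker nodes): disable the library-mismatch check -->
<property>
  <name>fs.mapr.bailout.on.library.mismatch</name>
  <value>false</value>
</property>
```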
MapR-FS 24352 Mirror synchronization is not optimized. In this patch, mirror synchronization has been optimized for changes in a small percentage of the inodes. During mirror resync operation, the destination will send the recent version number from the last mirror resync operation. While scanning inodes to identify the inodes that have changed since the last resync operation, MFS will now compare the version number sent by the destination with the allocation group, which keeps track of all the inodes. If the allocation group version is:
  • Higher than the last resync version, then MFS will check for the changed inodes in the allocation group.
  • Less than or equal to the last resync version, MFS will not read all the inodes in the allocation group because the allocation group has not changed since the last resync operation.
MapR-FS 24585 Excessive logging in CLDB audit caused cldbaudit.log file to grow to large sizes. With this fix, to reduce the size of cldbaudit.log file, the queries to CLDB for ZK string will no longer be logged for auditing.
MapR-FS 24618 Remote mirror volumes could not be created on secure clusters using MCS even when the appropriate tickets were present. With this fix, remote mirror volumes can now be created on secure clusters using MCS.
MapR-FS 24630 Under some conditions, using the 'ls' command with --full-time option produced incorrect results that showed as a negative number. With this fix, the correct timestamp is supplied.
MapR-FS 24660 MFS crashed because the maximum number of slots for backgrounded delete operations was not adequate. The incoming client operations reserving these slots were hanging and causing MFS to crash. With this fix, MFS will not crash as the number of slots for background operations has been increased.
MapR-FS 24712 During container resynchronization, the same scratch space was being reused by internal parallel operations resulting in corruption. With this fix, internal parallel operations will use separate scratch spaces.
MapR-FS 24846 If the topology of a node changed, after a CLDB failover, the list of nodes under a topology could not be determined as the new non-leaf topologies were not being updated. With this fix, the inner nodes of topology graph will be updated correctly and the list of nodes under an inner (non-leaf) topology will be determined correctly.
MapR-FS 24915 Running the expandaudit utility on volumes can result in very large (more than 1GB) audit log files due to incorrect GETATTR (get attributes) cache handling. With this fix, the expandaudit utility has been updated so that it will not perform subsequent GETATTR calls if the original call to the same file identifier failed.
MapR-FS 24965 On large clusters, sometimes the bind failed with the message indicating unavailability of port when running MR jobs, specifically reducer tasks. With this fix, the new fs.mapr.bind.retries configuration parameter in core-site.xml file, if set to true, will retry to bind during client initialization for 5 minutes before failing. By default, the fs.mapr.bind.retries configuration parameter is set to false.
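To opt in to the retry behavior described above, a core-site.xml fragment would look like this (sketch only; it goes inside the existing <configuration> element):

```xml
<!-- core-site.xml: retry client bind for 5 minutes before failing
     (the parameter defaults to false) -->
<property>
  <name>fs.mapr.bind.retries</name>
  <value>true</value>
</property>
```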
MapR-FS 24971 When the mirroring operation started after a CLDB failover, sometimes it sent requests to a slave CLDB where data was stale, resulting in the mirroring operation hanging. If the CLDB failover happened again during this time, the new CLDB master discarded data resynchronized by the old mirroring operation but marked the mirroring operation as successful. This resulted in a data mismatch between source and destination. With this fix, mirroring requests are sent to the master CLDB node only.
MapR-FS 25041 Whenever a newly added node was made the master of the name container, MFS crashed while deleting files in the background. With this fix, MFS will not crash when a newly added node is made the master of the name container.
MapR-FS 25184 If limit spread was enabled and the nodes were more than 85% full, CLDB did not allocate containers for IOs on local volumes. With this fix, CLDB will now allocate new containers to ensure that the IO does not fail.
MapR-FS 25290 In a secure environment, while writes were in progress, num_groups got corrupted and caused the FUSE process to crash. With this fix, the FUSE process will not crash while writes are in progress.
MapR-FS 25308 MFS crashed when mirroring a mirror volume that was promoted to a read/write volume and edited, and then reverted to a mirror volume. With this fix, MFS will not crash when resynchronizing a mirror volume that was promoted to a read/write volume and edited, and then reverted to a mirror volume.
MapR-FS 25337 When too many files were open, writes through FUSE were failing with EAGAIN messages. With this fix:
  • The limit for open files is 64k.
  • If the number of open files exceeds the limit, an ENFILE message (rather than EAGAIN) is logged.
  • If a request is stuck and/or failing, an error is logged periodically.
MapR-FS 25426 The server was rejecting encrypted writes as the expected length was not matching the RPC data length and this caused the server to crash. With this fix, the server will no longer crash as the expected length will always match the RPC data length for encrypted writes.
MapR-FS 25590 Sometimes the SP to Fileserver map became inconsistent across different kvstore tables due to a race condition, which caused the container lookup from slave CLDB to fail. With this fix, kvstore tables will be made consistent if they are inconsistent.
MapR-FS 25775 While uncaching was in progress, MFS writes were taking a long time. With this fix, because of better uncaching algorithm (which utilizes CPU efficiently), there will be an improvement in the overall speed of MFS (including writes) while uncaching is in progress.
MapR-FS 25829 The libMapRClient library required JVM to be installed on the client machine, which is not required by C and C++ programs. With this fix, libMapRClient library will no longer need JVM to be installed on the client machine for C and C++ programs.
MapR-FS 25848 After a rolling upgrade to 5.2 of namespace container nodes, ACE information was getting set on certain operations incorrectly causing operations to fail. With this patch, ACE information will be discarded after a rolling upgrade.
MapR-FS 25856 In the event of a CLDB failover, a table on the unreachable node is deleted and re-created by the CLDB master. Sometimes, multiple container lookup threads from slave CLDBs trying to open/access that table during the failover caused a CLDB exception. With this fix, multiple threads can safely access the unreachable-node table.
MapR-FS 26025 A corrupt encrypted write results in a data decryption failure. As a result of the decryption failure, MFS returns an EINVAL. The master node for the write crashes when it receives an EINVAL from the replicas. In this case, the decryption failure should have resulted in an EBADMSG instead of an EINVAL. With this fix, an EBADMSG is returned in case of a decryption failure of data. Upon encountering an EBADMSG, MFS sends an ErrServerRetry to the client. The Client revalidates the CRC, tries decrypting the encrypted buffers, and then retries the write operation, making the client more resilient to memory and network corruptions.
MapR-FS 26054 Sometimes, the container was getting stuck in resync state because the resync operation was hanging. With this fix, the resync operation will no longer hang.
MapR-FS 26062 After installing patch 41809 on v5.2, the FUSE-based POSIX client failed to start. With this fix, the FUSE-based POSIX client will now start when the command to start the service is run.
MapR-FS 26093 Sometimes, MFS crashed after promoting destination mirror volume to read-write volume. With this fix, MFS will not crash after promoting destination mirror volume to read-write volume.
MapR-FS 26094 Sometimes MFS crashed because there were many SP cleaner threads between low and high threshold. With this fix, MFS will not crash because the cleaner is disabled if it is below the high threshold.
MapR-FS 26288 During rolling upgrade, if slave CLDB is upgraded before master CLDB, the slave CLDB may crash when accessing new KvStore tables. With this fix, slave CLDB will not crash on reading new tables, even if they are non-existent.
MapR-FS 26336 MFS was crashing during truncate operation because of the following:
  • ACE was set on the file
  • The file had more than one filelet
  • The final truncated size was such that it ended within any of the direct blocks of the last filelet
With this fix, MFS will no longer crash during truncate operation.
MapR-FS 26351 During disksetup, even if the mfs.ssd.trim.enabled configuration parameter was set to false, the device was getting trim calls. With this fix, MFS will not attempt to trim if the configuration parameter is set to false.
Hive, Tez 20965 When working with multiple clusters, synchronization issues were causing MapRFileSystem to return a NullPointerException. With this fix, MapRFileSystem has been improved to better support working with multiple clusters and contains fixes for the synchronization issues.
Hoststats 11349 Hoststats did not work on POSIX edge node. With this fix, hoststats can work on POSIX client edge nodes as well to display the statistics on MCS.
JobTracker 24700 The Job Tracker user interface failed with a NullPointerException when a user submitted a Hive job with a null value in a method. With this fix, the Job Tracker interface does not fail when a Hive job is run with a null value in a method.
MapReduce 24505 A job failed when the JvmManager went into an inconsistent state. With this fix, jobs no longer fail as a result of the JvmManager entering an inconsistent state.
MapReduce 25599 A race condition in the jobtracker-start script could cause Warden to start multiple JobTrackers. With this fix, the start script loops and waits for a successful start of the JobTracker before exiting, closing the race-condition window.
MapReduce 25695 It was not possible to restrict the web access port range, so the YARN Mapreduce application master could open a web port anywhere in the ephemeral port range of the node where it was running. With this change, the YARN Mapreduce application master will only open its web port within the range specified by the mapred parameter: yarn.app.mapreduce.am.job.client.port-range
MCS 23257 In MCS, new NFS VIPs were visible in the NFS HA > VIP Assignments tab, but not in the NFS HA > NFS Setup tab. With this fix, the NFS VIPs will be available in both the NFS HA > VIP Assignments tab and the NFS HA > NFS Setup tab.
NFS 24315 If you used the dd command with iflag=direct through the NFS client, an incorrect amount of data may have been read. With this fix, the dd command reads exactly the expected amount of data when iflag=direct is set.
NFS 24446 Due to incorrect attribute cache handling in NFS server, the getattr call sometimes returned stale mtime because the attribute cache was not getting updated properly at the time of setattr. With this fix, the attributes are now properly cached.
NFS 24658 CLDB returned “no master” and an empty list for container lookup, which NFS server could not handle, because when multiple servers are down, there can be no master for a container. With this fix, NFS server will handle empty node list for container lookup.
NFS:Loopback 23652 The POSIX loopbacknfs client did not automatically refresh renewed service tickets. With this fix, the POSIX loopbacknfs client will:
  • Automatically use the renewed service ticket without requiring a restart if the ticket is replaced before expiration (ticket expiry time + grace period of 55 minutes). If the ticket is replaced after expiration (which is ticket expiry time + grace period of 55 minutes), the POSIX loopbacknfs client will not refresh the ticket as the mount will become stale.
  • Allow impersonation if a service ticket is replaced before ticket expiration (which is ticket expiry time + grace period of 55 minutes) with a servicewithimpersonation ticket.
  • Honor all changes in user/group IDs of the renewed ticket.
Pkg/deployment 24309 Symlinks that existed in a MapR 5.1 installation were not re-created during an upgrade to MapR 5.2. This problem resulted when the mapr-hadoop-core package was updated on a cluster with the incorrect version of the mapr-core-internal package. This problem can occur during an upgrade from any older MapR version to a newer MapR version. With this fix, the mapr-hadoop-core package has a new dependency for a specific version of mapr-core-internal. If the correct version of mapr-core-internal is not present, an error message is generated, and the mapr-hadoop-core package cannot be installed. Note that this fix is effective for MapR 5.2.1 or later installations.
RPC 24610 In a secure cluster, when there are intermittent connection drops (between MFS-MFS or client-MFS), the client and/or server could crash during authentication. With this fix, the client and/or server will not crash during authentication if there are intermittent connection drops.
Streams 23563 High CPU utilization occurs when the default buffering time for MapR Streams is set to 0. With this fix, CPU utilization and latency is reduced by having TimeBasedFlusher active only when there is work to do.
UI:CLI 24280 Running the maprcli dashboard info command occasionally threw a TimeoutException error. With this fix, the internal command timeout was increased to allow more time for command processing.
Warden 24119 Warden adjusts the FileServer (MFS) and Node Manager (NM) memory incorrectly when NM and TaskTracker (TT) are on the same node. This can result in too much memory being allocated to MFS. With this fix, Warden does not adjust MFS memory when NM and TT are on the same node. Memory adjustment is implemented only when TT and MapR-FS (but no NM) are on the same node.
Warden 24562 CLDB (container location database) performance suffered because Warden gave the CLDB service a lower CPU priority. With this fix, Warden uses a new algorithm to set the correct CPU priority for the CLDB service.
Yarn 24477 Jobs failed if a local volume was not available and directories for mapreduce could not be initialized. With this fix, jobs no longer fail, and local volume recovery is enhanced.
YARN 25387 A null pointer exception (NPE) was generated when the capacity scheduler was enabled. Adding a node that does not contain a label can result in an NPE. With this fix, errors are no longer generated when the capacity scheduler is enabled.
YARN 25412 Mapreduce jobs fail if the Application Master (AM) is restarted for any reason -- for example, because of a node failure -- during a job commit and leaves a control file that prevents subsequent commit attempts. With this fix, MAPREDUCE-5485 is backported to MapR 5.1. MAPREDUCE-5485 adds a clean-up of commit-stage files. If the first commit attempt fails, temporary files are removed, allowing the next repeatable commit attempt to write them again without throwing an exception. To benefit from this fix, the user must set the mapreduce.fileoutputcommitter.algorithm.version parameter to "2" in the mapred-site.xml file.
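Enabling the backported fix, per the description above, is a one-property change. A mapred-site.xml fragment would look like this (sketch only; it goes inside the existing <configuration> element):

```xml
<!-- mapred-site.xml: use the v2 file output committer algorithm,
     which cleans up commit-stage files after a failed attempt -->
<property>
  <name>mapreduce.fileoutputcommitter.algorithm.version</name>
  <value>2</value>
</property>
```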
YARN 25654 At startup, while processing application-recovery data, the ResourceManager (RM) failed with a null pointer exception. With this fix, the ResourceManager starts correctly when processing application-recovery data.
Yarn/security 25448 A user's temporary log files for running jobs were not readable by another user from the same group in the RM UI. An exception with the message, "Exception reading log file. User 'mapr' doesn't own requested log file" was generated. With this fix, users in the same primary group can access user logs of other users in the group.
Yarn/Warden 25695 It was not possible to restrict the web access port range, so the YARN Mapreduce application master could open a web port anywhere in the ephemeral port range of the node where it was running. With this change, the YARN Mapreduce application master will only open its web port within the range specified by the mapred parameter: yarn.app.mapreduce.am.job.client.port-range
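A mapred-site.xml fragment restricting the port range, per the description above, might look like this (sketch only; the 50100-50200 range is an arbitrary example, not a recommended value):

```xml
<!-- mapred-site.xml: confine the MapReduce AM web port to a fixed range -->
<property>
  <name>yarn.app.mapreduce.am.job.client.port-range</name>
  <value>50100-50200</value>
</property>
```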