MapR Streams

MapR Streams brings integrated publish/subscribe messaging to the MapR Converged Data Platform.

Producer applications publish messages to topics, which are logical collections of messages managed by MapR Streams. Consumer applications can then read those messages at their own pace. All messages published to MapR Streams are persisted, which allows future consumers to catch up on processing and analytics applications to process historical data.
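
The publish-and-read-at-your-own-pace model can be illustrated with a small in-memory sketch. This is a toy model, not the MapR Streams API: the `ToyBroker` class, its `publish` and `poll` methods, and the topic and consumer names are all hypothetical and exist only to show how a persisted log with a separate read position per consumer lets slow or late consumers catch up.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a pub/sub service (hypothetical, not the MapR Streams API)."""

    def __init__(self):
        # topic name -> append-only list of messages (the persisted log)
        self._logs = defaultdict(list)
        # (consumer name, topic name) -> index of the next message to read
        self._offsets = defaultdict(int)

    def publish(self, topic, message):
        """Append a message to the topic and return its offset in the log."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, consumer, topic, max_messages=10):
        """Return up to max_messages unread messages for this consumer."""
        start = self._offsets[(consumer, topic)]
        batch = self._logs[topic][start:start + max_messages]
        # Advance only this consumer's position; other consumers are unaffected.
        self._offsets[(consumer, topic)] = start + len(batch)
        return batch


broker = ToyBroker()
for i in range(5):
    broker.publish("clicks", f"event-{i}")

# A fast consumer reads everything; a slow one reads two messages at a time.
fast = broker.poll("dashboard", "clicks")
slow = broker.poll("archiver", "clicks", max_messages=2)

# A consumer that starts later still sees the full history,
# because messages stay in the log after they have been read.
late = broker.poll("analytics", "clicks")
```

Because reading never removes a message from the log, each consumer's progress is independent of the others, which is the property the paragraph above describes.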

In addition to reliably delivering messages to applications within a single data center, MapR Streams can continuously replicate data between multiple clusters, delivering messages globally. Like other MapR services, MapR Streams has a distributed, scale-out design, allowing it to scale to billions of messages per second, millions of topics, and millions of producer and consumer applications.

Topics in MapR Streams are grouped into streams, to which administrators can apply security, retention, and replication policies. Combined with MapR-FS and MapR-DB in the MapR Converged Data Platform, streams allow organizations to create a centralized, secure data lake that unifies files, database tables, and message topics.
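
As a rough illustration of why grouping topics into streams is convenient, the sketch below attaches policies to a stream so that every topic inside it inherits them. It is a conceptual model only: the `Stream` class, its fields, the `/apps/telemetry` path, and the user and topic names are assumptions made for this example, and replication is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Hypothetical model: policies live on the stream, not on each topic."""
    path: str
    retention_seconds: int
    producers: set = field(default_factory=set)  # users allowed to publish
    topics: set = field(default_factory=set)

    def add_topic(self, name):
        self.topics.add(name)

    def can_publish(self, user, topic):
        # One access rule covers every topic in the stream.
        return topic in self.topics and user in self.producers

    def is_expired(self, message_age_seconds):
        # One retention rule covers every topic in the stream.
        return message_age_seconds > self.retention_seconds


# Seven-day retention and a single authorized producer, set once per stream.
stream = Stream(
    path="/apps/telemetry",
    retention_seconds=7 * 24 * 3600,
    producers={"ingest-svc"},
)
stream.add_topic("sensor-readings")
stream.add_topic("device-status")
```

Adding a third topic to this stream would require no further policy configuration, which is the administrative benefit of setting policies at the stream level.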

MapR Streams is ideal for a variety of use cases, including:
Application event pipelines
Many types of applications generate event or log data that needs to be centrally stored and analyzed to gain insights about user activity or application performance. MapR Streams simplifies these pipelines by transporting events to a central location where they can undergo event-by-event transformation and analysis.
Database change capture
Most modern databases allow users to generate an event each time an entry is added or modified. These events can be published to MapR Streams to keep systems such as search indices and caches synchronized, and to feed security or notification applications.
Internet of Things
The explosion in the number of smart devices and sensors has led to many situations in which billions of data points are generated by millions of geographically dispersed sensors. MapR Streams provides a reliable, global transport for these messages, allowing analytics to run both at the source and at a central location.