Architecture

A stream contains topics, and each topic is a logical collection of messages.

In HPE Ezmeral Data Fabric Streams, topics are grouped into streams. Administrators can apply security, retention, and replication policies to streams. Combined with the file system and HPE Ezmeral Data Fabric Database in the Data Fabric Data Platform, streams enable organizations to create a centralized, secure data lake that unifies files, database tables, and message topics.

Messages (topic data) are published to topics by producer applications and read by consumer applications. All messages published to HPE Ezmeral Data Fabric Streams are persisted, which allows future consumers to “catch up” on earlier messages and allows analytics applications to process historical data. Within a topic, messages are written to topic partitions.

NOTE Topic partitions are stored in containers within volumes. Containers are written to storage pools, which are made up of disks on the nodes in the cluster. See Containers and the CLDB for more information about containers.
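
The persistence and partitioning behavior described above can be illustrated with a small in-memory model. This is only a sketch, not the product's API or storage implementation: the `ToyStream` class, its method names, the partition count, and the key-hashing policy used to pick a partition are all assumptions made for illustration. It shows that messages are appended to topic partitions and remain available, so a consumer that starts late can still read the full history.

```python
from collections import defaultdict
from zlib import crc32


class ToyStream:
    """In-memory stand-in for a stream: topics -> partitions -> append-only logs."""

    def __init__(self, partitions_per_topic=2):
        self.partitions_per_topic = partitions_per_topic
        # topic name -> list of partitions; each partition is a list of messages
        self.topics = defaultdict(
            lambda: [[] for _ in range(self.partitions_per_topic)]
        )

    def publish(self, topic, key, value):
        """Append a message to one partition, chosen by hashing the key (assumed policy)."""
        partition = crc32(key.encode("utf-8")) % self.partitions_per_topic
        log = self.topics[topic][partition]
        log.append(value)
        return partition, len(log) - 1  # (partition, offset of the new message)

    def read(self, topic, partition, offset=0):
        """Return every persisted message in a partition from `offset` onward."""
        return self.topics[topic][partition][offset:]


stream = ToyStream()
p, first = stream.publish("clicks", "user-1", "page=/home")
_, second = stream.publish("clicks", "user-1", "page=/cart")

# A consumer that starts later still sees the full history of the partition.
history = stream.read("clicks", p)
# A consumer that already processed offset 0 resumes from offset 1.
resumed = stream.read("clicks", p, offset=1)
```

Because nothing is deleted on read, any number of consumers can read the same partition independently, each from its own offset.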

Why Use HPE Ezmeral Data Fabric Streams?

HPE Ezmeral Data Fabric Streams is ideal for a variety of use cases, including the following:
Application event pipelines
Many types of applications generate event or log data that must be centrally stored and analyzed to gain insights about user activity or application performance. HPE Ezmeral Data Fabric Streams simplifies these pipelines by transporting events to a central location, from which they can undergo event-by-event transformation and analysis.
Database change capture
Most modern databases enable users to generate an event each time an entry is added or modified. These events can be published to HPE Ezmeral Data Fabric Streams to keep systems like search indexes and caches synchronized, as well as to feed security or notification applications.
Internet of Things
The explosion in the number of smart devices and sensors has created many situations in which billions of data points are created by millions of geographically dispersed sensors. HPE Ezmeral Data Fabric Streams provides a reliable, global transport for these messages, enabling you to perform analytics both at the source and at a central location.

Replication

In addition to reliably delivering messages to applications within a single data center, HPE Ezmeral Data Fabric Streams can continuously replicate data between multiple clusters, delivering messages globally. Like other data-fabric services, HPE Ezmeral Data Fabric Streams has a distributed, scale-out design, allowing it to scale to billions of messages per second, millions of topics, and millions of producer and consumer applications.

Server and Client Libraries

Figure 1. The relationship of the HPE Ezmeral Data Fabric Streams server to producers, consumers, and client libraries
Server
The server manages streams, topics, and partitions and handles requests from the producer client library and the consumer client library.
Producer client library
This client-side library, which is part of the producer process, receives the messages that producers send, buffers them, and sends them to the server. The server then publishes the messages and sends acknowledgements to the client.
Consumer client library
This client-side library, which is part of the consumer process, receives requests from consumers to poll subscriptions for unread messages, reads messages from topic partitions, and returns the messages to consumers.
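
The division of work among the three components can be sketched with a toy model. Everything here is hypothetical: the class names, the `batch_size` flush threshold, and the use of a single log in place of topic partitions are assumptions chosen to keep the example short, and none of it reflects the real client libraries. The sketch shows a producer-side library buffering messages and sending them in batches, a server persisting them and returning acknowledgements, and a consumer-side library returning only unread messages on each poll.

```python
class ToyServer:
    """Stand-in for the server: holds one append-only log and acknowledges writes."""

    def __init__(self):
        self.log = []

    def publish_batch(self, messages):
        """Persist a batch and return one acknowledgement (the offset) per message."""
        start = len(self.log)
        self.log.extend(messages)
        return list(range(start, len(self.log)))

    def fetch(self, offset, max_messages):
        """Return up to `max_messages` messages starting at `offset`."""
        return self.log[offset:offset + max_messages]


class ToyProducerLibrary:
    """Buffers messages inside the producer process and sends them in batches."""

    def __init__(self, server, batch_size=3):
        self.server = server
        self.batch_size = batch_size  # hypothetical flush threshold
        self.buffer = []
        self.acks = []

    def send(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.acks.extend(self.server.publish_batch(self.buffer))
            self.buffer = []


class ToyConsumerLibrary:
    """Tracks a read cursor and returns only unread messages on each poll."""

    def __init__(self, server):
        self.server = server
        self.cursor = 0

    def poll(self, max_messages=10):
        messages = self.server.fetch(self.cursor, max_messages)
        self.cursor += len(messages)
        return messages


server = ToyServer()
producer = ToyProducerLibrary(server, batch_size=3)
consumer = ToyConsumerLibrary(server)

for i in range(4):
    producer.send(f"event-{i}")

first_poll = consumer.poll()   # three messages were flushed; the fourth is still buffered
producer.flush()               # push the remaining buffered message to the server
second_poll = consumer.poll()  # only the message not yet read
third_poll = consumer.poll()   # nothing new
```

In this model a message becomes visible to consumers only after the producer-side buffer is flushed, which is why the first poll returns three messages rather than four.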