Getting Started with MapR Streams

If you have a basic understanding of the components of MapR Streams and of the typical flow of messages from producers to consumers, you can try MapR Streams by following this procedure.

Prerequisites

  • Ensure that your Linux, Windows, or OS X system has Java SDK 7 or later installed.
  • Install the latest version of MapR on a cluster.
  • Install the MapR client package, if you want to run the producer and consumer from a machine outside the cluster.

Procedure

  1. On a node in the MapR cluster, follow these steps:
    1. Create a stream.
      • Run this command if you plan to run the producer and consumer with the same user ID that you are using to create the stream:

        maprcli stream create -path /<path to and name of the stream>

      • Run this command if you plan to run the producer and consumer with user IDs that are different from the user ID that you are using to create the stream:

        maprcli stream create -path /<path to and name of the stream> -consumeperm u:<user ID> -produceperm u:<user ID>

        The two additional parameters grant security permissions. By default, these permissions are granted to the user ID that ran the maprcli stream create command.

        -consumeperm
            Grants permission to read messages from topics that are in the stream.
        -produceperm
            Grants permission to publish messages to topics that are in the stream.
    2. Create a topic.

      Run this command to create the topic:

      maprcli stream topic create -path /<path to and name of the stream> -topic <name of the topic>
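
      If you prefer to script this setup, the commands in this step can be assembled as argument lists from Java. The following is a minimal sketch, not part of the MapR tooling: the class name MaprcliCommands, its method names, and the sample path, topic, and user ID are hypothetical. Only the maprcli flags are taken from the commands shown in this step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds the maprcli argument lists used in this step. */
final class MaprcliCommands {

    /**
     * Arguments for "maprcli stream create".
     * Pass null for otherUser to keep the default permissions (the creating user ID).
     */
    static List<String> createStream(String streamPath, String otherUser) {
        List<String> args = new ArrayList<String>(
                Arrays.asList("maprcli", "stream", "create", "-path", streamPath));
        if (otherUser != null) {
            // Grant read and publish permissions to a user ID other than the creator.
            args.addAll(Arrays.asList(
                    "-consumeperm", "u:" + otherUser,
                    "-produceperm", "u:" + otherUser));
        }
        return args;
    }

    /** Arguments for "maprcli stream topic create". */
    static List<String> createTopic(String streamPath, String topic) {
        return new ArrayList<String>(Arrays.asList(
                "maprcli", "stream", "topic", "create",
                "-path", streamPath, "-topic", topic));
    }

    public static void main(String[] argv) {
        // Sample values for illustration only; substitute your own.
        System.out.println(createStream("/apps/demo-stream", "alice"));
        System.out.println(createTopic("/apps/demo-stream", "demo-topic"));
        // To run a command on a cluster node, one option is:
        //   new ProcessBuilder(args).inheritIO().start().waitFor();
    }
}
```

      Keeping each argument as a separate list element avoids shell quoting problems if a path contains spaces.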

  2. On the system where the MapR client is installed, compile the Java consumer and producer, and then launch the consumer before you launch the producer.
    Before compiling, replace this placeholder text in both the consumer and the producer with the path and name of your stream and the name of your topic:

    /<path to and name of the stream>:<name of topic>

    For the steps of compiling and launching, see Compiling and Launching Producers and Consumers.
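
    The stream-and-topic string has the form of a stream path, a colon, and a topic name. As a small illustration, the sketch below composes that string from its two parts. The class name StreamTopicName and the sample values are hypothetical, and the two validation checks are assumptions based only on the format shown in this step.

```java
/** Hypothetical helper that composes the "/<stream path>:<topic>" string. */
final class StreamTopicName {

    static String of(String streamPath, String topic) {
        // Assumption: stream paths are absolute, as in the commands of step 1.
        if (streamPath == null || !streamPath.startsWith("/")) {
            throw new IllegalArgumentException("stream path must start with '/'");
        }
        if (topic == null || topic.isEmpty()) {
            throw new IllegalArgumentException("topic name must not be empty");
        }
        // The colon separates the stream path from the topic name.
        return streamPath + ":" + topic;
    }

    public static void main(String[] args) {
        // Sample values for illustration only; substitute your own.
        System.out.println(of("/apps/demo-stream", "demo-topic"));
        // prints /apps/demo-stream:demo-topic
    }
}
```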

    Launch the consumer first, and then launch the producer. If you launch the producer first, it publishes 50 messages before the consumer starts. The consumer then starts reading, as consumers do by default, from the head of the partition, which lies after those 50 messages, so it does not receive them.

    Figure 1. Result of starting the producer before starting the consumer for this step

    If you launch the consumer first, the partition is empty and the consumer continuously polls for new messages.

    Figure 2. The position of a consumer on an empty partition

    After you launch the producer, the 50 messages are published to the partition, and the consumer moves forward through the partition, reading the messages.

    Figure 3. Result of starting the consumer first and then starting the producer for this step
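
The effect of launch order described in step 2 can be reproduced with a small in-memory model. The sketch below does not use the MapR Streams API. The Partition and Consumer classes are hypothetical stand-ins: an append-only list plays the role of one partition, and a consumer's initial position is the head of the partition at the moment the consumer starts.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory model of one partition and one consumer; not the MapR Streams API. */
final class PartitionPositionDemo {

    /** An append-only list standing in for a partition. */
    static final class Partition {
        private final List<String> messages = new ArrayList<String>();

        void publish(String message) { messages.add(message); }

        /** Offset just past the newest message (the head of the partition). */
        int headOffset() { return messages.size(); }

        String read(int offset) { return messages.get(offset); }
    }

    /** A consumer that starts at the head of the partition, as in the default case. */
    static final class Consumer {
        private final Partition partition;
        private int position;

        Consumer(Partition partition) {
            this.partition = partition;
            this.position = partition.headOffset();
        }

        /** Returns every message published since the last poll. */
        List<String> poll() {
            List<String> batch = new ArrayList<String>();
            while (position < partition.headOffset()) {
                batch.add(partition.read(position++));
            }
            return batch;
        }
    }

    static void publishFifty(Partition partition) {
        for (int i = 0; i < 50; i++) {
            partition.publish("message " + i);
        }
    }

    /** Producer runs before the consumer starts. */
    static int producerFirst() {
        Partition partition = new Partition();
        publishFifty(partition);
        Consumer consumer = new Consumer(partition); // starts after the 50 messages
        return consumer.poll().size();
    }

    /** Consumer starts on an empty partition, then the producer runs. */
    static int consumerFirst() {
        Partition partition = new Partition();
        Consumer consumer = new Consumer(partition); // starts at offset 0
        consumer.poll();                             // nothing to read yet
        publishFifty(partition);
        return consumer.poll().size();
    }

    public static void main(String[] args) {
        System.out.println("producer first: consumer read " + producerFirst() + " messages");
        System.out.println("consumer first: consumer read " + consumerFirst() + " messages");
        // prints:
        // producer first: consumer read 0 messages
        // consumer first: consumer read 50 messages
    }
}
```

In this model, the only difference between the two runs is the offset that the consumer records when it starts, which corresponds to the consumer positions shown in Figures 1 and 3.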