Querying Messages in MapR Streams

Description of the `Streams` Class

By using the Streams class, your applications can query MapR streams directly. You do not have first to load the data from your streams into another cluster that is used only for analytics and reporting.

The methods in this class each return a DocumentStore object that contains the messages in topics for a specified stream. The DocumentStore interface is a part of the open-source OJAI API.

You can then use the DocumentStore.find() methods to query the messages that are in the DocumentStore object.

The logical schema of each message is as follows. Your analytics applications can run queries on these fields.

{
    "_id":<STRING>,
    "topic":<STRING>,
    "partition":<SHORT>,
    "offset":<LONG>,
    "timestamp":<LONG>,
    "producer":<VARCHAR>,
    "key":<BINARY>,
    "value":<VARBINARY>
}

Field	Description
`_id`	A STRING value that represents the ID of the topic in which the message is located.
`topic`	A STRING value that represents the name of the topic in which the message is located.
`partition`	A SHORT value that represents the index of the partition in the topic.
`offset`	A LONG value that represents the position of the message within a partition.
`timestamp`	A LONG value that represents the date and time at which the message was sent to the stream.
`producer`	A VARCHAR value that represents the value of the `client.id` configuration parameter for the producer that published the message. MapR Streams does not require a value for this configuration parameter, so the value for this field could be empty.
`key`	A BINARY value that represents the key of the message. MapR Streams does not require each message to have a key, so this value could be empty. The configuration parameter `key.serializer` for the producer that published the message specifies the means by which the key was serialized. Your application can deserialize the key by using the appropriate deserialization class in the `org.apache.kafka.common.serialization` package or a class that implements the `Deserializer` interface.
`value`	A VARBINARY value that represents the value of the message. The configuration parameter `value.serializer` for the producer that published the message specifies the means by which the value was serialized. Your application can deserialize the value by using the appropriate deserialization class in the `org.apache.kafka.common.serialization` package or a class that implements the `Deserializer` interface.

Sample Application

mapr streamanalyzer

Links to Javadoc:

Building Applications that Use the `Streams` Class

You can compile and run a Java application that uses the required JAR file from the MapR Maven repository or from the MapR installation.

Using the JAR File from the MapR Maven Repository

Add MapR's Maven repository to your pom.xml file, if it is not already added:

    <repositories>
    <repository>
      <id>mapr-releases</id>
      <url>http://repository.mapr.com/nexus/content/repositories/releases</url>
      <snapshots><enabled>true</enabled></snapshots>
      <releases><enabled>true</enabled></releases>
    </repository>
  </repositories>

Add a dependency to the mapr-streams project:

    <dependency>
      <groupId>com.mapr.streams</groupId>
      <artifactId>mapr-streams</artifactId>
      <version>5.1.0-mapr</version>
    </dependency>

Using JARs from the MapR Installation

Use the following command to compile applications:
```
javac -cp `mapr classpath` <Application jars>
```

Use the following command to run applications:

java -cp `mapr classpath`:. <Main class jar> <commandline arguments>

Querying Messages in MapR Streams

Description of the Streams Class

Sample Application

Links to Javadoc:

Building Applications that Use the Streams Class

Description of the `Streams` Class

Building Applications that Use the `Streams` Class