mapr streamanalyzer

This lightweight utility is a sample application for the Streams Java class for analytics on HPE Ezmeral Data Fabric Streams. It counts the messages in a stream or in a subset of the stream's topics. It can also print either whole retrieved messages or a subset of the fields in each message.

You can download the source code for this utility here: StreamAnalyzer.java

For information about the Streams Java class and building applications that use it, see HPE Ezmeral Data Fabric Streams Java API Library. See Logical Schema of Messages for information about how messages are structured.

Ensure that the user ID that runs the utility has the readAce permission on the volume where the stream is located. For information about how to set permissions on volumes, see Setting Whole Volume ACEs.

NOTE: The mapr user is not treated as a superuser. HPE Ezmeral Data Fabric Streams does not allow the mapr user to run this utility unless that user is granted the required permissions through access-control expressions (ACEs).

Syntax

mapr streamanalyzer -path <stream-full-name>
    [ -topics <comma separated topic names> ]
    [ -regex <regular expression representing topic names> ]
    [ -countMessages <true/false> (default: true) ]
    [ -printMessages <true/false> (default: false) ]
    [ -projectFields <comma separated field names> (default: all fields) ]
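As an illustration of the syntax above, the following sketch assembles a few invocations. The stream path /apps/sample-stream, the topic names, and the regular expression are hypothetical placeholders, and the run helper is not part of the utility. By default the script only prints each command (a dry run); setting DRY_RUN=0 would execute the commands, which assumes a node where the mapr command is available.

```shell
#!/bin/sh
# Hypothetical stream path; replace it with the full path of a real stream.
STREAM="/apps/sample-stream"

# Hypothetical helper: print the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Count the messages in every topic of the stream
# (-countMessages defaults to true).
run mapr streamanalyzer -path "$STREAM"

# Count only the messages in two hypothetical topics.
run mapr streamanalyzer -path "$STREAM" -topics topic1,topic2

# Print the messages from topics whose names match a hypothetical
# regular expression, without counting them.
run mapr streamanalyzer -path "$STREAM" -regex 'sensor.*' \
  -countMessages false -printMessages true
```

Note that the dry-run output joins the arguments with spaces, so quoting is not preserved in the printed form.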

Parameters

Parameter      Description
path           The path and name of the stream. This parameter is required.
topics         A comma-separated list of the names of the topics to retrieve. If you specify neither this parameter nor the -regex parameter, all of the topics in the stream are retrieved. Do not use this parameter together with the -regex parameter.
regex          A regular expression that matches the names of the topics to retrieve. If you specify neither this parameter nor the -topics parameter, all of the topics in the stream are retrieved. Do not use this parameter together with the -topics parameter.
countMessages  Prints the number of retrieved messages to the standard output. Default: true.
printMessages  Prints the contents of retrieved messages to the standard output. Default: false.
projectFields  A comma-separated list of field names. If the -printMessages parameter is set to true, only the specified fields are printed to the standard output for each message. If the -printMessages parameter is set to false, this parameter has no effect. Valid field names: key, value, topic, offset, partition, and producer. Default: all fields.
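
The constraints in the parameter descriptions can be checked before the utility is invoked. The following is a minimal sketch of such a pre-flight check, not part of the utility itself: the function names check_fields and check_selection are hypothetical, and the list of valid fields is copied from the table above.

```shell
#!/bin/sh
# Valid field names for -projectFields, as listed in the parameter table.
VALID_FIELDS="key value topic offset partition producer"

# check_fields LIST: succeed only if every comma-separated name in LIST
# is one of the valid field names.
check_fields() {
  for field in $(printf '%s' "$1" | tr ',' ' '); do
    case " $VALID_FIELDS " in
      *" $field "*) ;;  # known field, keep checking
      *) echo "invalid field name: $field" >&2; return 1 ;;
    esac
  done
}

# check_selection ARGS...: fail if both -topics and -regex appear in ARGS,
# because the two parameters must not be used together.
check_selection() {
  has_topics=0
  has_regex=0
  for arg in "$@"; do
    case "$arg" in
      -topics) has_topics=1 ;;
      -regex)  has_regex=1 ;;
    esac
  done
  if [ "$has_topics" = 1 ] && [ "$has_regex" = 1 ]; then
    echo "use either -topics or -regex, not both" >&2
    return 1
  fi
}
```

A wrapper script might call check_selection "$@" and check_fields on the -projectFields value, and run mapr streamanalyzer only if both succeed. The sketch does not check whether -projectFields is paired with -printMessages true; without that setting the field list has no effect.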