Supported Regular Expressions
Filters used with Scan operations support regular expressions. When you filter scans on
MapR-DB tables, you can use the regular expressions supported by the Perl-Compatible Regular Expressions (PCRE)
library, as well as a subset of the regular expressions supported by
java.util.regex.Pattern.
The tables in the following sections define the subset of Java regular expressions that are supported for MapR-DB tables.
Characters
Pattern | Description |
---|---|
x | The character x |
\\ | The backslash character |
\0n | The character with octal value 0n (0 <= n <= 7) |
\0nn | The character with octal value 0nn (0 <= n <= 7) |
\xhh | The character with hexadecimal value 0xhh |
\t | The tab character ('\u0009') |
\n | The newline (line feed) character ('\u000A') |
\r | The carriage-return character ('\u000D') |
\f | The form-feed character ('\u000C') |
\a | The alert (bell) character ('\u0007') |
\e | The escape character ('\u001B') |
\cx | The control character corresponding to x |
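As a rough illustration of the escapes above, the following sketch evaluates a few of them locally with java.util.regex. The class name `CharacterEscapeDemo` and the helper `matches` are made up for this example, and it assumes MapR-DB filters treat these escapes the same way as the Java library.

```java
import java.util.regex.Pattern;

public class CharacterEscapeDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\x41", "A"));      // true: hexadecimal 0x41 is 'A'
        System.out.println(matches("\\011", "\t"));     // true: octal 011 is the tab character
        System.out.println(matches("a\\tb", "a\tb"));   // true: tab escape
        System.out.println(matches("\\\\", "\\"));      // true: a single literal backslash
        System.out.println(matches("\\cA", "\u0001"));  // true: control-A
    }
}
```

Each backslash is doubled because the patterns are written as Java string literals.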
Character Classes
Pattern | Description |
---|---|
[abc] | a, b, or c (simple class) |
[^abc] | Any character except a, b, or c (negation) |
[a-zA-Z] | a through z or A through Z, inclusive (range) |
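A minimal sketch of the three class forms, checked locally with java.util.regex rather than against a MapR-DB table. `CharacterClassDemo` and `matches` are illustrative names, not part of any MapR API.

```java
import java.util.regex.Pattern;

public class CharacterClassDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[abc]", "b"));            // true: simple class
        System.out.println(matches("[^abc]", "b"));           // false: b is excluded
        System.out.println(matches("[^abc]", "d"));           // true: negation
        System.out.println(matches("[a-zA-Z]+", "MapR"));     // true: letters only
        System.out.println(matches("[a-zA-Z]+", "MapR-DB"));  // false: hyphen is outside the range
    }
}
```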
Predefined Character Classes
Pattern | Description |
---|---|
. | Any character (may or may not match line terminators) |
\d | A digit: [0-9] |
\D | A non-digit: [^0-9] |
\s | A whitespace character: [ \t\n\x0B\f\r] |
\S | A non-whitespace character: [^\s] |
\w | A word character: [a-zA-Z_0-9] |
\W | A non-word character: [^\w] |
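The predefined classes can be tried out as follows. This is only a local sketch using java.util.regex; the class and helper names are hypothetical, and the sample strings such as `row_42` are invented.

```java
import java.util.regex.Pattern;

public class PredefinedClassDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\d{3}", "123"));        // true: three digits
        System.out.println(matches("\\D+", "ab1"));          // false: 1 is a digit
        System.out.println(matches("\\w+", "row_42"));       // true: letters, digits, underscore
        System.out.println(matches("\\w+", "row-42"));       // false: hyphen is not a word character
        System.out.println(matches("\\S+\\s\\S+", "a b"));   // true: one whitespace separator
        System.out.println(matches("\\W", "-"));             // true: non-word character
    }
}
```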
Classes for Unicode Blocks and Categories
Pattern | Description |
---|---|
\p{Lu} | An uppercase letter (simple category) |
\p{Sc} | A currency symbol |
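The sketch below exercises the two categories with java.util.regex. The names `UnicodeCategoryDemo` and `matches` are assumptions made for the example, and the non-ASCII inputs are written as Java Unicode escapes.

```java
import java.util.regex.Pattern;

public class UnicodeCategoryDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\p{Lu}", "A"));       // true: uppercase letter
        System.out.println(matches("\\p{Lu}", "a"));       // false: lowercase letter
        System.out.println(matches("\\p{Lu}", "\u00C9"));  // true: uppercase E with acute accent
        System.out.println(matches("\\p{Sc}", "$"));       // true: dollar sign
        System.out.println(matches("\\p{Sc}", "\u20AC"));  // true: euro sign
    }
}
```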
Boundaries
Pattern | Description |
---|---|
^ | The beginning of a line |
$ | The end of a line |
\b | A word boundary |
\B | A non-word boundary |
\A | The beginning of the input |
\G | The end of the previous match |
\Z | The end of the input but for the final terminator, if any |
\z | The end of the input |
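Boundary matchers consume no characters, so they are easiest to see with a substring search. The following sketch uses `Matcher.find()` from java.util.regex; `BoundaryDemo` and `found` are hypothetical names, and the results reflect the Java library's default (non-multiline) mode, which may not be how MapR-DB evaluates `^` and `$`.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Hypothetical helper: true when the regex matches anywhere in the input.
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^row", "row1"));               // true: anchored at the start
        System.out.println(found("^row", "a row"));              // false: "row" is not at the start
        System.out.println(found("\\bcat\\b", "a cat sat"));     // true: whole word
        System.out.println(found("\\bcat\\b", "concatenate"));   // false: inside a word
        System.out.println(found("\\Bcat\\B", "concatenate"));   // true: no boundary on either side
        System.out.println(found("end\\Z", "the end\n"));        // true: final terminator is ignored
        System.out.println(found("end\\z", "the end\n"));        // false: newline follows "end"
    }
}
```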
Greedy Quantifiers
Pattern | Description |
---|---|
X? | X, once or not at all |
X* | X, zero or more times |
X+ | X, one or more times |
X{n} | X, exactly n times |
X{n,} | X, at least n times |
X{n,m} | X, at least n but not more than m times |
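Greedy quantifiers take as much input as they can while still letting the overall pattern match. A small sketch with java.util.regex shows the text each pattern captures first; `GreedyDemo` and `firstMatch` are names invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: returns the first matched substring, or null if none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("colou?r", "color"));        // color
        System.out.println(firstMatch("ab*", "abbbc"));            // abbb
        System.out.println(firstMatch("<.+>", "<b>bold</b>"));     // <b>bold</b>
        System.out.println(firstMatch("\\d{4}", "id-2024-x"));     // 2024
        System.out.println(firstMatch("\\d{2,}", "7 and 12345"));  // 12345
        System.out.println(firstMatch("\\d{2,3}", "12345"));       // 123
    }
}
```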
Reluctant Quantifiers
Pattern | Description |
---|---|
X?? | X, once or not at all |
X*? | X, zero or more times |
X+? | X, one or more times |
X{n}? | X, exactly n times |
X{n,}? | X, at least n times |
X{n,m}? | X, at least n but not more than m times |
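Reluctant quantifiers accept the same repetition counts as their greedy counterparts but stop at the shortest text that lets the pattern match. The sketch below, which assumes java.util.regex semantics and uses the made-up names `ReluctantDemo` and `firstMatch`, shows what each pattern captures first.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReluctantDemo {
    // Hypothetical helper: returns the first matched substring, or null if none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("<.+?>", "<b>bold</b>"));           // <b>
        System.out.println(firstMatch("a+?", "aaa"));                     // a
        System.out.println(firstMatch("a*?", "aaa").isEmpty());           // true: matches the empty string
        System.out.println(firstMatch("\\d{2,}?", "12345"));              // 12
        System.out.println(firstMatch("\\d{2,3}?", "12345"));             // 12
    }
}
```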
Possessive Quantifiers
Pattern | Description |
---|---|
X?+ | X, once or not at all |
X*+ | X, zero or more times |
X++ | X, one or more times |
X{n}+ | X, exactly n times |
X{n,}+ | X, at least n times |
X{n,m}+ | X, at least n but not more than m times |
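Possessive quantifiers match like greedy ones but never give characters back, so a pattern that depends on backtracking fails. This is a minimal sketch of that difference under java.util.regex; `PossessiveDemo` and `matches` are illustrative names only.

```java
import java.util.regex.Pattern;

public class PossessiveDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\d+5", "12345"));       // true: greedy \d+ backtracks to leave the 5
        System.out.println(matches("\\d++5", "12345"));      // false: possessive \d++ keeps all digits
        System.out.println(matches("a?a", "a"));             // true: greedy a? gives up the only a
        System.out.println(matches("a?+a", "a"));            // false: possessive a?+ keeps it
        System.out.println(matches("[a-z]++\\d", "abc1"));   // true: no backtracking is needed
    }
}
```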
Logical Operators
Pattern | Description |
---|---|
XY | X followed by Y |
X\|Y | Either X or Y |
(X) | X, as a capturing group |
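The following sketch shows concatenation, alternation, and capturing groups together, using java.util.regex locally. The names `LogicalOperatorDemo`, `matches`, and `group`, and the sample input `user@host`, are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogicalOperatorDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: text of capturing group n if the whole input matches, else null.
    public static String group(String regex, String input, int n) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(n) : null;
    }

    public static void main(String[] args) {
        System.out.println(matches("cat|dog", "dog"));          // true: either alternative
        System.out.println(matches("cat|dogs", "cats"));        // false: alternatives are "cat" and "dogs"
        System.out.println(matches("(cat|dog)s", "cats"));      // true: the group limits the alternation
        System.out.println(group("(\\w+)@(\\w+)", "user@host", 1));  // user
        System.out.println(group("(\\w+)@(\\w+)", "user@host", 2));  // host
    }
}
```

Alternation has the lowest precedence, which is why the group is needed to apply the trailing `s` to both alternatives.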
Back References
Pattern | Description |
---|---|
\n | Whatever the nth capturing group matches |
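A back reference matches the exact text that the numbered group captured, not the group's pattern. The sketch below illustrates this with java.util.regex; `BackReferenceDemo`, `matches`, and `found` are hypothetical names.

```java
import java.util.regex.Pattern;

public class BackReferenceDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: true when the regex matches anywhere in the input.
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("(\\d)\\1", "77"));        // true: same digit twice
        System.out.println(matches("(\\d)\\1", "78"));        // false: second digit differs
        System.out.println(matches("([ab])x\\1", "axa"));     // true
        System.out.println(matches("([ab])x\\1", "axb"));     // false: group 1 captured "a"
        // A repeated word, such as "is is":
        System.out.println(found("\\b(\\w+) \\1\\b", "this is is a test"));  // true
        System.out.println(found("\\b(\\w+) \\1\\b", "this is a test"));     // false
    }
}
```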
Quotation
Pattern | Description |
---|---|
\ | Nothing, but quotes the following character |
\Q | Nothing, but quotes all characters until \E |
\E | Nothing, but ends quoting started by \Q |
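Quoting is useful when a value contains characters that would otherwise act as metacharacters. Here is a short sketch, assuming java.util.regex behavior and using the invented names `QuotationDemo` and `matches`.

```java
import java.util.regex.Pattern;

public class QuotationDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("1.5", "1x5"));                   // true: unquoted dot matches any character
        System.out.println(matches("1\\.5", "1x5"));                 // false: quoted dot is literal
        System.out.println(matches("1\\.5", "1.5"));                 // true
        System.out.println(matches("\\Qa.b*c\\E", "a.b*c"));         // true: everything between \Q and \E is literal
        System.out.println(matches("\\Qa.b*c\\E\\d", "a.b*c7"));     // true: \d after \E is a digit class again
        System.out.println(matches("\\Q[id]\\E", "i"));              // false: brackets are literal, not a class
    }
}
```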
Special Constructs
Pattern | Description |
---|---|
(?:X) | X, as a non-capturing group |
(?=X) | X, via zero-width positive lookahead |
(?!X) | X, via zero-width negative lookahead |
(?<=X) | X, via zero-width positive lookbehind |
(?<!X) | X, via zero-width negative lookbehind |
(?>X) | X, as an independent, non-capturing group |
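The special constructs can be sketched as follows with java.util.regex. Lookarounds test the surrounding text without consuming it, and the independent group `(?>X)` does not backtrack into X. The names `SpecialConstructDemo`, `firstMatch`, and `matches` are assumptions, as are the sample strings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpecialConstructDemo {
    // Hypothetical helper: returns the first matched substring, or null if none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Hypothetical helper: true when the whole input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("foo(?=bar)", "foobar"));          // foo: "bar" is checked, not consumed
        System.out.println(firstMatch("foo(?=bar)", "foobaz"));          // null
        System.out.println(firstMatch("foo(?!bar)", "foobaz"));          // foo
        System.out.println(firstMatch("(?<=\\$)\\d+", "cost $42"));      // 42: preceded by a dollar sign
        System.out.println(firstMatch("(?<!\\$)\\b\\d+", "7 or $42"));   // 7: not preceded by a dollar sign
        // (?:...) groups without capturing, so only (c) is counted.
        System.out.println(Pattern.compile("(?:ab)+(c)").matcher("ababc").groupCount());  // 1
        System.out.println(matches("(?:a+)a", "aaa"));                   // true: a+ backtracks one character
        System.out.println(matches("(?>a+)a", "aaa"));                   // false: independent group keeps all
    }
}
```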