OJAI Documents
An OJAI document is a tree of fields. Each field has a type and a value, and also has either a name or an array index. Field names are strings. The root of each document is a map.
For example, an online retailer of sports equipment might have this OJAI document for storing data about a set of bicycle pedals:
{
"_id" : "2DT3201",
"product_ID" : "2DT3201",
"name" : " Allegro SPD-SL 6800",
"brand" : "Careen",
"category" : "Pedals",
"type" : "Components",
"price" : 112.99,
"features" : [
"Low-profile design",
"Floating SH11 cleats included"
],
"specifications" : {
"weight_per_pair" : "260g",
"color" : "black"
}
}
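The tree structure described above can be sketched in plain Python. This is only an illustration that uses the standard json module, not the OJAI API: the walk function, the abbreviated SAMPLE string, and the features[0] style of writing array indexes are assumptions made for this example, and the type names printed are Python's, not OJAI's.

```python
import json

# Abbreviated copy of the sample document above (hypothetical subset of fields).
SAMPLE = """
{
  "_id" : "2DT3201",
  "price" : 112.99,
  "features" : [
    "Low-profile design",
    "Floating SH11 cleats included"
  ],
  "specifications" : {
    "weight_per_pair" : "260g",
    "color" : "black"
  }
}
"""


def walk(value, path=""):
    """Yield (path, kind) for every field below the root map, depth first."""
    if isinstance(value, dict):
        if path:  # the root map itself has no name, so it is not reported
            yield path, "map"
        for name, child in value.items():
            yield from walk(child, f"{path}.{name}" if path else name)
    elif isinstance(value, list):
        yield path, "array"
        for index, child in enumerate(value):
            # Array elements are identified by index rather than by name.
            yield from walk(child, f"{path}[{index}]")
    else:
        yield path, type(value).__name__


document = json.loads(SAMPLE)
fields = list(walk(document))
for path, kind in fields:
    print(f"{path}: {kind}")
```

Every field reported by the walk has a kind and is reached either by a name (for example, specifications.color) or by an array index (for example, features[1]).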
Data Types
- Scalar Data
- These fields hold a single value, such as a string or a number. In the sample document, the following top-level fields are scalar:
"_id" : "2DT3201",
"product_ID" : "2DT3201",
"name" : " Allegro SPD-SL 6800",
"brand" : "Careen",
"category" : "Pedals",
"type" : "Components",
"price" : 112.99
Scalar fields can contain the following data types:

| Data Type | Description |
|-----------|-------------|
| Binary | An uninterpreted sequence of bytes. |
| Boolean | A data type of two possible values that are typically denoted by true and false. |
| Byte | An 8-bit signed integer. |
| Date | A 32-bit integer representing the number of days since epoch, i.e. January 1, 1970 00:00:00 UTC. The value is absolute and is time-zone independent. |
| Double | A double-precision 64-bit floating-point number. |
| Float | A single-precision 32-bit floating-point number. |
| Int | A 32-bit signed integer. |
| Long | A 64-bit signed integer. |
| Short | A 16-bit signed integer. |
| String | A sequence of characters. |
| Time | A 32-bit integer representing time of the day in milliseconds. The value is absolute and is time-zone independent. |
| Timestamp | A 64-bit integer representing the number of milliseconds since epoch, i.e. January 1, 1970 00:00:00 UTC. Negative values represent dates before epoch. |

- Nested Documents
- These fields contain documents that can themselves contain scalar data, nested documents, and arrays. In the sample document, the specifications field holds a nested document:
"specifications" : {
"weight_per_pair" : "260g",
"color" : "black"
}
- Arrays
- These fields contain lists of values that are accessed by index number. The values can be scalars, documents, arrays, or any combination of these types. For example, the features array in the sample document contains scalar values:
"features" : [
"Low-profile design",
"Floating SH11 cleats included"
]
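The Date, Time, and Timestamp encodings listed under Scalar Data can be worked through with a few lines of Python. This is a sketch of the arithmetic only, not of how an OJAI implementation stores values: the three function names are hypothetical, and it assumes that "time of the day in milliseconds" means milliseconds elapsed since midnight.

```python
from datetime import date, datetime, time, timezone

EPOCH_DATE = date(1970, 1, 1)
EPOCH_DATETIME = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_to_days(d):
    """Date: whole days since January 1, 1970."""
    return (d - EPOCH_DATE).days


def time_to_millis(t):
    """Time: milliseconds since midnight (assumed interpretation)."""
    seconds = (t.hour * 60 + t.minute) * 60 + t.second
    return seconds * 1000 + t.microsecond // 1000


def timestamp_to_millis(dt):
    """Timestamp: milliseconds since epoch; negative before 1970."""
    delta = dt - EPOCH_DATETIME
    # Integer arithmetic avoids floating-point rounding on large values.
    return delta.days * 86_400_000 + delta.seconds * 1000 + delta.microseconds // 1000


print(date_to_days(date(2016, 1, 1)))                                   # 16801
print(time_to_millis(time(12, 30)))                                     # 45000000
print(timestamp_to_millis(datetime(2016, 1, 1, tzinfo=timezone.utc)))   # 1451606400000
print(timestamp_to_millis(
    datetime(1969, 12, 31, 23, 59, 59, tzinfo=timezone.utc)))           # -1000
```

The last call shows a negative Timestamp for an instant one second before the epoch. A full day contains 86,400,000 milliseconds, so a Time value fits comfortably in a 32-bit integer.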
Schema Flexibility
The structure of each document, called the document's schema, is easy to change: simply add new fields. For example, if the online retailer wanted to allow customers to review products, it would be simple to add the reviews to any document for a product.
In this example, the comments are added as an array of documents in a new comments field:
{
"_id" : "2DT3201",
"product_ID" : "2DT3201",
"name" : " Allegro SPD-SL 6800",
"brand" : "Careen",
"category" : "Pedals",
"type" : "Components",
"price" : 112.99,
"features" : [
"Low-profile design",
"Floating SH11 cleats included"
],
"specifications" : {
"weight_per_pair" : "260g",
"color" : "black"
},
"comments" : [
{
"username" : "hlmencken",
"comment" : "Best money I ever spent!"
},
{
"username" : "vwoolf",
"comment" : "What hlmencken said!"
}
]
}
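The same change can be sketched in plain Python, treating the document as a dictionary. This illustrates the idea of adding fields without a schema change; it does not use the MapR-DB JSON API, and the abbreviated product dictionary is an assumption made for the example.

```python
import json

# Abbreviated product document (hypothetical subset of the sample's fields).
product = {
    "_id": "2DT3201",
    "brand": "Careen",
    "category": "Pedals",
    "price": 112.99,
}

# No schema change is needed: the new field is simply set on the document.
product["comments"] = [
    {"username": "hlmencken", "comment": "Best money I ever spent!"},
]

# Later reviews are appended to the same array of documents.
product["comments"].append(
    {"username": "vwoolf", "comment": "What hlmencken said!"}
)

print(json.dumps(product, indent=2))
```

Other product documents are unaffected. Documents without reviews simply have no comments field.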
Dotted Notation for Identifying Fields
A number of the Java methods and maprcli commands for working with OJAI documents require you to identify individual fields by specifying their paths. A field path lists, in order, the name of each field that leads to the field you are interested in. The names are separated by periods.
For example, suppose you had a document with this structure:
{
"a" : {
"b" : {
"c" : {
"d" : "value_for_d"
}
}
}
}
The path for field d would be a.b.c.d.
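To make the path lookup concrete, here is a minimal Python sketch that resolves a dotted path against nested dictionaries. The get_by_path helper is hypothetical and is not part of the Java API or maprcli. It handles only field names separated by periods, and it assumes that no field name itself contains a period.

```python
def get_by_path(document, path):
    """Follow a dotted field path through nested maps and return the value."""
    current = document
    for name in path.split("."):
        if not isinstance(current, dict) or name not in current:
            raise KeyError(f"no field {name!r} while resolving {path!r}")
        current = current[name]
    return current


document = {
    "a": {
        "b": {
            "c": {
                "d": "value_for_d"
            }
        }
    }
}

print(get_by_path(document, "a.b.c.d"))  # value_for_d
print(get_by_path(document, "a.b.c"))    # {'d': 'value_for_d'}
```

A path that stops early, such as a.b.c, identifies the nested document at that point rather than a scalar value.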
Tools for Working with OJAI Documents
- MapR-DB JSON API
- Learn the basics of this API here: Creating, Reading, Updating, and Deleting Documents and Tables with the MapR-DB JSON Java API Library
- The mapr dbshell
- This shell is a lightweight tool for manipulating JSON tables and documents. Learn more about it here: mapr dbshell