Getting Started in HBase

About this task

In this tutorial, we'll create an HBase table on the cluster, enter some data, query the table, then clean up the data and exit.

HBase stores table data by column rather than by row, and the columns are organized into groups called column families. When creating an HBase table, you must define its column families before inserting any data. Column families should not be changed often, nor should there be too many of them, so think carefully about which column families will be useful for your particular data. Each column family, however, can contain a very large number of columns, and new columns can be added to an existing family at any time. Columns are named using the format family:qualifier.
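The rule that families are fixed up front while columns are free-form can be illustrated with a small in-memory model. This is only a sketch in plain Python, not HBase client code: the ToyTable class and its put method are hypothetical names invented for this example, and the choice to split the column name on the first colon is an assumption of the model.

```python
class ToyTable:
    """In-memory stand-in for a table with a fixed set of column families."""

    def __init__(self, name, *families):
        if not families:
            raise ValueError("a table needs at least one column family")
        self.name = name
        self.families = frozenset(families)  # declared once, at creation time
        self.rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        # Split on the first colon only (a simplification chosen for this model).
        family, sep, qualifier = column.partition(":")
        if not sep:
            raise ValueError(f"expected 'family:qualifier', got {column!r}")
        if family not in self.families:
            raise KeyError(f"unknown column family {family!r}")
        self.rows.setdefault(row_key, {})[column] = value


table = ToyTable("weblog", "stats")
# New qualifiers inside a declared family need no schema change.
table.put("row1", "stats:daily", "test-daily-value")
table.put("row1", "stats:weekly", "test-weekly-value")

try:
    # The family 'clicks' was never declared, so this write is rejected.
    table.put("row1", "clicks:daily", "42")
except KeyError as err:
    print("rejected:", err)
```

In this model, adding a new qualifier costs nothing, while adding a family would mean changing the table definition itself, which mirrors why the choice of families deserves care.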

Unlike a relational database, which typically reserves space for a column even in rows where it has no value, HBase simply does not store a column for rows where it has no value. This not only saves space, but also means that different rows need not have the same columns; you can use whatever columns you need for your data on a per-row basis.
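The difference between the two layouts can be sketched with ordinary Python dictionaries. This is a conceptual model only, not how HBase lays out data on disk; the variable names are made up for this example, and the row and column values are the ones used later in the procedure.

```python
# Sparse layout: each row keeps only the columns that were actually written.
sparse_rows = {
    "row1": {"stats:daily": "test-daily-value", "stats:weekly": "test-weekly-value"},
    "row2": {"stats:weekly": "test-weekly-value"},
}

# Fixed layout for comparison: every row carries a slot for every column.
all_columns = sorted({col for cells in sparse_rows.values() for col in cells})
fixed_rows = {
    key: [cells.get(col) for col in all_columns]  # None marks an empty slot
    for key, cells in sparse_rows.items()
}

stored_cells = sum(len(cells) for cells in sparse_rows.values())
fixed_slots = len(fixed_rows) * len(all_columns)

print("columns per row:", {key: sorted(cells) for key, cells in sparse_rows.items()})
print("cells stored (sparse):", stored_cells)
print("slots reserved (fixed):", fixed_slots)
```

Here row2 has no stats:daily entry at all in the sparse layout, whereas the fixed layout holds an empty slot for it. With many columns and few values per row, that gap grows quickly.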

Procedure

  1. Start the HBase shell by typing the following command:
    hbase shell
  2. Create a table called weblog with one column family named stats:
    create 'weblog', 'stats'
  3. Verify that the table was created by listing all tables:
    list
  4. Add a test value to the daily column in the stats column family for row 1:
    put 'weblog', 'row1', 'stats:daily', 'test-daily-value'
  5. Add a test value to the weekly column in the stats column family for row 1:
    put 'weblog', 'row1', 'stats:weekly', 'test-weekly-value'
  6. Add a test value to the weekly column in the stats column family for row 2:
    put 'weblog', 'row2', 'stats:weekly', 'test-weekly-value'
  7. Type scan 'weblog' to display the contents of the table. Sample output:
    ROW                   COLUMN+CELL
     row1                 column=stats:daily, timestamp=1321296699190, value=test-daily-value
     row1                 column=stats:weekly, timestamp=1321296715892, value=test-weekly-value
     row2                 column=stats:weekly, timestamp=1321296787444, value=test-weekly-value
    2 row(s) in 0.0440 seconds
  8. Type get 'weblog', 'row1' to display the contents of row 1. Sample output:
    COLUMN                CELL
     stats:daily          timestamp=1321296699190, value=test-daily-value
     stats:weekly         timestamp=1321296715892, value=test-weekly-value
    2 row(s) in 0.0330 seconds
  9. Type disable 'weblog' to disable the table. A table must be disabled before it can be dropped.
  10. Type drop 'weblog' to drop the table and delete all data.
  11. Type exit to exit the HBase shell.
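
The steps above can also be collected into a single command script. The following Python sketch only builds that script as text; it does not talk to HBase. The helper build_weblog_script is a hypothetical name used for illustration. Passing a script file to the shell (for example, hbase shell followed by the file path) is supported in many HBase versions, but treat that as an assumption to check against your installation.

```python
def build_weblog_script(table="weblog", family="stats"):
    """Return the tutorial's shell commands as one newline-terminated script."""
    puts = [
        ("row1", "daily", "test-daily-value"),
        ("row1", "weekly", "test-weekly-value"),
        ("row2", "weekly", "test-weekly-value"),
    ]
    lines = [f"create '{table}', '{family}'", "list"]
    lines += [
        f"put '{table}', '{row}', '{family}:{col}', '{val}'"
        for row, col, val in puts
    ]
    lines += [
        f"scan '{table}'",
        f"get '{table}', 'row1'",
        f"disable '{table}'",  # the table is disabled first, then dropped
        f"drop '{table}'",
        "exit",
    ]
    return "\n".join(lines) + "\n"


script = build_weblog_script()
print(script, end="")
```

Saving the printed text to a file gives a repeatable version of the walkthrough. Keeping the commands in one place also makes it easy to try a different table or family name by changing the two arguments.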