Example: Create an ORC File in MapR-FS by Storing the Data in a Hive Table and Loading It into Pig
About this task
Procedure
-
Create a sample test data file:
cd /home/mapr
nano test_pig.data
chown mapr:mapr test_pig.data
-
Add data to the file.
John,Smith
Brian,May
Rodger,Taylor
John,Deacon
Max,Plank
Freddie,Mercury
Albert,Einstein
Fedor,Dostoevsky
Lev,Tolstoy
Niccolo,Paganini
NOTE: Do not include any extra lines at the end of the file.
-
Load the test data into a Hive table:
sudo -u mapr hive
hive> create table test_pig(first_name string, last_name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
hive> load data local inpath '/home/mapr/test_pig.data' overwrite into table test_pig;
-
Create a Hive table with ORC storage:
hive> create table test_pig_orc(first_name string, last_name string) stored as orc tblproperties ("orc.compress"="NONE");
hive> insert overwrite table test_pig_orc select * from test_pig;
hive> select * from test_pig_orc;
-
Check that the ORC file was created:
hadoop fs -ls /user/hive/warehouse/test_pig_orc
-
Load the ORC file into Pig:
sudo -u mapr pig
grunt> B = load '/user/hive/warehouse/test_pig_orc/000000_0' using OrcStorage();
grunt> dump B;
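
The sample data file from the first two steps can also be produced without an interactive editor, which makes it easier to avoid a stray blank line at the end of the file. The following is a minimal sketch, not part of the original procedure: the helper name write_test_data and the default scratch location under /tmp are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original procedure): writes the ten
# sample records to the path given as its first argument.
# printf '%s\n' emits each record on its own line, so the file ends with
# exactly one newline and contains no blank lines.
write_test_data() {
    printf '%s\n' \
        'John,Smith' \
        'Brian,May' \
        'Rodger,Taylor' \
        'John,Deacon' \
        'Max,Plank' \
        'Freddie,Mercury' \
        'Albert,Einstein' \
        'Fedor,Dostoevsky' \
        'Lev,Tolstoy' \
        'Niccolo,Paganini' \
        > "$1"
}

# Assumption: default to a scratch location. Pass /home/mapr/test_pig.data
# as the first argument to match the path used in the procedure.
target="${1:-${TMPDIR:-/tmp}/test_pig.data}"
write_test_data "$target"

# Count records and blank lines so an accidental empty line is caught
# before the file is loaded into Hive.
records=$(wc -l < "$target" | tr -d ' ')
blank=$(grep -c '^$' "$target" || true)
echo "records=$records blank_lines=$blank"
```

If the script is run as a user other than mapr, the chown command from the first step still applies to the resulting file. A report of 10 records and 0 blank lines indicates the file matches the sample data above.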