Get Started with Pig
About this task
This task uses Pig to run a MapReduce job that counts the words in the file /in/constitution.txt in the mapr user's directory on the cluster, and stores the results in the directory /user/mapr/wordcount.
Procedure
- Download the ZIP file that contains constitution.txt and then extract the constitution.txt file.
- Load the file onto the cluster and place it in the directory /user/mapr/in.
- In the terminal, type the command pig to start the Pig shell.
- At the grunt> prompt, type the following lines (press ENTER after each). After you type the last line, Pig starts a MapReduce job to count the words in the file constitution.txt.
    A = LOAD '/user/mapr/in' USING TextLoader() AS (words:chararray);
    B = FOREACH A GENERATE FLATTEN(TOKENIZE(*));
    C = GROUP B BY $0;
    D = FOREACH C GENERATE group, COUNT(B);
    STORE D INTO '/user/mapr/wordcount';
- When the MapReduce job is complete, type quit to exit the Pig shell, then look at the contents of the directory /user/mapr/wordcount (the location named in the STORE statement) to see the results.
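The statements typed at the grunt> prompt can also be kept in a script file, which makes the data flow easier to read and rerun. The following is a minimal sketch of the same word count, not a definitive version: the file name wordcount.pig and the aliases lines, words, grouped, and counts are hypothetical names introduced here for illustration, while the input and output paths are the ones used in the procedure.

```pig
-- wordcount.pig (hypothetical file name): the same job as the grunt> session,
-- with descriptive aliases in place of A, B, C, and D.

-- Read every file under /user/mapr/in, one line of text per record.
lines = LOAD '/user/mapr/in' USING TextLoader() AS (line:chararray);

-- TOKENIZE splits each line into a bag of words; FLATTEN turns that bag
-- into one record per word.
words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;

-- Collect identical words into one group each.
grouped = GROUP words BY word;

-- Emit each distinct word with the number of times it occurred.
counts = FOREACH grouped GENERATE group AS word, COUNT(words) AS occurrences;

-- Write the (word, occurrences) pairs to the output directory.
STORE counts INTO '/user/mapr/wordcount';
```

Assuming the script is saved under that name, running pig wordcount.pig from the terminal should start the same MapReduce job without opening the interactive shell.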