Here is my Pig script for counting words:
book = LOAD '/data/pg5000.txt' using PigStorage() as (lines:chararray);
words = foreach book generate FLATTEN(TOKENIZE(lines)) as word;
words_grouped = group words by word;
words_agg = foreach words_grouped generate group as word, COUNT(words);
words_sorted = ORDER words_agg BY $1 DESC;
STORE words_sorted into '/output/result_pig' using PigStorage(':');
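After I start the job with pig wordcount.pig, it completes successfully and the results are in HDFS. However, the console shows the following (the job succeeds, but the total records read and written are both reported as 0):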
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.2.0 0.12.2-SNAPSHOT sridhar 2014-07-07 11:30:50 2014-07-07 11:31:36 GROUP_BY,ORDER_BY
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local1242482660_0003 1 1 0 0 0 0 0 0 0 0 words_sorted ORDER_BY /output/result_pig,
job_local27414860_0002 1 1 0 0 0 0 0 0 0 0 words_sorted SAMPLER
job_local490629346_0001 1 1 0 0 0 0 0 0 0 0 book,words,words_agg,words_grouped GROUP_BY,COMBINER
Input(s):
Successfully read 0 records from: "/data/pg5000.txt"
Output(s):
Successfully stored 0 records in: "/output/result_pig"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local490629346_0001 -> job_local27414860_0002,
job_local27414860_0002 -> job_local1242482660_0003,
job_local1242482660_0003
2014-07-07 11:31:36,288 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
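One way to check whether the output is really empty, independently of the console counters, is to load the stored result back and count its records. This is only a minimal sketch, assuming the output sits under /output/result_pig as in the script above; the aliases result, result_all and result_count are illustrative names, not part of the original script.

```pig
-- Load the stored word counts back; no schema is declared, so each line is
-- simply split on ':' (the delimiter used by the STORE statement above).
result = LOAD '/output/result_pig' USING PigStorage(':');

-- Put every record into a single group so they can be counted together.
result_all = GROUP result ALL;

-- COUNT_STAR counts all tuples, including ones whose first field is null.
result_count = FOREACH result_all GENERATE COUNT_STAR(result);

-- Print the record count to the console.
DUMP result_count;
```

If this prints a non-zero count, the data was written and only the reported statistics are off. Note that if the output directory is empty, GROUP ... ALL produces no group at all, so DUMP prints nothing rather than (0).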