Records missing from the output when numReduceTasks = 0 is set in Hadoop Streaming

Time: 2012-01-16 00:13:54

Tags: hadoop mapreduce hadoop-streaming

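As the title says, records are missing from the output when I run a map-only job. Could you suggest what the problem might be?

Command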

hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2-cdh3u2.jar \
-input /usr/pkansal/ex2/output \
-output /usr/pkansal/ex2/output2 \
-mapper /home/cloudera/ex2/kMerFreqMap2.py \
-file /home/cloudera/ex2/kMerFreqMap2.py \
-numReduceTasks 0 (if I comment out this line, everything works fine)

Input

3 chr1:1,chr1:3,chr1:5

1 chr1:7

2 chr1:2,chr1:4

1 chr1:6

Expected output

chr1 1 3

chr1 3 3

chr1 5 3

chr1 7 1

chr1 2 2

chr1 4 2

chr1 6 1

Actual output

chr1 2 2

chr1 4 2

chr1 6 1
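
The mapper script kMerFreqMap2.py is not shown in the question. For reference, here is a minimal sketch of a mapper that would turn the input above into the expected output. It is not the original script. The names expand_record and main are hypothetical, and it assumes whitespace-separated input fields and tab-separated output fields.

```python
#!/usr/bin/env python
import sys


def expand_record(line):
    """Expand '3 chr1:1,chr1:3,chr1:5' into one (chrom, pos, count) tuple per location."""
    # Assumption: the count and the location list are separated by whitespace.
    fields = line.split(None, 1)
    if len(fields) != 2:
        return []  # skip blank or malformed lines
    count, locations = fields
    records = []
    for location in locations.strip().split(","):
        chrom, _, pos = location.partition(":")
        records.append((chrom, pos, count))
    return records


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read records from stdin and write one output line per location."""
    for line in stdin:
        for record in expand_record(line):
            # Assumption: output fields are tab-separated.
            stdout.write("\t".join(record) + "\n")
    stdout.flush()
```

To use this as a streaming mapper, the script would end with a call to main() and would need to be executable. It can be checked locally, without Hadoop, by piping the sample input into it.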

0 answers:

No answers yet.