Hadoop outputCollector

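Time: 2011-10-10 23:30:00

Tags: hadoop

I have a MapReduce program that works fine; the signatures of its map and reduce functions are shown further down. The OutputCollector call currently looks like this: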

output.collect(newtext, new IntWritable(someIntegerValue)); // e.g. 5; works OK

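I need to change it to handle/output double values (I need to divide two integers and get a double result). I tried changing the OutputCollector call to the following: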

output.collect(newtext, new DoubleWritable(someDoubleValue)); // e.g. 5.1

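but then I ran into problems compiling/running it. I would like to keep changes to the Map and Reduce signatures to a minimum, since the program works fine and I only need to output a double instead of an integer.

Here are the current Map and Reduce signatures, which work correctly: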

class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> 

map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException

public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> 

public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {

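2 Answers:

Answer 0 (score: 2)

Don't forget that you also need to specify the output classes when configuring the job. For example, you need to write: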

conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(DoubleWritable.class);

Otherwise it will complain like this:

"type mismatch value from map: expected org.apache.hadoop.io.IntWritable,
 recieved org.apache.hadoop.io.DoubleWritable"

Answer 1 (score: 1)

From your comment, it looks like you haven't changed the signatures anywhere. You need to change them to the following:

class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, DoubleWritable> 

map(LongWritable key, Text value, OutputCollector<Text, DoubleWritable> output, Reporter reporter) throws IOException

public static class Reduce extends MapReduceBase implements Reducer<Text, DoubleWritable, Text, DoubleWritable> 

public void reduce(Text key, Iterator<DoubleWritable> values, OutputCollector<Text, DoubleWritable> output, Reporter reporter) throws IOException {
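
One detail worth checking once the signatures use DoubleWritable: the question mentions dividing two integers to get a double, and in Java dividing two int values truncates unless one operand is cast to double first. A minimal sketch in plain Java, with no Hadoop dependency; the class name RatioExample and the helper ratio are hypothetical and not part of the original program:

```java
public class RatioExample {
    // Hypothetical helper: divides two ints and returns a double result.
    static double ratio(int numerator, int denominator) {
        // Cast one operand before dividing; otherwise Java performs
        // integer division and discards the fractional part.
        return (double) numerator / denominator;
    }

    public static void main(String[] args) {
        int truncated = 51 / 10;       // integer division, fraction is dropped
        double exact = ratio(51, 10);  // floating-point division
        System.out.println("int division:    " + truncated);
        System.out.println("double division: " + exact);
    }
}
```

Inside the reducer, a value computed this way could then be wrapped as new DoubleWritable(value) and passed to output.collect.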