Question

I have a MapReduce program that works correctly; the map and reduce function signatures are shown below. The OutputCollector call is currently:

output.collect(newtext, new IntWritable(someintegervalue)); // e.g. 5; works ok

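I need to change it to handle and output double values (I need to divide two integers and get a double result). I tried changing the OutputCollector call to the following: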

output.collect(newtext, new DoubleWritable(somedoublevalue)); // e.g. 5.1

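but it then fails to compile/run. I would like to keep changes to the Map and Reduce signatures to a minimum, since the program works fine and I only need to output a double instead of an integer.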

Here are the current Map and Reduce signatures, which work correctly.

class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> 

map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException

public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> 

public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {

Answer 1

Don't forget that you also need to specify the output classes when configuring the job. For example, you need to write:

conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(DoubleWritable.class);

Otherwise it will complain like this:

"type mismatch value from map: expected org.apache.hadoop.io.IntWritable,
 recieved org.apache.hadoop.io.DoubleWritable"

Answer 2

From your comments, it sounds like you haven't changed the signatures anywhere. You need to change them to the following:

class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, DoubleWritable> 

map(LongWritable key, Text value, OutputCollector<Text, DoubleWritable> output, Reporter reporter) throws IOException

public static class Reduce extends MapReduceBase implements Reducer<Text, DoubleWritable, Text, DoubleWritable> 

public void reduce(Text key, Iterator<DoubleWritable> values, OutputCollector<Text, DoubleWritable> output, Reporter reporter) throws IOException {
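One related detail, since the question mentions dividing two integers to get a double: in Java, dividing an int by an int truncates, so one operand has to be cast to double before the division and the result can then be wrapped in a DoubleWritable. Below is a minimal sketch in plain Java with no Hadoop dependency; the class name RatioSketch and the helper ratio are hypothetical names chosen for illustration, not part of the original program.

```java
// Sketch only: RatioSketch and ratio() are hypothetical names for illustration.
class RatioSketch {
    // Cast one operand to double first; otherwise int / int truncates.
    static double ratio(int numerator, int denominator) {
        return (double) numerator / denominator;
    }

    public static void main(String[] args) {
        int total = 51;
        int count = 10;

        // Integer division: the fractional part is discarded.
        System.out.println(total / count);        // prints 5

        // Floating-point division after the cast.
        System.out.println(ratio(total, count));  // prints 5.1
    }
}
```

In the reducer, the value returned by such a helper is presumably what you would pass to new DoubleWritable(...) in the output.collect call.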

Hadoop outputCollector
