Question

I'm using Jimmy Lin's GitHub repository [1] in my project. However, I noticed that ArrayListOfDoublesWritable is returning DoubleWritable. This is not a problem when I use it in the Reduce phase, i.e.:

public static class Reduce extends Reducer<Text,DoubleWritable,Text,ArrayListOfDoublesWritable>

The reason is that I can pass DoubleWritable.class as the argument to setOutputValueClass.

But this does not seem to work when I use it in the Map phase. Hadoop complains that it expected DoubleWritable but actually received ArrayListOfDoublesWritable.

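Is there a way to set the map output value class to something different from the declared one? I have looked at the setMapOutputValueClass method, but it does not seem to be the solution to this problem.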

- Driver -

Configuration conf = new Configuration(); 
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs(); // get all args
if (otherArgs.length != 2) {
  System.err.println("Usage: WordCount <in> <out>");
  System.exit(2);
}


Job job = new Job(conf, "job 1");
job.setJarByClass(Kira.class);

job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);

// set output key type
job.setOutputKeyClass(Text.class);
// set output value type
job.setOutputValueClass(DoubleWritable.class);

// set the HDFS path of the input data
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
// set the HDFS path for the output
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

// wait until the job completes
System.exit(job.waitForCompletion(true) ? 0 : 1);

Note that my output value class is set to DoubleWritable.class, even though I declared ArrayListOfDoublesWritable.

How can I do the same for the Mapper?
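
To make the Map-phase complaint concrete, here is a minimal plain-Java sketch of the kind of check that seems to be involved: a collector that is configured with one value class and rejects any value whose runtime class is not exactly that class. It does not use Hadoop at all. DoubleValue, DoubleListValue, Collector and accepts are hypothetical stand-ins invented for illustration, and the assumption that the map-side check is an exact class comparison (and the wording of the message) is mine, not something confirmed by the post.

```java
import java.io.IOException;

class MapOutputTypeCheck {
    // Hypothetical stand-ins for DoubleWritable and ArrayListOfDoublesWritable.
    // They are NOT the real Hadoop or library classes.
    static class DoubleValue {}
    static class DoubleListValue {}

    // Hypothetical collector: remembers the configured value class and
    // rejects any value whose runtime class differs from it.
    static class Collector {
        private final Class<?> valueClass;

        Collector(Class<?> valueClass) {
            this.valueClass = valueClass;
        }

        void collect(Object value) throws IOException {
            // Assumed behaviour: exact class comparison, no subclass tolerance.
            if (value.getClass() != valueClass) {
                throw new IOException("Type mismatch in value from map: expected "
                        + valueClass.getName() + ", received "
                        + value.getClass().getName());
            }
        }
    }

    // Returns true if a collector configured with 'configured' accepts 'value'.
    static boolean accepts(Class<?> configured, Object value) {
        try {
            new Collector(configured).collect(value);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Configured for the scalar type, but the "mapper" emits the list type.
        System.out.println(accepts(DoubleValue.class, new DoubleListValue())); // false
        // Configured class matches the emitted class.
        System.out.println(accepts(DoubleListValue.class, new DoubleListValue())); // true
    }
}
```

If the real job behaves like this sketch, the class passed to setMapOutputValueClass would have to name exactly the class the Mapper writes. In the driver above no map output class is set, which may be why the map side ends up expecting DoubleWritable.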

Emitting a value class different from the declared one [MapReduce]

0 answers: