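The Hadoop framework needs to know the output data types of the Mapper and Reducer so that it can create instances of those types at runtime, both when deserializing the values passed between the Mapper and the Reducer and when serializing instances from the Reducer to the output file. We therefore have to tell the Hadoop framework about the output data types on the job object, like this: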
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(Text.class);
It is not possible to infer the types at runtime from the class definitions of the Mapper and Reducer since Java Generics uses Type Erasure.
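To make the erasure claim concrete, here is a minimal sketch that uses only the standard library. The class name ErasureDemo and its methods are hypothetical and not part of Hadoop. It only shows that two instances of a generic type with different type arguments share a single runtime class, so the type arguments cannot be read back from the instances themselves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Hadoop.
public class ErasureDemo {

    // Compares the runtime classes of two lists declared with different type arguments.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // After compilation both objects are plain ArrayList instances.
        return strings.getClass() == numbers.getClass();
    }

    // Returns the runtime class name of a list declared as List<String>.
    static String runtimeClassName() {
        List<String> strings = new ArrayList<>();
        return strings.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());   // prints "true"
        System.out.println(runtimeClassName());   // prints "java.util.ArrayList"
    }
}
```

Neither call reveals String or Integer, which is the behavior the statement above refers to.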
Suppose this is the mapper class:
public static class SelectClauseMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Skip the header row; emit the selected columns of every other row.
        if (!AirlineDataUtils.isHeader(value)) {
            StringBuilder output = AirlineDataUtils.mergeStringArray(
                    AirlineDataUtils.getSelectResultsPerRow(value),
                    ",");
            context.write(NullWritable.get(), new Text(output.toString()));
        }
    }
}
Can someone explain why the output types cannot be determined at runtime in the example above?