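I am running a Hadoop job that uses a custom partitioner class. However, when I run the command
hadoop jar Sort.jar SecondarySort inputdir outputdir
I get a runtime error saying
class KeyPartitioner not org.apache.hadoop.mapred.Partitioner.
I have made sure that the KeyPartitioner class extends the Partitioner class, so why does this error occur?
Here is the driver code: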
JobConf conf = new JobConf(getConf(), SecondarySort.class);
conf.setJobName(SecondarySort.class.getName());
conf.setJarByClass(SecondarySort.class);
conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
conf.setMapOutputKeyClass(StockKey.class);
conf.setMapOutputValueClass(Text.class);
conf.setPartitionerClass((Class<? extends Partitioner<StockKey, DoubleWritable>>) KeyPartitioner.class);
conf.setMapperClass((Class<? extends Mapper<LongWritable, Text, StockKey, DoubleWritable>>) StockMapper.class);
conf.setReducerClass((Class<? extends Reducer<StockKey, DoubleWritable, Text, Text>>) StockReducer.class);
Here is the code for the partitioner class:
public class KeyPartitioner extends Partitioner<StockKey, Text> {
@Override
public int getPartition(StockKey arg0, Text arg1, int arg2) {
int partition = arg0.name.hashCode() % arg2;
return partition;
}
}
Answer 0 (score: 1)
Note that Hadoop has two Partitioner types:
org.apache.hadoop.mapreduce.Partitioner
org.apache.hadoop.mapred.Partitioner
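Make sure your KeyPartitioner class implements the second one, which is an interface in the old mapred API, rather than extending the first one, which is an abstract class in the new mapreduce API. The driver is built on JobConf, and JobConf.setPartitionerClass expects the old-API type, which is why the cast compiles but fails at runtime.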
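As a rough illustration of "implement the interface rather than extend the abstract class", here is a minimal sketch that compiles without Hadoop on the classpath. OldApiPartitioner is a hypothetical stand-in that only mirrors the getPartition shape of org.apache.hadoop.mapred.Partitioner, the StockKey shown is a simplified assumption (only its name field is taken from the question), and String stands in for Text. The sign-bit mask is an addition not present in the question's code: String.hashCode() can be negative, and a negative remainder is not a valid partition index.

```java
// Hypothetical stand-in mirroring the method shape of
// org.apache.hadoop.mapred.Partitioner (old API). The real interface also
// inherits configure(JobConf) from JobConfigurable, which must be implemented
// too (an empty body is enough if no configuration is needed).
interface OldApiPartitioner<K, V> {
    int getPartition(K key, V value, int numPartitions);
}

// Simplified assumption: the real StockKey would be a WritableComparable.
class StockKey {
    final String name;

    StockKey(String name) {
        this.name = name;
    }
}

// "implements", not "extends": the old-API Partitioner is an interface.
class KeyPartitioner implements OldApiPartitioner<StockKey, String> {
    @Override
    public int getPartition(StockKey key, String value, int numPartitions) {
        // Clear the sign bit before taking the remainder so the result
        // always falls in the range [0, numPartitions).
        return (key.name.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

class PartitionDemo {
    public static void main(String[] args) {
        KeyPartitioner partitioner = new KeyPartitioner();
        for (String name : new String[] {"AAPL", "GOOG", "MSFT"}) {
            int partition = partitioner.getPartition(new StockKey(name), "", 4);
            System.out.println(name + " -> partition " + partition);
        }
    }
}
```

In the real job, the class declaration would presumably read `implements org.apache.hadoop.mapred.Partitioner<StockKey, Text>` with the imports switched to the mapred package, and the value type should match whatever the mapper actually emits.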
Edit: you must also set the input and output directories:
FileInputFormat.addInputPath(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));