How do I pass custom parameters to my Hadoop MapReduce job?
For example, if in my driver I have:
public static void main(String[] args) throws Exception {
    try {
        String one = args[0];
        String two = args[1];
        System.out.println(two);
        System.out.println(one);
    } catch (ArrayIndexOutOfBoundsException e) {
        System.out.println("ArrayIndexOutOfBoundsException caught");
    }

    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[2]));
    FileOutputFormat.setOutputPath(job, new Path(args[3]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
After building the JAR, when I run the command:
hadoop jar str1 str2 /home/bli1/wordcount/wc.jar /user/bli1/wordcount/input /user/bli1/wordcount/testout
I get:
Not a valid JAR: /nfsdata/DSCluster/home/bli1/wordcount/str1
Answer (score: 1):
The arguments need to come after the JAR file reference. `hadoop jar` treats its first argument as the path to the JAR, so in your command it tried to load `str1` as the JAR file, which is exactly the "Not a valid JAR" error you saw. For example:
hadoop jar /home/bli1/wordcount/wc.jar str1 str2 /user/bli1/wordcount/input /user/bli1/wordcount/testout
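With the corrected order, str1 and str2 arrive in main() as args[0] and args[1], but that only gets them into the driver. Map and reduce tasks run in separate JVMs, so if the tasks themselves need those values, a common pattern is to store them in the job Configuration before the Job is created and read them back in the mapper's setup(). Below is a minimal sketch of that pattern, not the poster's actual code; the property names wordcount.param.one and wordcount.param.two are invented for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private String one; // value of args[0], delivered via the Configuration

        @Override
        protected void setup(Context context) {
            // Read the custom parameter back out of the task's Configuration.
            one = context.getConfiguration().get("wordcount.param.one");
        }

        // map(...) stays a normal word-count map and can now use `one`.
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Set the custom parameters BEFORE creating the Job: Job.getInstance
        // copies the Configuration, and that copy is what ships to the tasks.
        conf.set("wordcount.param.one", args[0]);
        conf.set("wordcount.param.two", args[1]);

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[2]));
        FileOutputFormat.setOutputPath(job, new Path(args[3]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Relatedly, if the driver implements Hadoop's Tool interface and is launched through ToolRunner, the same configuration properties can be set straight from the command line with the generic -D name=value options, with no positional-argument handling in main() at all.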