How do I pass custom parameters to my Hadoop MapReduce job?
For example, if in my driver I have:
public static void main(String[] args) throws Exception {
    try {
        String one = args[0];
        String two = args[1];
        System.out.println(two);
        System.out.println(one);
    } catch (ArrayIndexOutOfBoundsException e) {
        System.out.println("ArrayIndexOutOfBoundsException caught");
    }

    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[2]));
    FileOutputFormat.setOutputPath(job, new Path(args[3]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
After building the JAR, when I run the command:
hadoop jar str1 str2 /home/bli1/wordcount/wc.jar /user/bli1/wordcount/input /user/bli1/wordcount/testout
I get:
Not a valid JAR: /nfsdata/DSCluster/home/bli1/wordcount/str1
Answer (score: 1):
The arguments need to come after the JAR file reference. `hadoop jar` treats its first argument as the path to the JAR, so in your command it tried to load `str1` as the JAR file, which is exactly the "Not a valid JAR" error you saw. For example:
hadoop jar /home/bli1/wordcount/wc.jar str1 str2 /user/bli1/wordcount/input /user/bli1/wordcount/testout
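With the corrected order, str1 and str2 arrive in main() as args[0] and args[1], but that only gets them into the driver. Map and reduce tasks run in separate JVMs, so if the tasks themselves need those values, a common pattern is to store them in the job Configuration before the Job is created and read them back in the mapper's setup(). Below is a minimal sketch of that pattern, not the poster's actual code; the property names wordcount.param.one and wordcount.param.two are invented for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private String one; // value of args[0], delivered via the Configuration

        @Override
        protected void setup(Context context) {
            // Read the custom parameter back out of the task's Configuration.
            one = context.getConfiguration().get("wordcount.param.one");
        }

        // map(...) stays a normal word-count map and can now use `one`.
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Set the custom parameters BEFORE creating the Job: Job.getInstance
        // copies the Configuration, and that copy is what ships to the tasks.
        conf.set("wordcount.param.one", args[0]);
        conf.set("wordcount.param.two", args[1]);

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[2]));
        FileOutputFormat.setOutputPath(job, new Path(args[3]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Relatedly, if the driver implements Hadoop's Tool interface and is launched through ToolRunner, the same configuration properties can be set straight from the command line with the generic -D name=value options, with no positional-argument handling in main() at all.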