My job configuration is below. I am trying to do a simple two-step chaining of my Hadoop jobs:
public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    if (args.length != 2) {
        System.err.println("Usage: moviecount3 <in> <out>");
        System.exit(2);
    }
    ConfigurationUtil.dumpConfigurations(conf, System.out);
    LOG.info("input: " + args[0] + " output: " + args[1]);
    Job job = new Job(conf, "movie count2");
    job.setJarByClass(MovieCount3.class);
    job.setMapperClass(MovieTokenizerMapper3.class);
    job.setReducerClass(MovieActorReducer3.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    boolean result = job.waitForCompletion(true);
    //Job 2
    Job job2 = new Job(conf, "movie count22");
    if(job.isSuccessful()){
        job2.setJarByClass(MovieCount3.class);
        job2.setMapperClass(MovieActorCombiner3.class);
        //job2.setReducerClass(MovieActorReducer3.class);
        job2.setMapOutputKeyClass(Text.class);
        job2.setMapOutputValueClass(IntWritable.class);
        job2.setOutputKeyClass(IntWritable.class);
        job2.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path("/user/test/output/part-r-00000"));
        FileOutputFormat.setOutputPath(job, new Path("/user/test/output2"));
    }
    boolean result2 = job2.waitForCompletion(true);
    return (result2) ? 0 : 1;
}
When I run this configuration, I get the following exception:
12/11/22 14:21:30 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/media/LinuxDrive/hdfs-test/mapred/staging/test/.staging/job_201211221353_0010
12/11/22 14:21:30 ERROR security.UserGroupInformation: PriviledgedActionException as:test
cause:org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:128)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:887)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
at ucsc.hadoop.mapreduce.movie.MovieCount3.run(MovieCount3.java:77)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at ucsc.hadoop.mapreduce.movie.MovieCount3.main(MovieCount3.java:84)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at ucsc.hadoop.mapreduce.ExampleDriver.main(ExampleDriver.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Thanks!
Answer 0 (score: 2)
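The problem is that you never set an output directory for the second job. Inside the second block, the input and output paths are applied to the first job (job) again instead of job2:
Wrong: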
FileInputFormat.addInputPath(job, new Path("/user/test/output/part-r-00000"));
FileOutputFormat.setOutputPath(job, new Path("/user/test/output2"));
Right:
FileInputFormat.addInputPath(job2, new Path("/user/test/output/part-r-00000"));
FileOutputFormat.setOutputPath(job2, new Path("/user/test/output2"));
That should solve your problem.
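To illustrate why the exception complains about the output directory, here is a minimal sketch in plain Java with no Hadoop dependency. `ChainSketch`, `SketchJob`, and the key `sketch.output.dir` are hypothetical stand-ins, not Hadoop classes or properties. The one assumption carried over from the question is that each job keeps its own copy of the configuration, so a path set on `job` is never visible to `job2`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical driver comparing the chain from the question with the corrected one. */
final class ChainSketch {

    /** Reproduces the mistake: the second job's output path is set on the first job. */
    static boolean buggyChainSubmits() {
        Map<String, String> base = new HashMap<>();
        SketchJob job = new SketchJob(base, "movie count2");
        SketchJob job2 = new SketchJob(base, "movie count22");
        job.setOutputPath("/user/test/output");
        job.submit();
        job.setOutputPath("/user/test/output2"); // wrong object: job2 stays unconfigured
        return trySubmit(job2);
    }

    /** Corrected chain: the second job's output path is set on job2 itself. */
    static boolean fixedChainSubmits() {
        Map<String, String> base = new HashMap<>();
        SketchJob job = new SketchJob(base, "movie count2");
        SketchJob job2 = new SketchJob(base, "movie count22");
        job.setOutputPath("/user/test/output");
        job.submit();
        job2.setOutputPath("/user/test/output2");
        return trySubmit(job2);
    }

    /** Returns true if the job passes the output check, false if the check rejects it. */
    private static boolean trySubmit(SketchJob j) {
        try {
            j.submit();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy chain submits job2: " + buggyChainSubmits());
        System.out.println("fixed chain submits job2: " + fixedChainSubmits());
    }
}

/** Hypothetical stand-in for a job; it only models a per-job configuration copy. */
final class SketchJob {
    static final String OUTPUT_DIR_KEY = "sketch.output.dir"; // made-up key name

    private final String name;
    private final Map<String, String> conf;

    SketchJob(Map<String, String> baseConf, String name) {
        this.conf = new HashMap<>(baseConf); // private copy: later settings stay per-job
        this.name = name;
    }

    void setOutputPath(String path) {
        conf.put(OUTPUT_DIR_KEY, path);
    }

    /** Models a pre-submission check: a job without an output path is rejected. */
    void submit() {
        if (conf.get(OUTPUT_DIR_KEY) == null) {
            throw new IllegalStateException("Output directory not set. (job: " + name + ")");
        }
    }
}
```

In the sketch the first chain is rejected at `job2.submit()` and the second one passes, which mirrors the change from `job` to `job2` in the two corrected lines above.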