I'm using MRv1 from CDH4 (4.5) and am running into a problem with CompositeInputFormat. It doesn't matter how many inputs I try to join; for simplicity, here is an example with just one input:
Configuration conf = new Configuration();
Job job = new Job(conf, "Blah");
job.setJarByClass(Blah.class);
job.setMapperClass(Blah.BlahMapper.class);
job.setReducerClass(Blah.BlahReducer.class);
job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(BlahElement.class);
job.setOutputKeyClass(LongWritable.class);
job.setOutputValueClass(BlahElement.class);
job.setInputFormatClass(CompositeInputFormat.class);
String joinStatement = CompositeInputFormat.compose("inner", SequenceFileInputFormat.class, "/someinput");
System.out.println(joinStatement);
conf.set("mapred.join.expr", joinStatement);
job.setOutputFormatClass(SequenceFileOutputFormat.class);
FileOutputFormat.setOutputPath(job, new Path(newoutput));
return job.waitForCompletion(true) ? 0 : 1;
Here is the output plus the stack trace:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop2/share/hadoop/mapreduce1/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop2/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
14/01/31 03:27:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
inner(tbl(org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat,"/someinput"))
14/01/31 03:27:48 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/31 03:27:51 INFO mapred.JobClient: Cleaning up the staging area hdfs://archangel-desktop:54310/tmp/hadoop/mapred/staging/hadoop/.staging/job_201401302213_0013
14/01/31 03:27:51 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException: Expression is null
Exception in thread "main" java.io.IOException: Expression is null
at org.apache.hadoop.mapreduce.lib.join.Parser.parse(Parser.java:542)
at org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat.setFormat(CompositeInputFormat.java:85)
at org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat.getSplits(CompositeInputFormat.java:127)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1079)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1096)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:177)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:995)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:948)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:948)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at com.nileshc.graphfu.pagerank.BlockMatVec.run(BlockMatVec.java:79)
at com.nileshc.graphfu.Main.main(Main.java:21)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Has anyone run into this before? Any ideas on how to fix it?
Answer 0 (score: 0)
My bad.
conf.set("mapred.join.expr", joinStatement);
The line above should be:
job.getConfiguration().set("mapreduce.join.expr", joinStatement);
and
String joinStatement = CompositeInputFormat.compose("inner", SequenceFileInputFormat.class, "/someinput");
^^ should be:
String joinStatement = CompositeInputFormat.compose("inner", SequenceFileInputFormat.class, new Path("/someinput"));
It was the first change that actually made the difference.
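Putting both changes together, here is a minimal sketch of the corrected driver body (same new-API classes, Blah/BlahElement names, /someinput path and newoutput variable as in the question; treat it as a sketch, not a tested drop-in):
Configuration conf = new Configuration();
Job job = new Job(conf, "Blah");
job.setJarByClass(Blah.class);
job.setMapperClass(Blah.BlahMapper.class);
job.setReducerClass(Blah.BlahReducer.class);
job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(BlahElement.class);
job.setOutputKeyClass(LongWritable.class);
job.setOutputValueClass(BlahElement.class);
job.setInputFormatClass(CompositeInputFormat.class);
// Build the join expression from Path arguments rather than a raw String.
String joinStatement = CompositeInputFormat.compose("inner", SequenceFileInputFormat.class, new Path("/someinput"));
// Set the expression on the Job's own Configuration; the local conf was copied when the Job was constructed.
job.getConfiguration().set("mapreduce.join.expr", joinStatement);
job.setOutputFormatClass(SequenceFileOutputFormat.class);
FileOutputFormat.setOutputPath(job, new Path(newoutput));
return job.waitForCompletion(true) ? 0 : 1;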
Answer 1 (score: 0)
In the code above,
conf.set("mapred.join.expr", joinStatement);
the line above runs after the Job object has been created, so the Job object never sees this configuration setting!
See the revised code below:
Configuration conf = new Configuration();
conf.set("mapred.join.expr", joinStatement);
Job job = new Job(conf, "Blah");
job.setJarByClass(Blah.class);
.
.
.
.
.
Here is another way:
job.getConfiguration().set("mapreduce.join.expr", joinStatement);
Use the line above instead of
conf.set("mapred.join.expr", joinStatement);