在Map中我读了Hdfs文件更新到Hbase,
版本:hadoop 2.5.1 hbase 1.0.0
例外情况如下:
Exception in thread "main" org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
可能
有问题job.setOutputFormatClass(TableOutputFormat.class);
此行提示:
The method setOutputFormatClass(Class<? extends OutputFormat>) in the type Job is not applicable for the arguments (Class<TableOutputFormat>)
代码如下:
public class HdfsAppend2HbaseUtil extends Configured implements Tool{
public static class HdfsAdd2HbaseMapper extends Mapper<Text, Text, ImmutableBytesWritable, Put>{
public void map(Text ikey, Text ivalue, Context context)
throws IOException, InterruptedException {
String oldIdList = HBaseHelper.getValueByKey(ikey.toString());
StringBuffer sb = new StringBuffer(oldIdList);
String newIdList = ivalue.toString();
sb.append("\t" + newIdList);
Put p = new Put(ikey.toString().getBytes());
p.addColumn("idFam".getBytes(), "idsList".getBytes(), sb.toString().getBytes());
context.write(new ImmutableBytesWritable(), p);
}
}
public int run(String[] paths) throws Exception {
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "master,salve1");
conf.set("hbase.zookeeper.property.clientPort", "2181");
Job job = Job.getInstance(conf,"AppendToHbase");
job.setJarByClass(cn.edu.hadoop.util.HdfsAppend2HbaseUtil.class);
job.setInputFormatClass(KeyValueTextInputFormat.class);
job.setMapperClass(HdfsAdd2HbaseMapper.class);
job.setNumReduceTasks(0);
job.setOutputFormatClass(TableOutputFormat.class);
job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "reachableTable");
FileInputFormat.setInputPaths(job, new Path(paths[0]));
job.setOutputKeyClass(ImmutableBytesWritable.class);
job.setOutputValueClass(Put.class);
return job.waitForCompletion(true) ? 0 : 1;
}
public static void main(String[] args) throws Exception {
System.out.println("Append Start: ");
long time1 = System.currentTimeMillis();
long time2;
String[] pathsStr = {Const.TwoDegreeReachableOutputPathDetail};
int exitCode = ToolRunner.run(new HdfsAppend2HbaseUtil(), pathsStr);
time2 = System.currentTimeMillis();
System.out.println("Append Cost " + "\t" + (time2-time1)/1000 +" s");
System.exit(exitCode);
}
}
答案 0 :(得分:0)
你没有提到输出目录,就像输入路径一样写输出。
像这样提到它。
FileOutputFormat.setOutputPath(job, new Path(<output path>));
答案 1 :(得分:0)
最后,我知道为什么,就像我认为有问题:
job.setOutputFormatClass(TableOutputFormat.class);
此行提示:
The method setOutputFormatClass(Class<? extends OutputFormat>) in the type Job is not applicable for the arguments (Class<TableOutputFormat>)
实际上我们需要导入
org.apache.hadoop.hbase.mapreduce.TableOutputFormat
不导入
org.apache.hadoop.hbase.mapred.TableOutputFormat
前者扩展自org.apache.hadoop.mapred.FileOutputFormat
请参阅: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapred/TableOutputFormat.html
后来延伸自 org.apache.hadoop.mapreduce.OutputFormat
请参阅:
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html
最后非常感谢大家!!!