Writing plain text from HDFS to HBase fails with "Output directory not set"

Time: 2016-04-21 09:03:25

Tags: java hadoop hbase

In my mapper I read a file from HDFS and write the updates to HBase.

Versions: hadoop 2.5.1, hbase 1.0.0

The exception is as follows:

Exception in thread "main" org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.

I suspect the problem is in this line:
job.setOutputFormatClass(TableOutputFormat.class); 

The compiler reports the following for this line:

The method setOutputFormatClass(Class<? extends OutputFormat>) in the type Job is not applicable for the arguments (Class<TableOutputFormat>)

The code is as follows:

public class HdfsAppend2HbaseUtil extends Configured implements Tool{

    public static class HdfsAdd2HbaseMapper extends Mapper<Text, Text, ImmutableBytesWritable, Put>{

        public void map(Text ikey, Text ivalue, Context context) 
                throws IOException, InterruptedException {

            String oldIdList = HBaseHelper.getValueByKey(ikey.toString());

            StringBuffer sb = new StringBuffer(oldIdList);
            String newIdList = ivalue.toString();
            sb.append("\t" + newIdList);

            Put p = new Put(ikey.toString().getBytes());
            p.addColumn("idFam".getBytes(), "idsList".getBytes(), sb.toString().getBytes());
            context.write(new ImmutableBytesWritable(), p);

        }

    }

    public int run(String[] paths) throws Exception {

        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "master,salve1");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        Job job = Job.getInstance(conf,"AppendToHbase");
        job.setJarByClass(cn.edu.hadoop.util.HdfsAppend2HbaseUtil.class);

        job.setInputFormatClass(KeyValueTextInputFormat.class);

        job.setMapperClass(HdfsAdd2HbaseMapper.class);
        job.setNumReduceTasks(0);

        job.setOutputFormatClass(TableOutputFormat.class); 

        job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "reachableTable");

        FileInputFormat.setInputPaths(job, new Path(paths[0]));


        job.setOutputKeyClass(ImmutableBytesWritable.class);
        job.setOutputValueClass(Put.class);


        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {

        System.out.println("Append Start: ");

        long time1 = System.currentTimeMillis();
        long time2;
        String[] pathsStr = {Const.TwoDegreeReachableOutputPathDetail};

        int exitCode = ToolRunner.run(new HdfsAppend2HbaseUtil(), pathsStr);
        time2 = System.currentTimeMillis();
        System.out.println("Append Cost " + "\t" + (time2-time1)/1000 +" s");

        System.exit(exitCode);
    }
}

2 answers:

Answer 0: (score: 0)

You have not specified an output directory. Set the output path the same way you set the input path, like this:

FileOutputFormat.setOutputPath(job, new Path(<output path>));

Answer 1: (score: 0)

I finally found the cause. As I suspected, the problem was in this line:

job.setOutputFormatClass(TableOutputFormat.class); 

The compiler reports the following for this line:

The method setOutputFormatClass(Class<? extends OutputFormat>) in the type Job is not applicable for the arguments (Class<TableOutputFormat>)

The fix is to import

org.apache.hadoop.hbase.mapreduce.TableOutputFormat

rather than

org.apache.hadoop.hbase.mapred.TableOutputFormat

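The mapred class (the one I had imported by mistake) extends org.apache.hadoop.mapred.FileOutputFormat, which belongs to the old API and is not a subclass of the OutputFormat type that Job expects.

See: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapred/TableOutputFormat.html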

The mapreduce class (the correct one) extends org.apache.hadoop.mapreduce.OutputFormat, so Job.setOutputFormatClass accepts it.

See:

https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html
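
To illustrate why the compiler rejects the wrong import, here is a minimal sketch that needs no Hadoop or HBase on the classpath. Every class in it is a hypothetical stand-in that only mirrors the inheritance relationships described above; the names NewApiOutputFormat, OldApiFileOutputFormat, NewApiTableOutputFormat and OldApiTableOutputFormat are made up for this example and are not real library types.

```java
public class OutputFormatBoundDemo {

    // Stand-in for org.apache.hadoop.mapreduce.OutputFormat (new API base class).
    static abstract class NewApiOutputFormat {}

    // Stand-in for org.apache.hadoop.mapred.FileOutputFormat (old API base class).
    static abstract class OldApiFileOutputFormat {}

    // Stand-in for org.apache.hadoop.hbase.mapreduce.TableOutputFormat.
    static class NewApiTableOutputFormat extends NewApiOutputFormat {}

    // Stand-in for org.apache.hadoop.hbase.mapred.TableOutputFormat.
    static class OldApiTableOutputFormat extends OldApiFileOutputFormat {}

    // Mirrors the shape of Job.setOutputFormatClass(Class<? extends OutputFormat>):
    // only classes below the new-API base type satisfy the bounded wildcard.
    static String setOutputFormatClass(Class<? extends NewApiOutputFormat> cls) {
        return cls.getSimpleName();
    }

    // Runtime equivalent of the compile-time bound check.
    static boolean isAccepted(Class<?> cls) {
        return NewApiOutputFormat.class.isAssignableFrom(cls);
    }

    public static void main(String[] args) {
        // Compiles: the class sits under the new-API base type.
        System.out.println(setOutputFormatClass(NewApiTableOutputFormat.class));

        // Would not compile, with the same kind of "not applicable for the
        // arguments" error as in the question:
        // setOutputFormatClass(OldApiTableOutputFormat.class);

        System.out.println(isAccepted(NewApiTableOutputFormat.class));
        System.out.println(isAccepted(OldApiTableOutputFormat.class));
    }
}
```

Both table output formats share the simple name TableOutputFormat, so the only visible difference in the job code is the import statement; checking the package in the import is the quickest way to spot this mistake.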

Finally, many thanks to everyone!