Loading data into HBase via HFiles does not work

Time: 2017-06-05 01:16:20

Tags: java hadoop mapreduce hbase hfile

I wrote a mapper to load data from disk into HBase via HFiles. The program runs successfully, but no data ever shows up in my HBase table. Any ideas?

Here is my Java program:

protected void writeToHBaseViaHFile() throws Exception {
        try {
            System.out.println("In try...");
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", "XXXX");
            Connection connection = ConnectionFactory.createConnection(conf);
            System.out.println("got connection");

            String inputPath = "/tmp/nuggets_from_Hive/part-00000";
            String outputPath = "/tmp/mytemp" + new Random().nextInt(1000);
            final TableName tableName = TableName.valueOf("steve1");
            System.out.println("got table steve1, outputPath = " + outputPath);

            // tag::SETUP[]
            Table table = connection.getTable(tableName);

            Job job = Job.getInstance(conf, "ConvertToHFiles");
            System.out.println("job is setup...");

            HFileOutputFormat2.configureIncrementalLoad(job, table,
                connection.getRegionLocator(tableName)); // <1>
            System.out.println("done configuring incremental load...");

            job.setInputFormatClass(TextInputFormat.class); // <2>

            job.setJarByClass(Importer.class); // <3>

            job.setMapperClass(LoadDataMapper.class); // <4>
            job.setMapOutputKeyClass(ImmutableBytesWritable.class); // <5>
            job.setMapOutputValueClass(KeyValue.class); // <6>

            FileInputFormat.setInputPaths(job, inputPath);
            HFileOutputFormat2.setOutputPath(job, new org.apache.hadoop.fs.Path(outputPath));
            System.out.println("Setup complete...");
            // end::SETUP[]

            if (!job.waitForCompletion(true)) {
                System.out.println("Failure");
            } else {
                System.out.println("Success");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Here is my mapper class:

public class LoadDataMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Cell> {

    public static final byte[] FAMILY = Bytes.toBytes("pd");
    public static final byte[] COL = Bytes.toBytes("bf");
    public static final ImmutableBytesWritable rowKey = new ImmutableBytesWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] line = value.toString().split("\t"); // <1>
        byte[] rowKeyBytes = Bytes.toBytes(line[0]);
        rowKey.set(rowKeyBytes);
        KeyValue kv = new KeyValue(rowKeyBytes, FAMILY, COL, Bytes.toBytes(line[1])); // <6>
        context.write (rowKey, kv); // <7>
        System.out.println("line[0] = " + line[0] + "\tline[1] = " + line[1]);
    }

}

I have already created the table steve1 on the cluster, but after the program runs successfully I still get 0 rows:

hbase(main):007:0> count 'steve1'
0 row(s) in 0.0100 seconds

=> 0

What I have tried:

I added print statements to the mapper class to check whether it is actually reading the data, but the output never appears in my console. I'm at a loss as to how to debug this.

Any ideas would be greatly appreciated!

1 Answer:

Answer 0 (score: 1):

This only creates the HFiles; you still need to load them into your table afterwards. For example, you need to do something like the following.
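
A minimal sketch of that bulk-load step, assuming an HBase 1.x client (the same API level the code above uses). The helper name bulkLoadGeneratedHFiles is made up for illustration; it takes the same conf, connection, table, tableName and outputPath that writeToHBaseViaHFile() already has in scope, and it should only be called after job.waitForCompletion(true) has returned true:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

// Call this only after the MapReduce job has finished successfully,
// i.e. after the HFiles have actually been written to outputPath.
static void bulkLoadGeneratedHFiles(Configuration conf, Connection connection,
                                    Table table, TableName tableName,
                                    String outputPath) throws Exception {
    LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
    try (Admin admin = connection.getAdmin()) {
        // Moves the HFiles produced by HFileOutputFormat2 into the regions of the table.
        loader.doBulkLoad(new Path(outputPath), admin, table,
                connection.getRegionLocator(tableName));
    }
}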

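Depending on your HBase version, the same step can also be run from the command line with the bundled bulk-load tool, e.g. `hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <outputPath> steve1` (substitute the actual output directory the job created). Once the HFiles have been loaded, `count 'steve1'` should report the rows.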