Copying a table in HBase from Java

Time: 2014-05-14 10:07:23

Tags: java hbase nosql

I want to copy data from one HBase table to another using the Java API, but I could not find a way to do it. Is there a Java API that can do this?

Thanks.

1 answer:

Answer 0: (score: 0)

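The following is by no means the most optimized approach, but judging from the tone of the question, performance does not seem to be the key concern here.

First, you need to set up the HBaseConfiguration and the input and output tables: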

Configuration config = HBaseConfiguration.create();

HTable inputTable = new HTable(config, "input_table");
HTable outputTable = new HTable(config, "output_table");

What you want is a Scan, which lets you perform range scans. You define the query parameters by adding columns to the Scan object.

Scan scan = new Scan(Bytes.toBytes("smith-"));
scan.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("givenName"));
scan.addColumn(Bytes.toBytes("contactinfo"), Bytes.toBytes("email"));
scan.setFilter(new PageFilter(25));
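As far as I can tell, new Scan(startRow) only fixes where the scan begins, so the scan above would continue past the "smith-" keys if the PageFilter did not cap it. To bound a prefix scan explicitly, you need an exclusive stop row. The following is a minimal, stdlib-only sketch of how such a stop row could be computed; the class and method names (PrefixRange, stopRowForPrefix) are made up for illustration and are not part of HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PrefixRange {
    /**
     * Returns the smallest row key that sorts after every key starting with
     * the given prefix (byte-wise, unsigned order). Returns an empty array
     * when no such key exists (empty prefix or all 0xFF bytes), which is
     * meant to be read as "scan to the end of the table".
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte by one
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "smith-".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(start);
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints "smith."
    }
}
```

With the client API of that era, the two arrays could then be passed to the two-argument constructor, new Scan(startRow, stopRow), in place of the single-argument one used above.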

Now you are ready to run the scan and retrieve the results:

ResultScanner scanner = inputTable.getScanner(scan);
try {
    for (Result result : scanner) {
        putToOutputTable(result);
    }
} finally {
    scanner.close();
}

To save to the second table, you either execute a Put inside the for loop, or collect the results into a List, array, or similar and write them as a batch.

// Needs: java.io.IOException, java.util.Map, java.util.NavigableMap
protected void putToOutputTable(Result result) throws IOException {

    // Retrieve the map of families to their most recent qualifiers and values.
    NavigableMap<byte[], NavigableMap<byte[], byte[]>> map = result.getNoVersionMap();

    // One Put per row, reusing the row key of the scanned Result.
    Put p = new Put(result.getRow());

    // Walk every family, then every qualifier/value pair within it.
    for (Map.Entry<byte[], NavigableMap<byte[], byte[]>> family : map.entrySet()) {
        for (Map.Entry<byte[], byte[]> column : family.getValue().entrySet()) {
            // Family, qualifier, and value are already byte arrays, as HBase
            // is all about byte arrays. The column family must already exist
            // in the output table's schema; the qualifier can be anything.
            p.add(family.getKey(), column.getKey(), column.getValue());
        }
    }

    // outputTable is the HTable created during setup.
    outputTable.put(p);
}
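The batched alternative mentioned above (collecting rows and writing them in groups) can be sketched roughly as follows. This is a stdlib-only illustration under stated assumptions: Batcher is a hypothetical helper, not an HBase class, and the sink stands in for whatever performs the bulk write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class Batcher<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    Batcher(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and hands a full batch to the sink once the limit is reached. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Hands any buffered items to the sink; does nothing if the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // pass a copy so the sink may keep it
        buffer.clear();
    }

    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        Batcher<String> batcher = new Batcher<>(2, batch -> sizes.add(batch.size()));
        for (String row : new String[] {"r1", "r2", "r3", "r4", "r5"}) {
            batcher.add(row);
        }
        batcher.flush(); // write the final partial batch
        System.out.println(sizes); // prints [2, 2, 1]
    }
}
```

In the copy loop, T would be Put and the sink would call outputTable.put(List<Put>). Since that call throws IOException, the sink would have to wrap it, for example in an UncheckedIOException. Remember the final flush() after the scanner loop ends, otherwise the last partial batch is never written.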

If you want a more scalable version, look at how to use map/reduce to read from an input HDFS file and write to an output HBase table: Hbase Map/Reduce