I want to copy data from one HBase table to another using the Java API, but I could not find a way to do it. Is there any Java API that does this?
Thanks.
Answer 0 (score: 0)
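The following is far from the most optimized approach, but judging from the tone of the question, performance does not seem to be the key concern here.
First, you need to set up the HBaseConfiguration and the input/output tables: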
Configuration config = HBaseConfiguration.create();
HTable inputTable = new HTable(config, "input_table");
HTable outputTable = new HTable(config, "output_table");
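What you want is a Scan, which lets you perform a range scan. You define the query parameters by adding columns to the Scan object. The example here restricts the scan to two columns and, via PageFilter, to a page of about 25 rows starting at the row key "smith-". To copy a whole table, you would use a plain new Scan() without the start row, column restrictions, or filter.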
Scan scan = new Scan(Bytes.toBytes("smith-"));
scan.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("givenName"));
scan.addColumn(Bytes.toBytes("contactinfo"), Bytes.toBytes("email"));
scan.setFilter(new PageFilter(25));
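To make the range semantics concrete, here is a minimal sketch that does not touch HBase at all. It assumes only that row keys are ordered lexicographically as unsigned bytes and that the start row is inclusive. The class RangeScanSketch, the helper scanFrom, and the sample keys are hypothetical, and the limit parameter is only a rough stand-in for the PageFilter.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a table: a sorted map keyed by raw row-key bytes.
class RangeScanSketch {
    // Lexicographic byte-array order, treating every byte as unsigned.
    static final Comparator<byte[]> UNSIGNED_LEX = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        // On a shared prefix, the shorter key sorts first.
        return Integer.compare(a.length, b.length);
    };

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Returns up to 'limit' row keys that sort at or after startRow.
    static List<String> scanFrom(TreeMap<byte[], String> table, String startRow, int limit) {
        List<String> rows = new ArrayList<>();
        for (byte[] key : table.tailMap(utf8(startRow), true).keySet()) {
            if (rows.size() == limit) {
                break;
            }
            rows.add(new String(key, StandardCharsets.UTF_8));
        }
        return rows;
    }
}
```

With the keys adams, jones-1, smith-, smith-alice, smith-bob, smithson and taylor-1, scanFrom(table, "smith-", 25) returns every key from smith- onward, including smithson and taylor-1. A start row on its own is not a prefix match, so a scan meant to stop after the "smith-" rows would also need a stop row or a prefix restriction.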
Now you are ready to run the scan and retrieve the results:
ResultScanner scanner = inputTable.getScanner(scan);
try {
    for (Result result : scanner) {
        putToOutputTable(result);
    }
} finally {
    scanner.close(); // release the scanner's server-side resources
}
Now, to save to the second table, you either perform a Put inside the for loop, or aggregate the results into a List, array, or similar for a batched put.
// Needs: java.io.IOException, java.util.Map, java.util.NavigableMap,
// org.apache.hadoop.hbase.client.Put, org.apache.hadoop.hbase.client.Result
protected void putToOutputTable(Result result) throws IOException {
    // Retrieve the map of families to their most recent qualifiers and values.
    NavigableMap<byte[], NavigableMap<byte[], byte[]>> map = result.getNoVersionMap();
    // Reuse the source row key so the row is written under the same key.
    Put p = new Put(result.getRow());
    for (Map.Entry<byte[], NavigableMap<byte[], byte[]>> family : map.entrySet()) {
        for (Map.Entry<byte[], byte[]> column : family.getValue().entrySet()) {
            // Each cell is (column family, column qualifier, value), all byte
            // arrays. The column family must already exist in the output
            // table's schema. The qualifier can be anything.
            // On HBase 1.0 and later, use p.addColumn(...) instead of p.add(...).
            p.add(family.getKey(), column.getKey(), column.getValue());
        }
    }
    // One Put per source row, written to the output table opened earlier.
    outputTable.put(p);
}
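The batched alternative mentioned above could be sketched roughly as follows. BatchBuffer is a hypothetical helper, not part of HBase: it collects items and hands them to a sink in fixed-size groups. In real code the items would be Put objects and the sink would wrap something like outputTable.put(batch), the overload that accepts a List of Puts, with the IOException handled inside the sink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: buffers items and flushes them to a sink in batches.
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Queues one item and flushes automatically once a full batch is pending.
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is pending to the sink; call once more after the loop
    // so a final partial batch is not lost.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Hand over a copy so the sink may keep the list after we clear ours.
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

Inside the scan loop you would call add(...) once per converted row, then flush() after the loop ends. The batch size trades memory on the client against the number of round trips, so the right value depends on row size and is best found by measurement.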
If you want a more scalable version, look at how to use map/reduce to read from an input HDFS file and write to an output HBase table: Hbase Map/Reduce