Question

I have inserted 1M rows into an HBase table. I am now writing a Java program to test HBase read performance by row key.

//create a list which contains 10,000 row keys 
for(int i=0; i<10000; i++)
{
   list.add(rowkey);
}

//go through the list and check the rowkey exists in HBase or not
for(int i=0; i<list.size(); i++)
{
    Get g = new Get(list.get(i));
    g.setFilter(new KeyOnlyFilter());
    Result r = table.get(g);
    // ...

}

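The row key format looks like "12345_54321". When I ran my program, checking all 10,000 row keys for existence took about 50 seconds, i.e. roughly 200 lookups per second.

This read performance is very slow, even though I added a filter to the Get object. Is there any other way to improve it? Or is something wrong with my program?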

Answer 1

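The low performance is mainly because you do the comparison and trigger a separate get on every iteration, which will obviously take some time. HBase is not designed to give you real-time performance.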
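To put the numbers from the question in perspective, here is a small back-of-the-envelope sketch. It only restates the figures quoted there (10,000 sequential gets in about 50 seconds); the class and method names are made up for illustration.

```java
// Hypothetical helper: plain arithmetic on the figures quoted in the question.
class GetLatencyEstimate {

    // Average time per request when requests are issued one after another.
    static double millisPerRequest(double totalSeconds, int requests) {
        return totalSeconds * 1000.0 / requests;
    }

    // Throughput for the same run.
    static double requestsPerSecond(double totalSeconds, int requests) {
        return requests / totalSeconds;
    }

    public static void main(String[] args) {
        System.out.println(millisPerRequest(50.0, 10_000));   // prints 5.0
        System.out.println(requestsPerSecond(50.0, 10_000));  // prints 200.0
    }
}
```

So each get in that loop costs about 5 ms on average, and because the gets run strictly one after another, those per-call costs simply add up.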

Answer 2

You can use the exists() API to do this. Here is an example; I hope it helps.

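        // Build one Get per row key so they can all be sent in a single call
        List<Get> gets = new ArrayList<Get>();
        for (String rowKey : rowKeys) {
            Get get = new Get(Bytes.toBytes(rowKey));
            gets.add(get);
        }

        Set<String> newRows = new HashSet<String>();
        Boolean[] results;
        // getHTableInterface(...) is not shown in this answer
        HTableInterface table = getHTableInterface(tableName);
        // results[i] tells whether the row for gets.get(i) exists
        results = table.exists(gets);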
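If the list of row keys is very large, one option is to send it in chunks rather than as a single huge request. The following is only a sketch under that assumption: `BatchedExistenceCheck`, `partition`, `findExisting` and the `batchExists` callback are hypothetical names, not part of the HBase client. The callback stands in for a call like `table.exists(gets)` above and is expected to return one flag per key, in input order.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of the HBase client API.
class BatchedExistenceCheck {

    // Splits keys into consecutive chunks of at most batchSize elements.
    static List<List<String>> partition(List<String> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(start, end)));
        }
        return batches;
    }

    // Calls batchExists once per chunk and collects the keys reported as present.
    // batchExists is a stand-in for something like table.exists(gets).
    static Set<String> findExisting(List<String> keys, int batchSize,
                                    Function<List<String>, Boolean[]> batchExists) {
        Set<String> existing = new LinkedHashSet<>();
        for (List<String> batch : partition(keys, batchSize)) {
            Boolean[] flags = batchExists.apply(batch);
            for (int i = 0; i < batch.size(); i++) {
                if (Boolean.TRUE.equals(flags[i])) {
                    existing.add(batch.get(i));
                }
            }
        }
        return existing;
    }
}
```

With 10,000 keys and a batch size of 500, for example, this makes 20 calls instead of 10,000. A suitable batch size depends on the cluster and is something to measure, not a fixed recommendation.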

HBase read performance based on row key
