Question

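I want to scan records in HBase from rowkey x to rowkey y, and I want to specify a filter on that scan. I know that when we do this we get a ResultScanner object. Is there a way to get the size of the result (computed on the server side)?

Basically, I want to do something like count() in Mongo or SQL, without having to iterate over the ResultScanner.

Thanks for your help.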

Answer 1

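If you give the scan a reasonable caching value, the simple approach is to iterate on the client while requesting only the smallest column available, so that each row transfers as little data as possible.

For large client scans, or if you want all the work done on the region servers, you can use the AggregationClient coprocessor (0.92+; it must be enabled first). If it is a really big scan, a MapReduce job is your best friend.

A working AggregationClient example, taken from http://michaelmorello.blogspot.com.es/2012/01/row-count-hbase-aggregation-example.html: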

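import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.util.Bytes;

public class MyAggregationClient {

    private static final byte[] TABLE_NAME = Bytes.toBytes("mytable");
    private static final byte[] CF = Bytes.toBytes("d");

    public static void main(String[] args) throws Throwable {

        Configuration customConf = new Configuration();
        customConf.setStrings("hbase.zookeeper.quorum",
                "node0,node1,node2");
        // Increase RPC timeout, in case of a slow computation
        customConf.setLong("hbase.rpc.timeout", 600000);
        // Default is 1 (in 0.92-era releases), set to a higher value for faster scanner.next(..)
        customConf.setLong("hbase.client.scanner.caching", 1000);
        Configuration configuration = HBaseConfiguration.create(customConf);
        AggregationClient aggregationClient = new AggregationClient(
                configuration);
        // Restrict this Scan (start/stop row, filter) to count only part of the table
        Scan scan = new Scan();
        scan.addFamily(CF);
        // The count is computed on the region servers; no ColumnInterpreter is passed here
        long rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan);
        System.out.println("row count is " + rowCount);

    }
}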

If you need real-time answers, you have to implement and maintain counters.
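As a rough illustration of the counter idea, here is a minimal in-memory sketch in plain Java. It is not HBase code: the class `BucketedRowCounter`, its method names, and the fixed-length row-key prefix used as a bucket are all assumptions made for this example. In a real deployment the increments would presumably go to HBase counter cells (for instance through the client's atomic increment calls) on the same write path that inserts the rows.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical in-memory model of per-bucket row counters.
 * Each bucket is the first prefixLength characters of the row key.
 */
class BucketedRowCounter {

    private final int prefixLength;
    // Sorted map, so a key range maps to a contiguous run of buckets.
    // Note: Java string order matches HBase byte order only for ASCII keys.
    private final ConcurrentSkipListMap<String, AtomicLong> counters =
            new ConcurrentSkipListMap<>();

    BucketedRowCounter(int prefixLength) {
        if (prefixLength <= 0) {
            throw new IllegalArgumentException("prefixLength must be positive");
        }
        this.prefixLength = prefixLength;
    }

    private String bucketOf(String rowKey) {
        return rowKey.length() <= prefixLength
                ? rowKey
                : rowKey.substring(0, prefixLength);
    }

    /** Call once per newly created row (not for overwrites of an existing row). */
    void rowAdded(String rowKey) {
        counters.computeIfAbsent(bucketOf(rowKey), k -> new AtomicLong())
                .incrementAndGet();
    }

    /** Call once per deleted row. */
    void rowDeleted(String rowKey) {
        AtomicLong counter = counters.get(bucketOf(rowKey));
        if (counter != null) {
            counter.decrementAndGet();
        }
    }

    /** Sums the counters of all buckets in [startBucket, stopBucket). */
    long count(String startBucket, String stopBucket) {
        long total = 0;
        for (AtomicLong counter
                : counters.subMap(startBucket, true, stopBucket, false).values()) {
            total += counter.get();
        }
        return total;
    }
}
```

The trade-off is that such counters answer only the questions they were designed for: counts at bucket granularity, with any filter condition decided in advance. An arbitrary rowkey range with an ad hoc filter still needs a scan, the coprocessor, or MapReduce.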

How to determine the result size in HBase?
