I'm using Java and I want to do a batch get like this:
final List<Get> gets = uids.stream()
    .map(uid -> new Get(toBytes(uid)))
    .collect(Collectors.toList());
// The same variable name must be used for the declaration and the set() calls.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", quorum);
conf.set("hbase.zookeeper.property.clientPort", properties.getString("HBASE_CONFIGURATION_ZOOKEEPER_CLIENTPORT"));
conf.set("zookeeper.znode.parent", properties.getString("HBASE_CONFIGURATION_ZOOKEEPER_ZNODE_PARENT"));
HTable table = new HTable(conf, tableName);
return table.get(gets);
When the list contains 10K gets, everything works fine.
When I try to do 100K gets in a single batch, I get an exception:
java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 100000 actions: SocketTimeoutException: 100000 times,
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 100000 actions: SocketTimeoutException: 100000 times,
at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:203) ~[hbase-query-layer-r575958b.jar:?]
at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:187) ~[hbase-query-layer-r575958b.jar:?]
at org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:922) ~[hbase-query-layer-r575958b.jar:?]
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:2402) ~[hbase-query-layer-r575958b.jar:?]
at org.apache.hadoop.hbase.client.HTable.batchCallback(HTable.java:868) ~[hbase-query-layer-r575958b.jar:?]
at org.apache.hadoop.hbase.client.HTable.batchCallback(HTable.java:883) ~[hbase-query-layer-r575958b.jar:?]
at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:858) ~[hbase-query-layer-r575958b.jar:?]
at org.apache.hadoop.hbase.client.HTable.get(HTable.java:825) ~[hbase-query-layer-r575958b.jar:?]
at hbase_query_layer.hbase.HbaseConnector.get(HbaseConnector.java:89) ~[hbase-query-layer-r575958b.jar:?]
... 15 more
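What is the problem?
Also, I see (in the web UI) that the number of requests to the regionserver (the one hosting the table) keeps growing: with a batch size of 100K, after a few minutes the request count is at 700K and still rising, even though my client is the only one using this table.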
In addition, on the HBase regionserver I see the following in hbase-hbase-regionserver.out:
Exception in thread "RpcServer.handler=25,port=60020" java.lang.StackOverflowError
at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:203)
at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:203)
How can I fix this?
Answer 0 (score: 1)
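I found the problem: https://issues.apache.org/jira/browse/HBASE-11813
Unfortunately I'm on HBase version 0.98.0.2.1.1.0-385-hadoop2, so I need to split the requests into smaller batches myself, something like: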
final List<List<Increment>> batchesToExecute = chopped(increments, conf.getBatchIncrementSize());

// Splits the list into consecutive sublists of at most L elements each.
static <T> List<List<T>> chopped(List<T> list, final int L) {
    List<List<T>> parts = new ArrayList<>();
    final int N = list.size();
    for (int i = 0; i < N; i += L) {
        parts.add(new ArrayList<>(list.subList(i, Math.min(N, i + L))));
    }
    return parts;
}
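To show how the chunks might then be consumed, here is a minimal, self-contained sketch that runs one call per chunk and concatenates the results in input order. The class name `BatchedLookup`, the method `getInBatches`, and the `fetchBatch` callback are hypothetical names introduced for illustration; `fetchBatch` stands in for whatever call actually talks to HBase (for example `table.get(batch)` from the question), so the sketch runs without an HBase dependency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class BatchedLookup {

    // Splits `list` into consecutive sublists of at most `size` elements.
    static <T> List<List<T>> chopped(List<T> list, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < list.size(); i += size) {
            parts.add(new ArrayList<>(list.subList(i, Math.min(list.size(), i + size))));
        }
        return parts;
    }

    // Calls `fetchBatch` once per chunk and appends each chunk's results,
    // so the output order follows the order of `keys`.
    // `fetchBatch` is a hypothetical stand-in for the real HBase call.
    static <K, V> List<V> getInBatches(List<K> keys, int batchSize,
                                       Function<List<K>, List<V>> fetchBatch) {
        List<V> results = new ArrayList<>(keys.size());
        for (List<K> batch : chopped(keys, batchSize)) {
            results.addAll(fetchBatch.apply(batch));
        }
        return results;
    }
}
```

With the code from the question, the callback would presumably wrap `table.get(batch)` and convert the returned `Result[]` to a list; since that call throws `IOException`, the lambda would need to catch or rethrow it. A batch size around the 10K that already worked for the asker seems like a reasonable starting point, though the right value depends on the cluster.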