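I have a managed Hive table in an HDInsight cluster. The table has more than 1.5 billion records and more than 400 partitions. I can query the table if I restrict the query to fewer than 300 partitions, but I cannot query the whole table at once.
testtable has more than 400 partitions, and partitionid ranges from 1 to 12. The table has the following properties: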
"transactional"="TRUE",
"hive.input.dir.recursive" = "TRUE",
"hive.mapred.supports.subdirectories" = "TRUE",
"hive.supports.subdirectories" = "TRUE",
"mapred.input.dir.recursive" = "TRUE",
"serialization.null.format" = ""
The following query works:
SELECT COUNT(*) FROM testtable WHERE partitionid = '1';
The following query also works:
SELECT COUNT(*) FROM testtable WHERE partitionid IN ('1', '2', '3', '4', '5', '6');
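Since restricting partitionid works, the same pattern could be generated in batches and the per-batch counts added up by the caller. This is only a minimal sketch: the helper name batched_count_queries and the batch size of 6 are hypothetical, it only builds the query strings rather than running them, and whether a given batch stays under the partition count that works depends on how the 400+ partitions are spread across partitionid values.

```python
from typing import Iterator, List


def batched_count_queries(table: str, partition_ids: List[str], batch_size: int) -> Iterator[str]:
    """Yield one COUNT(*) query per batch of partitionid values (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(partition_ids), batch_size):
        batch = partition_ids[start:start + batch_size]
        # Quote each value as a string literal, matching the queries in the post.
        in_list = ", ".join(f"'{pid}'" for pid in batch)
        yield f"SELECT COUNT(*) FROM {table} WHERE partitionid IN ({in_list});"


# partitionid ranges from 1 to 12 in the table described above.
partition_ids = [str(i) for i in range(1, 13)]
queries = list(batched_count_queries("testtable", partition_ids, batch_size=6))
for query in queries:
    print(query)
```

Each generated statement has the same shape as the query that already succeeds, so this sidesteps the failure without explaining it.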
The following query fails:
SELECT COUNT(*) FROM testtable;
Error message:
ERROR : FAILED: Error in acquiring locks: Error communicating with the metastore
org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:178)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:447)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocksWithHeartbeatDelay(DbTxnManager.java:463)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:278)
at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.acquireLocks(HiveTxnManagerImpl.java:76)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:95)
at org.apache.hadoop.hive.ql.Driver.acquireLocks(Driver.java:1651)
at org.apache.hadoop.hive.ql.Driver.lockAndRespond(Driver.java:1838)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2008)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1752)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.TApplicationException: Internal error processing lock
at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:5543)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:5530)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:2779)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
at com.sun.proxy.$Proxy38.lock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2990)
at com.sun.proxy.$Proxy38.lock(Unknown Source)
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:103)
... 25 more
I tried updating the following settings on the Hive cluster, but they do not seem to take effect:
- hive.metastore.batch.retrieve.max
- hive.metastore.batch.retrieve.table.partition.max
- hive.metastore.limit.partition.request
Any ideas or pointers?