我正在尝试将大数据加载到HIVE中的动态分区表中。
我继续收到此错误。如果我没有分区加载数据,它工作正常。如果我使用较小的数据集(带分区),它也可以正常工作。但对于大型数据集,我开始收到此错误
错误:
2014-11-10 09:28:01,112 ERROR org.apache.hadoop.hdfs.DFSClient: Failed to close file
/tmp/hive-username/hive_2014-11-10_09-25-26_785_2042278847834453465/_task_tmp.-ext-10002/
pseudo_element_id=NN%09/_tmp.000002_2
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
No lease on /tmp/hive-username/hive_2014-11-10_09-25-26_785_2042278847834453465/_task_tmp.-ext-10002
/pseudo_element_id=NN%09/_tmp.000002_2: File does not exist.
Holder DFSClient_NONMAPREDUCE_-737676454_1 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2445)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2437)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2503)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2480)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:535)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:337)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44958)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)
at org.apache.hadoop.ipc.Client.call(Client.java:1225)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy10.complete(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:330)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.
java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy11.complete(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1795)
at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1782)
at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:709)
at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:726)
at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:561)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2398)
at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2414)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
答案 0 :(得分:1)
设置选项hive.exec.parallel = false并重新运行查询。
答案 1 :(得分:0)
当多个映射器尝试访问同一文件时,通常会发生这种情况。 检查代码是否存在导致同时访问同一文件的任何错误。当映射器访问已删除的文件时,也会发生这种情况。
答案 2 :(得分:0)
只是增加号码。每个节点默认允许分区,你的工作应该没问题。
SET hive.exec.max.dynamic.partitions=100000;
SET hive.exec.max.dynamic.partitions.pernode=100000;