当我们在hive表上运行聚合查询时,它会失败并出现以下异常。但是select * from table
工作正常。
我们正在使用Apache Hadoop 2.7.2
,Hive 1.2.1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.lang.RuntimeException: Error caching map.xml: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hive/dssbp/eb828a68-6637-41ea-a05e-d33ae658eb19/hive_2016-08-30_16-41-13_054_4952687673775889960-1/-mr-10004/a48247fe-12cd-4c9f-bf41-df67ffada26d/map.xml could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1547)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:724)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
答案 0 :(得分:0)
当您选择数据时,配置单元不需要存储中间结果。它可以直接生成结果。
如果您要进行任何类型的聚合,则需要将其工作存储在某处。
错误的相关部分是:
只能复制到0个节点而不是minReplication(= 1)。 有2个datanode正在运行,并且没有节点被排除在此 操作
这意味着hive无法将其块写入任何地方。可能是因为群集已满(但可能是因为它没有权利)。