I've run into a weird problem, and I promise you I have googled a lot before asking.
I'm running a set of AWS Elastic MapReduce clusters, and I have a Hive table with roughly 16 partitions. The partitions were created with emr-s3distcp (because the original S3 bucket held roughly 216K files), using --groupBy with the target size set to 64 MiB (which is also the DFS block size in this case). The files are plain text with one JSON object per line, read through a JSON SerDe.
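For context, the table definition looks roughly like this. This is a minimal sketch: the column names are made up, the real schema is wider, and I'm showing the OpenX JSON SerDe class purely as an example of the kind of SerDe in play.

-- Sketch only: hypothetical columns, example SerDe class
-- (assumes the SerDe jar was registered with ADD JAR first)
CREATE EXTERNAL TABLE events (
  id STRING,
  payload STRING
)
PARTITIONED BY (part STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE
LOCATION 'hdfs:///data/events/';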
When I run this script, it takes ages and then gives up because of IPC connection failures.
Initially the strain of the s3distcp copy into HDFS was so great that I took some measures (read: resized to beefier machines, set the DFS block size to 64 MiB, and set dfs replication to 3x since it's a small cluster; the EMR default is less than 3, namely 2, but I changed it to 3). That worked, and the number of under-replicated blocks went to zero.
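For reference, those two changes amount to something like the following when written as per-session Hive settings. This is just a sketch: I actually applied them cluster-wide via hdfs-site.xml, and the property names are the Hadoop 1.x ones used on EMR at the time.

-- Sketch only: per-session equivalents of the cluster-wide settings above
SET dfs.replication=3;        -- write new files with 3 replicas
SET dfs.block.size=67108864;  -- 64 MiB blocks (Hadoop 1.x property name)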
Looking at /mnt/var/log/apps/hive_081.log yields a bunch of lines like these:
2013-05-12 09:56:12,120 DEBUG org.apache.hadoop.ipc.Client (Client.java:<init>(222)) - The ping interval is60000ms.
2013-05-12 09:56:12,120 DEBUG org.apache.hadoop.ipc.Client (Client.java:<init>(265)) - Use SIMPLE authentication for protocol ClientProtocol
2013-05-12 09:56:12,120 DEBUG org.apache.hadoop.ipc.Client (Client.java:setupIOstreams(551)) - Connecting to /10.17.17.243:9000
2013-05-12 09:56:12,121 DEBUG org.apache.hadoop.ipc.Client (Client.java:sendParam(769)) - IPC Client (47) connection to /10.17.17.243:9000 from hadoop sending #14
2013-05-12 09:56:12,121 DEBUG org.apache.hadoop.ipc.Client (Client.java:run(742)) - IPC Client (47) connection to /10.17.17.243:9000 from hadoop: starting, having connections 2
2013-05-12 09:56:12,125 DEBUG org.apache.hadoop.ipc.Client (Client.java:receiveResponse(804)) - IPC Client (47) connection to /10.17.17.243:9000 from hadoop got value #14
2013-05-12 09:56:12,126 DEBUG org.apache.hadoop.ipc.RPC (RPC.java:invoke(228)) - Call: getFileInfo 6
2013-05-12 09:56:21,523 INFO org.apache.hadoop.ipc.Client (Client.java:handleConnectionFailure(663)) - Retrying connect to server: domU-12-31-39-10-81-2A.compute-1.internal/10.198.130.216:9000. Already tried 6 time(s).
2013-05-12 09:56:22,122 DEBUG org.apache.hadoop.ipc.Client (Client.java:close(876)) - IPC Client (47) connection to /10.17.17.243:9000 from hadoop: closed
2013-05-12 09:56:22,122 DEBUG org.apache.hadoop.ipc.Client (Client.java:run(752)) - IPC Client (47) connection to /10.17.17.243:9000 from hadoop: stopped, remaining connections 1
2013-05-12 09:56:42,544 INFO org.apache.hadoop.ipc.Client (Client.java:handleConnectionFailure(663)) - Retrying connect to server: domU-12-31-39-10-81-2A.compute-1.internal/10.198.130.216:9000. Already tried 7 time(s).
And so on, until one of the clients finally hits its retry limit.
How do I fix this in Hive under Elastic MapReduce?
Thanks.
Answer 0 (score: 0):
After a while I noticed that the offending IP address wasn't even in my cluster, so it was a stale Hive metastore: the table metadata still pointed at a previous cluster's namenode. I fixed it by recreating the table over the same data:
CREATE TABLE whatever_2 LIKE whatever LOCATION <hdfs_location>;
ALTER TABLE whatever_2 RECOVER PARTITIONS;
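For anyone not on EMR: RECOVER PARTITIONS is an Amazon EMR extension to Hive, and as far as I know the stock Apache Hive equivalent is:

-- Stock Apache Hive equivalent of RECOVER PARTITIONS:
MSCK REPAIR TABLE whatever_2;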
Hope it helps.