I am using Hive (with YARN) installed via CDH 5.14.2-1, and I built a database that stores purchase history. The table holding the purchase records contains one billion tuples.
I ran the following query to measure Hive's performance.
SELECT c.gender,
       g.NAME,
       i.NAME,
       Sum(b.num)
FROM   customers c
       JOIN boughts_bil b
         ON ( c.id = b.cus_id
              AND b.id < $var )
       JOIN items i
         ON ( i.id = b.item_id )
       JOIN genres g
         ON ( g.id = i.gen_id )
GROUP  BY c.gender,
          g.NAME,
          i.NAME;
Incidentally, since I did not want to apply any optimizations, I did not partition the table.
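For reference, partitioning the fact table would have looked roughly like the sketch below. This is purely hypothetical (the real DDL is not shown above); the column types and the bucketing scheme are assumptions:

```sql
-- Hypothetical sketch of a partitioned version of boughts_bil.
-- Column names follow the query above; types and the id_bucket
-- partitioning scheme are assumptions, not the actual DDL.
CREATE TABLE boughts_bil_part (
  id      BIGINT,
  cus_id  BIGINT,
  item_id BIGINT,
  num     INT
)
PARTITIONED BY (id_bucket INT);  -- e.g. id / 10000000, so the b.id < $var
                                 -- predicate could prune partitions
```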
When I set $var = 30,000,000, I got the error "Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask". In fact, I ran the same query three months ago, and it worked fine back then.
I checked the HistoryServer, which showed the following:
Diagnostics:
Application failed due to failed ApplicationMaster.
Only partial information is available; some values may be inaccurate.
Back when the query ran fine, my Cloudera edition was Express, but the plan now appears to be Enterprise-only. Could that be the cause? Or is there some other reason for the out-of-memory error?
I would appreciate any insight.
Thank you.
Update:
Even after I changed $var to 50, the task did not complete. The day before yesterday, with $var = 1,000,000, the task completed successfully.
So I think the cause is not the data or the query, but the server.
The terminal output was as follows:
Query ID = ..._20180813111111_92d8a1f2-4614-49c6-8833-d7b2e709c79c
Total jobs = 2
Stage-1 is selected by condition resolver.
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 557
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1534123434864_0480, Tracking URL = http://...:8088/proxy/application_1534123434864_0480/
Kill Command = /.../hadoop job -kill job_1534123434864_0480
Hadoop job information for Stage-1: number of mappers: 140; number of reducers: 557
2018-08-13 11:11:49,795 Stage-1 map = 0%, reduce = 0%
2018-08-13 11:12:39,732 Stage-1 map = 3%, reduce = 0%, Cumulative CPU 159.56 sec
2018-08-13 11:12:40,808 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 428.58 sec
2018-08-13 11:12:41,884 Stage-1 map = 20%, reduce = 0%, Cumulative CPU 649.73 sec
2018-08-13 11:12:42,965 Stage-1 map = 36%, reduce = 0%, Cumulative CPU 945.71 sec
2018-08-13 11:12:44,040 Stage-1 map = 51%, reduce = 0%, Cumulative CPU 1089.56 sec
2018-08-13 11:12:45,112 Stage-1 map = 56%, reduce = 0%, Cumulative CPU 1154.54 sec
2018-08-13 11:12:46,197 Stage-1 map = 58%, reduce = 0%, Cumulative CPU 1163.98 sec
2018-08-13 11:12:48,336 Stage-1 map = 60%, reduce = 0%, Cumulative CPU 1195.79 sec
2018-08-13 11:12:50,465 Stage-1 map = 62%, reduce = 0%, Cumulative CPU 1221.94 sec
2018-08-13 11:12:51,529 Stage-1 map = 65%, reduce = 0%, Cumulative CPU 1243.78 sec
2018-08-13 11:12:52,628 Stage-1 map = 68%, reduce = 0%, Cumulative CPU 1250.12 sec
2018-08-13 11:12:54,755 Stage-1 map = 69%, reduce = 0%, Cumulative CPU 1258.72 sec
2018-08-13 11:12:55,818 Stage-1 map = 73%, reduce = 0%, Cumulative CPU 1310.93 sec
2018-08-13 11:12:56,878 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 1402.61 sec
2018-08-13 11:12:57,936 Stage-1 map = 83%, reduce = 0%, Cumulative CPU 1440.37 sec
2018-08-13 11:12:58,994 Stage-1 map = 88%, reduce = 0%, Cumulative CPU 1514.14 sec
2018-08-13 11:13:00,049 Stage-1 map = 90%, reduce = 0%, Cumulative CPU 1545.1 sec
2018-08-13 11:13:02,163 Stage-1 map = 91%, reduce = 0%, Cumulative CPU 1603.52 sec
2018-08-13 11:13:03,228 Stage-1 map = 94%, reduce = 0%, Cumulative CPU 1657.94 sec
2018-08-13 11:13:04,283 Stage-1 map = 99%, reduce = 0%, Cumulative CPU 1717.53 sec
2018-08-13 11:13:05,339 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1730.17 sec
2018-08-13 11:13:11,744 Stage-1 map = 100%, reduce = 1%, Cumulative CPU 1752.35 sec
2018-08-13 11:13:13,882 Stage-1 map = 100%, reduce = 2%, Cumulative CPU 1755.7 sec
2018-08-13 11:13:14,947 Stage-1 map = 100%, reduce = 3%, Cumulative CPU 1772.27 sec
2018-08-13 11:13:16,005 Stage-1 map = 100%, reduce = 5%, Cumulative CPU 1818.92 sec
2018-08-13 11:13:17,067 Stage-1 map = 100%, reduce = 7%, Cumulative CPU 1846.94 sec
2018-08-13 11:13:19,191 Stage-1 map = 100%, reduce = 9%, Cumulative CPU 1885.14 sec
2018-08-13 11:13:20,251 Stage-1 map = 100%, reduce = 10%, Cumulative CPU 1909.41 sec
2018-08-13 11:13:21,312 Stage-1 map = 100%, reduce = 11%, Cumulative CPU 1922.64 sec
2018-08-13 11:13:25,546 Stage-1 map = 100%, reduce = 13%, Cumulative CPU 1956.43 sec
2018-08-13 11:13:26,614 Stage-1 map = 100%, reduce = 15%, Cumulative CPU 1995.36 sec
2018-08-13 11:13:27,683 Stage-1 map = 100%, reduce = 17%, Cumulative CPU 2027.25 sec
2018-08-13 11:13:28,749 Stage-1 map = 100%, reduce = 19%, Cumulative CPU 2066.51 sec
2018-08-13 11:13:29,819 Stage-1 map = 100%, reduce = 20%, Cumulative CPU 2093.91 sec
2018-08-13 11:13:30,884 Stage-1 map = 100%, reduce = 21%, Cumulative CPU 2100.15 sec
2018-08-13 11:13:31,947 Stage-1 map = 100%, reduce = 23%, Cumulative CPU 2136.57 sec
2018-08-13 11:13:33,017 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 2168.52 sec
2018-08-13 11:13:34,076 Stage-1 map = 100%, reduce = 27%, Cumulative CPU 2210.15 sec
2018-08-13 11:13:38,326 Stage-1 map = 100%, reduce = 28%, Cumulative CPU 2226.99 sec
2018-08-13 11:13:39,389 Stage-1 map = 100%, reduce = 29%, Cumulative CPU 2246.71 sec
2018-08-13 11:13:40,447 Stage-1 map = 100%, reduce = 31%, Cumulative CPU 2281.74 sec
2018-08-13 11:13:41,511 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 2319.49 sec
2018-08-13 11:13:42,570 Stage-1 map = 100%, reduce = 35%, Cumulative CPU 2350.72 sec
2018-08-13 11:13:45,746 Stage-1 map = 100%, reduce = 36%, Cumulative CPU 2371.35 sec
2018-08-13 11:13:46,809 Stage-1 map = 100%, reduce = 37%, Cumulative CPU 2391.87 sec
2018-08-13 11:13:48,924 Stage-1 map = 100%, reduce = 39%, Cumulative CPU 2428.84 sec
2018-08-13 11:13:49,982 Stage-1 map = 100%, reduce = 41%, Cumulative CPU 2461.64 sec
2018-08-13 11:13:51,030 Stage-1 map = 100%, reduce = 42%, Cumulative CPU 2492.05 sec
2018-08-13 11:13:52,075 Stage-1 map = 100%, reduce = 43%, Cumulative CPU 2512.36 sec
2018-08-13 11:13:53,138 Stage-1 map = 100%, reduce = 46%, Cumulative CPU 2551.82 sec
2018-08-13 11:13:54,200 Stage-1 map = 100%, reduce = 48%, Cumulative CPU 2598.15 sec
2018-08-13 11:13:55,262 Stage-1 map = 100%, reduce = 50%, Cumulative CPU 2626.53 sec
2018-08-13 11:13:56,322 Stage-1 map = 100%, reduce = 51%, Cumulative CPU 2644.72 sec
2018-08-13 11:13:57,362 Stage-1 map = 100%, reduce = 52%, Cumulative CPU 2654.88 sec
2018-08-13 11:14:10,109 Stage-1 map = 100%, reduce = 53%, Cumulative CPU 2670.23 sec
2018-08-13 11:14:11,167 Stage-1 map = 100%, reduce = 54%, Cumulative CPU 2679.96 sec
2018-08-13 11:14:14,342 Stage-1 map = 100%, reduce = 56%, Cumulative CPU 2709.52 sec
2018-08-13 11:14:28,034 Stage-1 map = 100%, reduce = 57%, Cumulative CPU 2728.34 sec
2018-08-13 11:14:35,427 Stage-1 map = 100%, reduce = 58%, Cumulative CPU 2747.36 sec
2018-08-13 11:14:39,652 Stage-1 map = 100%, reduce = 59%, Cumulative CPU 2772.93 sec
2018-08-13 11:14:41,763 Stage-1 map = 100%, reduce = 60%, Cumulative CPU 2788.89 sec
2018-08-13 11:14:48,042 Stage-1 map = 100%, reduce = 61%, Cumulative CPU 2813.88 sec
2018-08-13 11:14:49,097 Stage-1 map = 100%, reduce = 62%, Cumulative CPU 2826.24 sec
2018-08-13 11:14:53,335 Stage-1 map = 100%, reduce = 63%, Cumulative CPU 2847.18 sec
2018-08-13 11:14:56,501 Stage-1 map = 100%, reduce = 64%, Cumulative CPU 2868.39 sec
2018-08-13 11:14:58,614 Stage-1 map = 100%, reduce = 65%, Cumulative CPU 2889.34 sec
2018-08-13 11:14:59,673 Stage-1 map = 100%, reduce = 66%, Cumulative CPU 2889.52 sec
2018-08-13 11:15:01,785 Stage-1 map = 100%, reduce = 67%, Cumulative CPU 2903.53 sec
2018-08-13 11:15:02,844 Stage-1 map = 100%, reduce = 68%, Cumulative CPU 2909.72 sec
2018-08-13 11:15:03,903 Stage-1 map = 100%, reduce = 69%, Cumulative CPU 2915.11 sec
2018-08-13 11:15:04,962 Stage-1 map = 100%, reduce = 70%, Cumulative CPU 2939.29 sec
2018-08-13 11:15:06,022 Stage-1 map = 100%, reduce = 73%, Cumulative CPU 2991.99 sec
2018-08-13 11:15:07,088 Stage-1 map = 100%, reduce = 74%, Cumulative CPU 3008.11 sec
2018-08-13 11:15:08,147 Stage-1 map = 100%, reduce = 75%, Cumulative CPU 3023.01 sec
2018-08-13 11:15:09,196 Stage-1 map = 100%, reduce = 76%, Cumulative CPU 3029.96 sec
2018-08-13 11:15:12,359 Stage-1 map = 100%, reduce = 77%, Cumulative CPU 3053.28 sec
2018-08-13 11:15:14,471 Stage-1 map = 100%, reduce = 78%, Cumulative CPU 3074.76 sec
2018-08-13 11:15:16,585 Stage-1 map = 100%, reduce = 79%, Cumulative CPU 3087.69 sec
2018-08-13 11:15:18,709 Stage-1 map = 100%, reduce = 80%, Cumulative CPU 3104.28 sec
2018-08-13 11:15:20,824 Stage-1 map = 100%, reduce = 81%, Cumulative CPU 3126.94 sec
2018-08-13 11:15:21,931 Stage-1 map = 100%, reduce = 83%, Cumulative CPU 3166.12 sec
2018-08-13 11:15:22,979 Stage-1 map = 100%, reduce = 85%, Cumulative CPU 3209.21 sec
2018-08-13 11:15:24,039 Stage-1 map = 100%, reduce = 87%, Cumulative CPU 3245.82 sec
2018-08-13 11:15:25,096 Stage-1 map = 100%, reduce = 88%, Cumulative CPU 3259.57 sec
2018-08-13 11:15:27,211 Stage-1 map = 100%, reduce = 89%, Cumulative CPU 3275.9 sec
2018-08-13 11:15:29,326 Stage-1 map = 100%, reduce = 90%, Cumulative CPU 3291.91 sec
2018-08-13 11:15:30,386 Stage-1 map = 100%, reduce = 91%, Cumulative CPU 3318.16 sec
2018-08-13 11:15:31,441 Stage-1 map = 100%, reduce = 93%, Cumulative CPU 3357.23 sec
2018-08-13 11:15:32,496 Stage-1 map = 100%, reduce = 95%, Cumulative CPU 3382.19 sec
2018-08-13 11:15:33,548 Stage-1 map = 100%, reduce = 96%, Cumulative CPU 3407.15 sec
2018-08-13 11:15:34,598 Stage-1 map = 100%, reduce = 97%, Cumulative CPU 3419.89 sec
2018-08-13 11:15:37,755 Stage-1 map = 100%, reduce = 98%, Cumulative CPU 3442.94 sec
2018-08-13 11:15:39,871 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 3449.41 sec
2018-08-13 11:15:45,128 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3475.74 sec
MapReduce Total cumulative CPU time: 57 minutes 55 seconds 740 msec
Ended Job = job_1534123434864_0480
Execution log at: /.../..._20180813111111_92d8a1f2-4614-49c6-8833-d7b2e709c79c.log
2018-08-13 11:15:51 Starting to launch local task to process map join; maximum memory = 1908932608
2018-08-13 11:15:52 Dump the side-table for tag: 1 with group count: 24 into file: file:/.../c33533aa-7637-4034-a3d1-2e8b857c2820/hive_2018-08-13_11-11-38_070_2752807246292956243-1/-local-10006/HashTable-Stage-4/MapJoin-mapfile01--.hashtable
2018-08-13 11:15:52 Uploaded 1 File to: file:/.../c33533aa-7637-4034-a3d1-2e8b857c2820/hive_2018-08-13_11-11-38_070_2752807246292956243-1/-local-10006/HashTable-Stage-4/MapJoin-mapfile01--.hashtable (902 bytes)
2018-08-13 11:15:52 Dump the side-table for tag: 1 with group count: 3500 into file: file:/.../c33533aa-7637-4034-a3d1-2e8b857c2820/hive_2018-08-13_11-11-38_070_2752807246292956243-1/-local-10006/HashTable-Stage-4/MapJoin-mapfile11--.hashtable
2018-08-13 11:15:52 Uploaded 1 File to: file:/.../c33533aa-7637-4034-a3d1-2e8b857c2820/hive_2018-08-13_11-11-38_070_2752807246292956243-1/-local-10006/HashTable-Stage-4/MapJoin-mapfile11--.hashtable (107794 bytes)
2018-08-13 11:15:52 End of local task; Time Taken: 1.54 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 2 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1534123434864_0536, Tracking URL = http://...:8088/proxy/application_1534123434864_0536/
Kill Command = /.../hadoop job -kill job_1534123434864_0536
Hadoop job information for Stage-4: number of mappers: 4; number of reducers: 1
2018-08-13 11:16:23,048 Stage-4 map = 0%, reduce = 0%
2018-08-13 11:16:44,240 Stage-4 map = 25%, reduce = 0%, Cumulative CPU 2.28 sec
2018-08-13 11:16:46,330 Stage-4 map = 50%, reduce = 0%, Cumulative CPU 5.06 sec
2018-08-13 11:16:49,473 Stage-4 map = 75%, reduce = 0%, Cumulative CPU 9.58 sec
2018-08-13 11:16:50,520 Stage-4 map = 100%, reduce = 0%, Cumulative CPU 15.14 sec
2018-08-13 11:17:12,471 Stage-4 map = 0%, reduce = 0%
2018-08-13 11:17:42,680 Stage-4 map = 25%, reduce = 0%, Cumulative CPU 2.2 sec
2018-08-13 11:17:44,779 Stage-4 map = 50%, reduce = 0%, Cumulative CPU 5.25 sec
2018-08-13 11:17:46,873 Stage-4 map = 100%, reduce = 0%, Cumulative CPU 15.0 sec
2018-08-13 11:18:12,006 Stage-4 map = 0%, reduce = 0%
MapReduce Total cumulative CPU time: 15 seconds 0 msec
Ended Job = job_1534123434864_0536 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 140 Reduce: 557 Cumulative CPU: 3475.74 sec HDFS Read: 37355213704 HDFS Write: 56143 SUCCESS
Stage-Stage-4: Map: 4 Reduce: 1 Cumulative CPU: 15.0 sec HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 58 minutes 10 seconds 740 msec
WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.
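The hints that Hive prints at the start of each job can be applied as session-level settings. A sketch, using the property names from the output above (the numeric values are placeholder examples, not tuned recommendations):

```sql
-- Session-level reducer tuning, using the properties Hive itself suggests.
-- The numbers below are arbitrary examples, not recommended values.
SET hive.exec.reducers.bytes.per.reducer=268435456;  -- ~256 MB of input per reducer
SET hive.exec.reducers.max=300;                      -- cap the reducer count
-- Or force a fixed number of reducers instead:
-- SET mapreduce.job.reduces=200;
```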
I then checked yarn logs -applicationId application_1534123434864_0480
and found several errors in container_1534123434864_0480_02_000001:
(1)ERROR [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator:
Container complete event for unknown container container_1534123434864_0480_02_000143
(2)INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1534123434864_0480_r_000014_1000:
Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
(3)INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
Diagnostics report from attempt_1534123434864_0480_r_000041_1000:
Container exited with a non-zero exit code 154
(4)ERROR [ContainerLauncher #1]
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl:
Container launch failed for container_1534123434864_0480_02_000241 :
java.io.IOException: Failed on local exception: java.io.IOException: java.io.IOException:
Connection reset from partner; Host Details : local host is: "node3"; destination host is: "node2":8041;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy40.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy41.startContainers(Unknown Source)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:379)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.io.IOException: Connection reset from partner
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:718)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:681)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:769)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
... 15 more
Caused by: java.io.IOException: Connection reset from partner
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:370)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:594)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:396)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:761)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:757)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:756)
... 18 more
These errors appeared many times, so I suspected that something was wrong with the node2 server.
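For context on error (2): exit code 137 corresponds to SIGKILL (128 + 9), which usually means the container was killed from outside the JVM, e.g. by YARN for exceeding its memory allocation or by the OS OOM killer under memory pressure. If memory is the cause, one common mitigation is to raise the per-container allocation. A sketch (the values are assumptions and must stay within the cluster's yarn.scheduler.maximum-allocation-mb):

```sql
-- Hypothetical session settings giving map/reduce containers more memory.
-- Values are examples only; they must respect the cluster's YARN limits.
SET mapreduce.map.memory.mb=4096;
SET mapreduce.map.java.opts=-Xmx3276m;     -- JVM heap ~80% of container size
SET mapreduce.reduce.memory.mb=4096;
SET mapreduce.reduce.java.opts=-Xmx3276m;
```

Whether this applies here depends on the actual cause; the "Connection reset" errors in (4) point at a node-level problem on node2 rather than a per-query setting.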
Answer 0 (score: 0)
I uninstalled Cloudera Manager by following the Cloudera Manager uninstall documentation, then installed Cloudera Manager again. After that, Hive worked fine.