Spark running in yarn-client mode (state: ACCEPTED) fails at the end of Spark Submit (with Spark 1.6.1 on YARN)

Time: 2016-07-13 20:13:41

Tags: hadoop apache-spark yarn

I am trying to run the following command to submit a Spark job in yarn-client mode.

    $SPARK_HOME/bin/spark-submit \
        --class org.apache.spark.examples.SparkPi \
        --master yarn \
        --deploy-mode client \
        $SPARK_HOME/examples/target/scala-2.10/spark-examples*.jar \
        10

When I run the above command, my application stays stuck in the ACCEPTED state and keeps printing the following:

    16/07/13 17:14:28 INFO yarn.Client: Application report for application_1468428769910_0002 (state: ACCEPTED)
    16/07/13 17:14:28 INFO yarn.Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1468430067384
         final status: UNDEFINED
         tracking URL: http://hadoop-master:8088/proxy/application_1468428769910_0002/
         user: nachiket
    16/07/13 17:14:29 INFO yarn.Client: Application report for application_1468428769910_0002 (state: ACCEPTED)
    16/07/13 17:14:30 INFO yarn.Client: Application report for application_1468428769910_0002 (state: ACCEPTED)
    16/07/13 17:14:31 INFO yarn.Client: Application report for application_1468428769910_0002 (state: ACCEPTED)
    16/07/13 17:14:32 INFO yarn.Client: Application report for application_1468428769910_0002 (state: ACCEPTED)

I have already applied most of the suggestions from the following question:

Application report for application_ (state: ACCEPTED) never ends for Spark Submit (with Spark 1.2.0 on YARN)

I am still facing the same issue. Is there any solution other than those in the link above?

Eventually, the job fails with the following diagnostics:

    client token: N/A
         diagnostics: Application application_1468455134412_0001 failed 2 times due to Error launching appattempt_1468455134412_0001_000002. Got exception: org.apache.hadoop.net.ConnectTimeoutException: Call From sclab103/104.239.213.7 to 104.239.213.7:60640 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=104.239.213.7/104.239.213.7:60640]; For more details see:  http://wiki.apache.org/hadoop/SocketTimeout
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751)
        at org.apache.hadoop.ipc.Client.call(Client.java:1479)
        at org.apache.hadoop.ipc.Client.call(Client.java:1412)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy82.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy83.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:250)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=104.239.213.7/104.239.213.7:60640]
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
        at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
        at org.apache.hadoop.ipc.Client.call(Client.java:1451)
        ... 16 more
. Failing the application.
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1468455280498
         final status: FAILED
         tracking URL: http://hadoop-master:8088/cluster/app/application_1468455134412_0001
         user: sclab
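
The diagnostics above show the ResourceManager's AMLauncher timing out after 20000 ms while opening a connection to 104.239.213.7:60640 to start the ApplicationMaster container. One way to check that reachability by hand is sketched below. It assumes Python 3 is available on the host that reported the error (sclab103). The helper names `parse_remote` and `can_connect` are hypothetical and are not part of Hadoop or Spark. This only tests whether a TCP connection can be opened; it does not explain why it cannot.

```python
import re
import socket
from typing import Optional, Tuple

# Matches the "remote=<host>/<ip>:<port>" fragment that appears in the
# ConnectTimeoutException message quoted in the diagnostics.
_REMOTE_RE = re.compile(r"remote=[^/\]\s]*/(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def parse_remote(diagnostics: str) -> Optional[Tuple[str, int]]:
    """Extract (ip, port) of the endpoint that timed out, or None if absent."""
    match = _REMOTE_RE.search(diagnostics)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def can_connect(host: str, port: int, timeout_s: float = 20.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s.

    The 20-second default mirrors the 20000 ms timeout in the error message.
    Timeouts, refused connections and name-resolution failures all raise
    OSError subclasses, so they are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

For example, passing the `Caused by:` line from the trace to `parse_remote` yields the IP and port, which can then be handed to `can_connect`. If that returns False on the same host, a firewall rule or a mismatch between the hostname and the advertised IP would be the first things to look at, although the log alone does not confirm either.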

0 answers:

No answers yet