Spark Pi example in cluster mode with YARN: association lost

Time: 2015-04-08 10:48:01

Tags: hadoop apache-spark yarn

I have three virtual machines running as a distributed Spark cluster. I am using Spark 1.3.0 on top of Hadoop 2.6.0.

If I run the Spark Pi example

/usr/local/spark130/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-client \
  /usr/local/spark130/examples/target/spark-examples_2.10-1.3.0.jar 10000

I get the following warnings and errors, and eventually an exception:

 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/08 12:37:06 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@virtm4:47128] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/04/08 12:37:12 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@virtm4:45975] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/04/08 12:37:13 ERROR YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!

When I check the container logs, I see that the ApplicationMaster received a SIGTERM:

15/04/08 12:37:08 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/04/08 12:37:08 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
15/04/08 12:37:08 INFO yarn.ApplicationMaster: Started progress reporter thread - sleep time : 5000
15/04/08 12:37:12 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
15/04/08 12:37:12 INFO yarn.ApplicationMaster: Final app status: UNDEFINED, exitCode: 0, (reason: Shutdown hook called before final status was reported.)
15/04/08 12:37:12 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with UNDEFINED (diag message: Shutdown hook called before final status was reported.)

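Solution: I solved the problem. I now use Java 7 instead of Java 8. This behavior was reported as a bug (https://issues.apache.org/jira/browse/SPARK-6388), but the report was rejected. Nevertheless, changing the Java version did work.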

2 answers:

Answer 0: (score: 4)

The association may be lost because of the excessive memory allocation issue with Java 8: https://issues.apache.org/jira/browse/YARN-4714

You can force YARN to ignore this by setting the appropriate properties in yarn-site.xml.
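A sketch of what such a configuration might look like, assuming the properties in question are the NodeManager physical and virtual memory checks (this is an assumption; the answer text itself does not name them). Both are existing YARN settings, and the fragment belongs inside the <configuration> element of yarn-site.xml:

```xml
<!-- yarn-site.xml fragment. Assumed properties, not taken from the answer text. -->
<!-- Stops the NodeManager from killing containers that exceed physical memory limits. -->
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<!-- Stops the NodeManager from killing containers that exceed virtual memory limits. -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```

The NodeManagers typically need a restart to pick up the change. Disabling these checks removes a safeguard against runaway containers, so it may be better suited to a small test cluster like the one in the question than to a shared production cluster.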

Answer 1: (score: 0)

I ran into a similar problem before, until I found this issue.

Try stopping the SparkContext instance explicitly with sc.stop().