Unstable executors reconnecting again and again in a Spark standalone cluster?

Posted: 2017-10-28 06:30:34

Tags: apache-spark apache-spark-standalone

I am seeing the stack trace below: an executor is lost, and then a new executor connects in its place, over and over.

INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(etlspark); groups with view permissions: Set(); users  with modify permissions: Set(sss); groups with modify permissions: Set()
**java.lang.IllegalArgumentException: requirement failed: TransportClient has not yet been set.**
    at scala.Predef$.require(Predef.scala:224)
    at org.apache.spark.rpc.netty.RpcOutboxMessage.onTimeout(Outbox.scala:70)
    at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$ask$1.applyOrElse(NettyRpcEnv.scala:232)
    at java.lang.Thread.run(Thread.java:745)
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in 10 seconds. This timeout is controlled by spark.rpc.askTimeout
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
    at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:216)
    at scala.util.Try$.apply(Try.scala:192)

What is the cause of this stack trace? Is it because the master and slave machines run different Java versions, or is it related to the cluster configuration? Please guide me on where this error message comes from.
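The trace itself names the knob involved (`spark.rpc.askTimeout`). As a first diagnostic step, one possibility is to raise it at submit time and see whether the executors stop churning; a minimal sketch, where the master URL, the jar name, and the `600s` value are placeholders, not recommendations:

```shell
# Raise the RPC ask timeout so slow executor registration replies
# do not trip RpcTimeoutException. spark.rpc.askTimeout falls back
# to spark.network.timeout when it is not set explicitly.
spark-submit \
  --master spark://master-host:7077 \
  --conf spark.rpc.askTimeout=600s \
  --conf spark.network.timeout=600s \
  your-app.jar
```

If the timeout in the trace is 10 seconds while the Spark default is much higher, it is also worth checking whether some config file or submit script in the cluster already sets this property to a low value.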

0 Answers:

There are no answers yet.