我在MAC中运行Bash脚本。此脚本调用以Scala语言编写的spark方法很多次。我目前正在尝试使用for循环调用此spark方法100,000次。
在运行少量迭代(大约3000次迭代)后,代码退出时出现以下异常。
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:518)
at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:547)
at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:547)
at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:547)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1877)
at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:547)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
Exception in thread "dag-scheduler-event-loop" 16/11/22 13:37:32 WARN NioEventLoop: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
at io.netty.util.internal.MpscLinkedQueue.offer(MpscLinkedQueue.java:126)
at io.netty.util.internal.MpscLinkedQueue.add(MpscLinkedQueue.java:221)
at io.netty.util.concurrent.SingleThreadEventExecutor.fetchFromScheduledTaskQueue(SingleThreadEventExecutor.java:259)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:346)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
java.lang.OutOfMemoryError: Java heap space
at java.util.regex.Pattern.compile(Pattern.java:1047)
at java.lang.String.replace(String.java:2180)
at org.apache.spark.util.Utils$.getFormattedClassName(Utils.scala:1728)
at org.apache.spark.storage.RDDInfo$$anonfun$1.apply(RDDInfo.scala:57)
at org.apache.spark.storage.RDDInfo$$anonfun$1.apply(RDDInfo.scala:57)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.storage.RDDInfo$.fromRdd(RDDInfo.scala:57)
at org.apache.spark.scheduler.StageInfo$$anonfun$1.apply(StageInfo.scala:87)
有人可以帮忙吗,这个错误是由于大量调用spark方法引起的吗?
答案 0 :(得分:14)
其RpcTimeoutException
..所以spark.network.timeout
(spark.rpc.askTimeout
)可以使用大于默认值来调整,以便处理复杂的工作负载。您可以从这些值开始,并根据您的工作负载进行相应调整。
请参阅latest
spark.network.timeout
120s所有网络的默认超时 互动。将使用此配置代替 spark.core.connection.ack.wait.timeout, spark.storage.blockManagerSlaveTimeoutMs, spark.shuffle.io.connectionTimeout,spark.rpc.askTimeout或 spark.rpc.lookupTimeout如果没有配置它们。
还要考虑增加执行程序内存,即spark.executor.memory
,最重要的是检查您的代码,检查是否可以进一步优化。
set by SparkConf: conf.set("spark.network.timeout", "600s")
set by spark-defaults.conf: spark.network.timeout 600s
set when calling spark-submit: --conf spark.network.timeout=600s
答案 1 :(得分:4)
上面的堆栈跟踪也显示了java堆空间的OOM错误,所以一旦尝试增加内存并运行它并关于超时它的rpc超时,这样你可以设置 spark.network.timeout
超时价值根据您的需要......
答案 2 :(得分:1)
请增加执行程序内存,以便OOM消失,然后在代码中创建chnage,以便RDD
不会有大内存占用。
- executer-memory = 3G
答案 3 :(得分:1)
由于执行程序的内存,您正在看到此问题。 尝试将内存增加到(x 2),以使容器在等待其余容器时不会超时。
答案 4 :(得分:0)
只需将spark.executor.heartbeatInterval
增加到20秒即可。错误说明了这一点。