I'm getting an error when calling saveAsTable. Here is my code:
val df = spark.read.jdbc(url, table, "id", 0, 100000000, 4, properties) // partition on column "id" over the range 0..100000000, split into 4 partitions
df.write.saveAsTable("custom_order_1kw")
"custom_order_1kw" is a table in MySQL; it is 700+ MB.
Error log:
WARN spark.HeartbeatReceiver: Removing executor 10 with no recent heartbeats: 166323 ms exceeds timeout 120000 ms
17/04/12 15:55:15 ERROR scheduler.TaskSchedulerImpl: Lost executor 10 on 172.21.102.93: Executor heartbeat timed out after 166323 ms
17/04/12 15:55:15 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 172.21.102.93): ExecutorLostFailure (executor 10 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 166323 ms
17/04/12 15:55:25 ERROR scheduler.TaskSchedulerImpl: Lost executor 10 on 172.21.102.93: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
After the same error occurred four times, the job was aborted.
I'm using spark-shell to test the code:
spark-shell --master spark://172.21.102.93:7077 --executor-memory 4g --driver-cores 1 --executor-cores 1 --driver-memory 8g
If I pick a smaller table to extract (200+ MB), everything works fine!
Any idea what is going wrong?
Answer 0 (score: 0):
In spark-defaults.conf, set spark.network.timeout to a higher value; the default is 120s.
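For example, an illustrative spark-defaults.conf entry might look like the following (the 600s value is only a sketch; pick anything comfortably larger than the heartbeat delays seen in the log):

spark.network.timeout    600s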
You can also pass the same setting directly on the command line, e.g.:
spark-submit --conf spark.network.timeout=10000000 --class myclass.neuralnet.TrainNetSpark --master spark://master.cluster:7077 --driver-memory 30G --executor-memory 14G --num-executors 7 --executor-cores 8 --conf spark.driver.maxResultSize=4g --conf spark.executor.heartbeatInterval=10000000 path/to/my.jar
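Adapted to the spark-shell invocation from the question, the same idea would look roughly like this (the timeout values are only illustrative; keep spark.executor.heartbeatInterval well below spark.network.timeout):

spark-shell --master spark://172.21.102.93:7077 --executor-memory 4g --driver-cores 1 --executor-cores 1 --driver-memory 8g --conf spark.network.timeout=600s --conf spark.executor.heartbeatInterval=60s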