Spark cluster: Initial job has not accepted any resources, executors keep exiting

Time: 2018-04-16 13:38:00

Tags: apache-spark cluster-computing exit executor

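I am running a Spark cluster on two cloud instances: one is the master and one is the worker. The total resources are 4 cores and 10 GB of memory. I can start the shell, and the worker registers successfully, but things fail as soon as I run some simple code.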

The shell reports an error (shown further down). Spark version: 2.3.0. System: CentOS 7. The firewall is stopped.

Here is the configuration:

export JAVA_HOME=/usr/java/jdk1.8.0_144
export SPARK_MASTER_IP=IP
export PYSPARK_PYTHON=/opt/anaconda3/bin/python
export SPARK_WORKER_MEMORY=2g
export SPARK_WORK_INSTANCES=1
export SPARK_WORkER_CORES=4
export SPARK_EXECUTOR_MEMORY=1g
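
As a quick sanity check on a file like this, the exported names can be compared against the variable names Spark's standalone scripts read. The sketch below is only an illustration: `KNOWN_VARS` is a hand-picked subset assumed for this example, not a complete list, and `unknown_exports` is a hypothetical helper name.

```python
import re

# Illustrative subset of variable names read by Spark standalone scripts
# (an assumption for this sketch, not an exhaustive list).
KNOWN_VARS = {
    "JAVA_HOME", "PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON",
    "SPARK_MASTER_HOST", "SPARK_MASTER_IP",  # SPARK_MASTER_IP is the older name
    "SPARK_MASTER_PORT", "SPARK_MASTER_WEBUI_PORT",
    "SPARK_WORKER_CORES", "SPARK_WORKER_MEMORY", "SPARK_WORKER_INSTANCES",
    "SPARK_WORKER_PORT", "SPARK_WORKER_WEBUI_PORT", "SPARK_WORKER_DIR",
    "SPARK_EXECUTOR_MEMORY", "SPARK_LOCAL_IP", "SPARK_PUBLIC_DNS",
}

# Matches lines of the form: export NAME=value
EXPORT_RE = re.compile(r"^\s*export\s+([A-Za-z_][A-Za-z0-9_]*)=")


def unknown_exports(env_text):
    """Return exported variable names that are not in KNOWN_VARS, in file order."""
    names = []
    for line in env_text.splitlines():
        match = EXPORT_RE.match(line)
        if match and match.group(1) not in KNOWN_VARS:
            names.append(match.group(1))
    return names


# The configuration from the question, copied verbatim.
SPARK_ENV = """\
export JAVA_HOME=/usr/java/jdk1.8.0_144
export SPARK_MASTER_IP=IP
export PYSPARK_PYTHON=/opt/anaconda3/bin/python
export SPARK_WORKER_MEMORY=2g
export SPARK_WORK_INSTANCES=1
export SPARK_WORkER_CORES=4
export SPARK_EXECUTOR_MEMORY=1g
"""

print(unknown_exports(SPARK_ENV))
# → ['SPARK_WORK_INSTANCES', 'SPARK_WORkER_CORES']
```

The two flagged names look like misspellings of `SPARK_WORKER_INSTANCES` and `SPARK_WORKER_CORES`. The executor was still launched with `--cores 4`, so this is probably not what makes the executors exit, but the settings as written would presumably be ignored.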

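I set up another Spark cluster on three physical machines with a similar configuration, and it runs fine. At first I got the same error there, but I solved it by stopping the firewall. I want to set up the cluster on the cloud, but unfortunately I ran into the same error there, and the same fix did not resolve it. I wonder whether it is a port problem, because I only opened ports 80 (HTTP), 4040, 6066, 7077, 8080, 8081, and 8787.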
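
One way to test the port hypothesis is to try a plain TCP connection from one instance to the other for each port of interest. The following is a minimal sketch using only the Python standard library; `is_port_open` and `check_ports` are hypothetical helper names, and the 3-second timeout is an arbitrary choice.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether a TCP connection to it could be opened."""
    return {port: is_port_open(host, port, timeout) for port in ports}
```

Running something like `check_ports("spark-master.novalocal", [7077, 8080, 39928])` on the worker instance would show which ports on the master are reachable from there. A port only shows as open if a process is listening on it, so the check is meaningful only while the master and the shell are running.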

Here is the error:

(screenshot not preserved)

Here are the logs:

Master log:

2018-04-12 13:09:14 INFO  Master:54 - Registering app Spark shell
2018-04-12 13:09:14 INFO  Master:54 - Registered app Spark shell with ID app-20180412130914-0000
2018-04-12 13:09:14 INFO  Master:54 - Launching executor app-20180412130914-0000/0 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:11:15 INFO  Master:54 - Removing executor app-20180412130914-0000/0 because it is EXITED
2018-04-12 13:11:15 INFO  Master:54 - Launching executor app-20180412130914-0000/1 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:13:16 INFO  Master:54 - Removing executor app-20180412130914-0000/1 because it is EXITED
2018-04-12 13:13:16 INFO  Master:54 - Launching executor app-20180412130914-0000/2 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:15:17 INFO  Master:54 - Removing executor app-20180412130914-0000/2 because it is EXITED
2018-04-12 13:15:17 INFO  Master:54 - Launching executor app-20180412130914-0000/3 on worker worker-20180411144020-192.**.**.**-44986
2018-04-12 13:16:15 INFO  Master:54 - Removing app app-20180412130914-0000
2018-04-12 13:16:15 INFO  Master:54 - 192.**.**.**:39766 got disassociated, removing it.
2018-04-12 13:16:15 INFO  Master:54 - IP:39928 got disassociated, removing it.
2018-04-12 13:16:15 WARN  Master:66 - Got status update for unknown executor app-20180412130914-0000/3

Worker log:

2018-04-12 13:09:12 INFO  Worker:54 - Asked to launch executor app-20180412130914-0000/0 for Spark shell
2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing view acls to: root
2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing modify acls to: root
2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing view acls groups to: 
2018-04-12 13:09:12 INFO  SecurityManager:54 - Changing modify acls groups to: 
2018-04-12 13:09:12 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2018-04-12 13:09:12 INFO  ExecutorRunner:54 - Launch command: "/usr/java/jdk1.8.0_144/bin/java" "-cp" "/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" "-Xmx1024M" "-Dspark.driver.port=39928" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@IP:39928" "--executor-id" "0" "--hostname" "192.**.**.**" "--cores" "4" "--app-id" "app-20180412130914-0000" "--worker-url" "spark://Worker@192.**.**.**:44986"
2018-04-12 13:11:13 INFO  Worker:54 - Executor app-20180412130914-0000/0 finished with state EXITED message Command exited with code 1 exitStatus 1
2018-04-12 13:11:13 INFO  Worker:54 - Asked to launch executor app-20180412130914-0000/1 for Spark shell
2018-04-12 13:11:13 INFO  SecurityManager:54 - Changing view acls to: root
2018-04-12 13:11:13 INFO  SecurityManager:54 - Changing modify acls to: root
2018-04-12 13:11:13 INFO  SecurityManager:54 - Changing view acls groups to: 
2018-04-12 13:11:13 INFO  SecurityManager:54 - Changing modify acls groups to: 
2018-04-12 13:11:13 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2018-04-12 13:11:13 INFO  ExecutorRunner:54 - Launch command: "/usr/java/jdk1.8.0_144/bin/java" "-cp" "/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" "-Xmx1024M" "-Dspark.driver.port=39928" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@spark-master.novalocal:39928" "--executor-id" "1" "--hostname" "192.**.**.**" "--cores" "4" "--app-id" "app-20180412130914-0000" "--worker-url" "spark://Worker@192.**.**.**:44986"
2018-04-12 13:13:15 INFO  Worker:54 - Executor app-20180412130914-0000/1 finished with state EXITED message Command exited with code 1 exitStatus 1
2018-04-12 13:13:15 INFO  Worker:54 - Asked to launch executor app-20180412130914-0000/2 for Spark shell
2018-04-12 13:13:15 INFO  SecurityManager:54 - Changing view acls to: root
2018-04-12 13:13:15 INFO  SecurityManager:54 - Changing modify acls to: root
2018-04-12 13:13:15 INFO  SecurityManager:54 - Changing view acls groups to: 
2018-04-12 13:13:15 INFO  SecurityManager:54 - Changing modify acls groups to: 
2018-04-12 13:13:15 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2018-04-12 13:13:15 INFO  ExecutorRunner:54 - Launch command: "/usr/java/jdk1.8.0_144/bin/java" "-cp" "/opt/spark-2.3.0-bin-hadoop2.7/conf/:/opt/spark-2.3.0-bin-hadoop2.7/jars/*" "-Xmx1024M" "-Dspark.driver.port=39928" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@spark-master.novalocal:39928" "--executor-id" "2" "--hostname" "192.**.**.**" "--cores" "4" "--app-id" "app-20180412130914-0000" "--worker-url" "spark://Worker@192.**.**.**:44986"
2018-04-12 13:15:16 INFO  Worker:54 - Executor app-20180412130914-0000/2 finished with state EXITED message Command exited with code 1 exitStatus 1
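
To see which host and port the executors are told to connect back to, the `--driver-url` argument can be pulled out of the launch commands in the worker log. This is a small sketch under the assumption that the log lines have the format shown above; `driver_endpoints` is a hypothetical helper name.

```python
import re

# Captures host and port from: "--driver-url" "spark://<name>@<host>:<port>"
DRIVER_URL_RE = re.compile(r'"--driver-url"\s+"spark://[^@"]+@([^:"]+):(\d+)"')


def driver_endpoints(log_text):
    """Return the distinct (host, port) pairs found in --driver-url arguments, in order."""
    seen = []
    for host, port in DRIVER_URL_RE.findall(log_text):
        pair = (host, int(port))
        if pair not in seen:
            seen.append(pair)
    return seen


# Shortened excerpt of one launch command from the worker log.
SAMPLE = (
    '"-Dspark.driver.port=39928" '
    '"org.apache.spark.executor.CoarseGrainedExecutorBackend" '
    '"--driver-url" "spark://CoarseGrainedScheduler@spark-master.novalocal:39928" '
    '"--executor-id" "1"'
)

print(driver_endpoints(SAMPLE))
# → [('spark-master.novalocal', 39928)]
```

In the log above, every launch command points at driver port 39928, which is not among the ports listed as opened. Spark chooses the driver port at random unless `spark.driver.port` is set, so a fixed value could be configured and opened. The logs here do not confirm whether that is the cause of the exits.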

0 answers:

No answers