I have deployed a Spark 2.4.0 cluster across three machines running Ubuntu Server 18.04 (Bionic Beaver): one master and two workers, and they have successfully established connections with each other.
I want to run a job (in this case a Java program) through spark-submit in cluster mode, but the job does not execute.
The program works fine when I run it in local mode on the master and gives me the expected output, so the program itself is not the problem.
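For reference, the application is an ordinary Spark batch job. Below is a minimal sketch of its overall shape; the class name, path, and workload are placeholders, not my actual code. The point is only that no master URL is hardcoded, so the master should come entirely from spark-submit:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class App {
    public static void main(String[] args) {
        // The master is supplied by spark-submit; nothing is hardcoded here.
        SparkSession spark = SparkSession.builder()
                .appName("MyProgram")
                .getOrCreate();

        // Placeholder workload: read a text file and count its lines.
        Dataset<Row> lines = spark.read().text("/home/spark/input.txt");
        System.out.println("Line count: " + lines.count());

        spark.stop();
    }
}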
I am launching the job with the following command:
spark-submit --class path.to.my.main.class.App --master spark://192.168.0.2:7077 --deploy-mode cluster MyProgram.jar
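For comparison, the successful local run on the master uses roughly this invocation (shown here as an approximation of what I run):

spark-submit --class path.to.my.main.class.App --master local[*] MyProgram.jar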
Here are the log files:
Master log:
2019-03-26 13:04:34 INFO Master:54 - Driver submitted org.apache.spark.deploy.worker.DriverWrapper
2019-03-26 13:04:34 INFO Master:54 - Launching driver driver-20190326130434-0003 on worker worker-20190326121118-192.168.0.4-38962
2019-03-26 13:04:36 INFO Master:54 - Removing driver: driver-20190326130434-0003
2019-03-26 13:04:39 INFO Master:54 - 192.168.0.2:49178 got disassociated, removing it.
2019-03-26 13:04:39 INFO Master:54 - 192.168.0.2:36778 got disassociated, removing it.
Worker log:
2019-03-26 13:04:34 INFO Worker:54 - Asked to launch driver driver-20190326130434-0003
2019-03-26 13:04:34 INFO DriverRunner:54 - Copying user jar file:/home/spark/MyProgram.jar to /home/spark/spark-2.4.0-bin-hadoop2.7/work/driver-20190326130434-0003/MyProgram.jar
2019-03-26 13:04:34 INFO Utils:54 - Copying /home/spark/MyProgram.jar to /home/spark/spark-2.4.0-bin-hadoop2.7/work/driver-20190326130434-0003/MyProgram.jar
2019-03-26 13:04:34 INFO DriverRunner:54 - Launch Command: "/usr/lib/jvm/java-8-oracle//bin/java" "-cp" "/home/spark/spark-2.4.0-bin-hadoop2.7//conf/:/home/spark/spark-2.4.0-bin-hadoop2.7/jars/*" "-Xmx1024M" "-Dspark.driver.supervise=false" "-Dspark.submit.deployMode=cluster" "-Dspark.jars=file:/home/spark/MyProgram.jar" "-Dspark.master=spark://192.168.0.2:7077" "-Dspark.app.name=path.to.my.main.class.App" "-Dspark.rpc.askTimeout=10s" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@192.168.0.4:38962" "/home/spark/spark-2.4.0-bin-hadoop2.7/work/driver-20190326130434-0003/MyProgram.jar" "path.to.my.main.class.App"
2019-03-26 13:04:36 WARN Worker:66 - Driver driver-20190326130434-0003 exited with failure