I have deployed a standalone Spark cluster with one driver and 2 executors, each running on a separate machine.
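For context, this is roughly how such a standalone cluster is brought up (a minimal sketch, assuming a stock Spark 2.x distribution under $SPARK_HOME; driver_ip stands in for the real hostname):

# On the machine that runs the master (same machine as my driver):
$SPARK_HOME/sbin/start-master.sh
# On each of the two executor machines, pointing the worker at the master
# (the script is start-slave.sh in Spark 2.x; it was renamed start-worker.sh in 3.1+):
$SPARK_HOME/sbin/start-slave.sh spark://driver_ip:7077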
Whenever I submit a job to the master with spark-submit --master spark://driver_ip:7077 example/src/main/python/pi.py,
it runs indefinitely and keeps printing errors like the following:
BlockManagerMaster:54 - Removal of executor 50 requested
CoarseGrainedSchedulerBackend$DriverEndpoint:54 - Asked to remove non-existent executor 50
BlockManagerMasterEndpoint:54 - Trying to remove executor 50 from BlockManagerMaster.
StandaloneAppClient$ClientEndpoint:54 - Executor updated: app-20181129123913-0003/52 is now RUNNING
StandaloneAppClient$ClientEndpoint:54 - Executor updated: app-20181129123913-0003/51 is now EXITED (Command exited with code 1)
StandaloneSchedulerBackend:54 - Executor app-20181129123913-0003/51 removed: Command exited with code 1
StandaloneAppClient$ClientEndpoint:54 - Executor added: app-20181129123913-0003/53 on worker-20181129120029-10.0.1.101-36599 (10.0.1.101:36599) with 1 core(s)
The number in each Removal of executor message keeps increasing, and the program never finishes. It looks like the executors keep rejecting the job.
Can anyone help me figure this out?
Note that I can see in the Spark Master's web UI that the executors have registered with the driver.
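If it helps, the stderr of each failed executor attempt should be under the worker's work directory in a standalone setup (the path below assumes the default layout; the app and executor IDs are taken from the log above):

cat $SPARK_HOME/work/app-20181129123913-0003/51/stderr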