Spark + Mesos cluster mode: who uploads the jar?

Asked: 2015-11-29 02:16:16

Tags: apache-spark mesos

I am trying to run a Spark application in Mesos cluster mode. (Client mode already works for me, but I still want to try cluster mode.)

I started spark-mesos-dispatcher on the Mesos master node.

When I submit the assembly located at the local path /tmp/assembly.jar with the following command:

bin/spark-submit --master mesos://dispatcher:7077 --deploy-mode cluster --class com.example.Example /tmp/assembly.jar

it fails because the file /tmp/assembly.jar does not exist on the Mesos slave node:

I1129 10:47:43.839771  5884 fetcher.cpp:414] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/9d725348-931a-48fb-96f7-d29a4b09f3e8-S9\/deploy","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/tmp\/assembly.jar"}}],"sandbox_directory":"\/var\/lib\/mesos\/slaves\/9d725348-931a-48fb-96f7-d29a4b09f3e8-S9\/frameworks\/9d725348-931a-48fb-96f7-d29a4b09f3e8-0291\/executors\/driver-20151129104742-0008\/runs\/31bf5840-226e-4b87-ae76-d14bd2f17950","user":"user"}
I1129 10:47:43.840710  5884 fetcher.cpp:369] Fetching URI '/tmp/assembly.jar'
I1129 10:47:43.840721  5884 fetcher.cpp:243] Fetching directly into the sandbox directory
I1129 10:47:43.840731  5884 fetcher.cpp:180] Fetching URI '/tmp/assembly.jar'
I1129 10:47:43.840737  5884 fetcher.cpp:160] Copying resource with command:cp '/tmp/assembly.jar' '/var/lib/mesos/slaves/9d725348-931a-48fb-96f7-d29a4b09f3e8-S9/frameworks/9d725348-931a-48fb-96f7-d29a4b09f3e8-0291/executors/driver-20151129104742-0008/runs/31bf5840-226e-4b87-ae76-d14bd2f17950/assembly.jar'
cp: cannot stat `/tmp/assembly.jar': No such file or directory
Failed to fetch '/tmp/assembly.jar': Failed to copy with command 'cp '/tmp/assembly.jar' '/var/lib/mesos/slaves/9d725348-931a-48fb-96f7-d29a4b09f3e8-S9/frameworks/9d725348-931a-48fb-96f7-d29a4b09f3e8-0291/executors/driver-20151129104742-0008/runs/31bf5840-226e-4b87-ae76-d14bd2f17950/assembly.jar'', exit status: 256
Failed to synchronize with slave (it's probably exited)

In YARN cluster mode, Spark's YARN client implementation will upload the application jar to HDFS so that the driver and all executors have access to the jar, but I cannot find any such code in RestSubmissionClient, which is what the Mesos and Standalone cluster modes use.

Who does the upload in this case? Or do I need to manually place the application assembly somewhere reachable through an HTTP URI?

1 Answer:

Answer 0 (score: 0)

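As far as I understand, you can use the SparkContext addJar() method to add the path of a JAR file that is local to the driver application; it is then distributed to the executor nodes (in client mode).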

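Since you say you want to use cluster mode, I suggest taking a look at the Spark Jobserver project, which should make running Spark applications on Mesos easier than using the built-in tools.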
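
On the manual option raised in the question: the Spark documentation for running on Mesos notes that jars passed to spark-submit in cluster mode should be URIs reachable by the Mesos slaves, because local jars are not uploaded automatically. A minimal sketch of that approach follows. The file server host, port, and URL are hypothetical placeholders, and publishing the jar at that URL is assumed to happen out of band; the script only builds and prints the submit command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical location: replace with any URI every Mesos slave can fetch
# (for example http://, or hdfs:// if the slaves have a Hadoop client).
JAR_URI="http://fileserver.example.com:8000/assembly.jar"

# Step 1 (out of band, not shown): publish /tmp/assembly.jar at $JAR_URI.

# Step 2: same command as in the question, with the remote URI
# in place of the local path /tmp/assembly.jar.
SUBMIT_CMD="bin/spark-submit --master mesos://dispatcher:7077 --deploy-mode cluster --class com.example.Example $JAR_URI"

# Print the command for review instead of running it.
echo "$SUBMIT_CMD"
```

Once the jar is actually being served at that URI, the printed command can be run as-is; the Mesos fetcher then downloads the jar into the driver's sandbox instead of attempting a local cp.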