Spark on Mesos - unable to fetch binaries

Date: 2015-05-04 03:11:49

Tags: apache-spark mesos

I am trying to run a Spark job on a Mesos cluster, and it fails while fetching the Spark binary.

I have tried keeping the binary on:

  1. HDFS
  2. The local file system on the slaves

using SPARK_EXECUTOR_URI with the following paths:

File system path - file://home/labadmin/spark-1.2.1.tgz

    I0501 10:27:19.302435 30510 fetcher.cpp:214] Fetching URI 'file://home/labadmin/spark-1.2.1.tgz'
    Failed to fetch: file://home/labadmin/spark-1.2.1.tgz
    Failed to synchronize with slave (it's probably exited)
    

HDFS path without a port - hdfs://ipaddress/spark/spark-1.2.1.tgz

    0427 09:23:21.616092  4842 fetcher.cpp:214] Fetching URI 'hdfs://ipaddress/spark/spark-1.2.1.tgz'
    E0427 09:23:24.710765  4842 fetcher.cpp:113] HDFS copyToLocal failed: /usr/lib/hadoop/bin/hadoop fs -copyToLocal 'hdfs://ipaddress/spark/spark-1.2.1.tgz' '/tmp/mesos/slaves/20150427-054938-2933394698-5050-1030-S0/frameworks/20150427-054938-2933394698-5050-1030-0002/executors/20150427-054938-2933394698-5050-1030-S0/runs/5c13004a-3d8c-40a4-bac4-9c07249e1923/spark-1.2.1.tgz'
    copyToLocal: Call From sclq174.lss.emc.com/ipaddress to sclq174.lss.emc.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    

HDFS path with port 50070 - hdfs://ipaddress:50070/spark/spark-1.2.1.tgz

    I0427 13:34:25.295554 16633 fetcher.cpp:214] Fetching URI 'hdfs://ipaddress:50070/spark/spark-1.2.1.tgz'
    E0427 13:34:28.438596 16633 fetcher.cpp:113] HDFS copyToLocal failed: /usr/lib/hadoop/bin/hadoop fs -copyToLocal 'hdfs://ipaddress:50070/spark/spark-1.2.1.tgz' '/tmp/mesos/slaves/20150427-054938-2933394698-5050-1030-S0/frameworks/20150427-054938-2933394698-5050-1030-0008/executors/20150427-054938-2933394698-5050-1030-S0/runs/2fc7886a-cfff-4cb2-b2f6-25988ca0f8e3/spark-1.2.1.tgz'
    copyToLocal: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: 
    

Any idea why this isn't working?
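
For reference, the URI was supplied through SPARK_EXECUTOR_URI in conf/spark-env.sh, roughly as sketched below. This is a reconstruction from the attempts above, not the exact file; setting SPARK_EXECUTOR_URI (or the spark.executor.uri property) is the standard way to tell Mesos executors where to fetch the Spark tarball:

    # conf/spark-env.sh on the machine running the driver; the Mesos
    # fetcher downloads and unpacks this tarball for each executor
    export SPARK_EXECUTOR_URI=hdfs://ipaddress/spark/spark-1.2.1.tgz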

1 answer:

Answer 0: (score: 1)

Spark supports different ways of fetching binaries:

  • file: - Absolute paths and file:/ URIs are served by the driver's HTTP file server, and every executor pulls the file from the driver's HTTP server.
  • hdfs:, http:, https:, ftp: - these pull down files and JARs from the given URI as expected.
  • local: - a URI starting with local:/ is expected to exist as a local file on each worker node.
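
To make the distinction concrete, here is a sketch of how the three schemes behave with spark-submit (the jar names and paths are invented for the example):

    # file:/ - the jar lives on the driver machine; every executor
    # downloads it from the driver's built-in HTTP file server
    spark-submit --jars file:/home/labadmin/libs/mylib.jar ...

    # hdfs:/ (likewise http:, https:, ftp:) - every executor pulls the
    # jar directly from the given URI
    spark-submit --jars hdfs://namenode:8020/libs/mylib.jar ...

    # local:/ - nothing is copied; the jar must already exist at this
    # exact path on every worker node
    spark-submit --jars local:/opt/libs/mylib.jar ...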

As for the three attempts above:

  1. file://home/labadmin/spark-1.2.1.tgz is not accessible from the driver. You probably want to use a local:/ URI instead.
  2. There is probably no HDFS server running on sclq174.lss.emc.com:8020.
  3. Hadoop does not recognize the URI format; you should replace the hostname with the actual IP address for it to work, e.g. 192.168.1.1:50070.
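
Putting that together, a corrected spark-env.sh might look like the sketch below. The IP, port, and paths are placeholders; in particular, the NameNode RPC port should be taken from fs.defaultFS in core-site.xml rather than assumed:

    # Option A: no fetching at all - the tarball must already sit at
    # this path on every slave (note local:/ rather than file://)
    export SPARK_EXECUTOR_URI=local:/home/labadmin/spark-1.2.1.tgz

    # Option B: fetch from HDFS, using the IP address as suggested
    # above and the RPC port from fs.defaultFS (8020 is an assumption)
    export SPARK_EXECUTOR_URI=hdfs://192.168.1.1:8020/spark/spark-1.2.1.tgz

A quick way to verify the HDFS URI before re-running the job is to list it with the Hadoop CLI:

    hadoop fs -ls hdfs://192.168.1.1:8020/spark/spark-1.2.1.tgz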