Why does my Spark job fail on Mesos with "hadoop: not found"?

Date: 2016-04-28 21:19:15

Tags: apache-spark mesos mesosphere

I'm running Spark 1.6.1 with Hadoop 2.6.4 and Mesos 0.28 on Debian 8.

When I try to submit a job to the Mesos cluster via spark-submit, the slave fails with the following in its stderr log:

I0427 22:35:39.626055 48258 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/ad642fcf-9951-42ad-8f86-cc4f5a5cb408-S0\/hduser","items":[{"action":"BYP$
I0427 22:35:39.628031 48258 fetcher.cpp:379] Fetching URI 'hdfs://xxxxxxxxx:54310/sources/spark/SimpleEventCounter.jar'
I0427 22:35:39.628057 48258 fetcher.cpp:250] Fetching directly into the sandbox directory
I0427 22:35:39.628078 48258 fetcher.cpp:187] Fetching URI 'hdfs://xxxxxxx:54310/sources/spark/SimpleEventCounter.jar'
E0427 22:35:39.629243 48258 shell.hpp:93] Command 'hadoop version 2>&1' failed; this is the output:
sh: 1: hadoop: not found
Failed to fetch 'hdfs://xxxxxxx:54310/sources/spark/SimpleEventCounter.jar': Failed to create HDFS client: Failed to execute 'hadoop version 2>&1'; the command was e$
Failed to synchronize with slave (it's probably exited)
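
The Mesos fetcher resolves hdfs:// URIs by shelling out to the Hadoop client, which is exactly the command that fails above. The failing check can be reproduced directly on an agent (a minimal sketch; run it as whatever user the agent launches tasks under, which the sandbox path in the log suggests is hduser):

    # This mirrors the command the fetcher runs before creating its HDFS client
    # (see the shell.hpp line in the log above).
    sh -c 'hadoop version 2>&1'
    # Output "sh: hadoop: not found" means the hadoop binary is not on the PATH
    # of the agent process, so hdfs:// URIs cannot be fetched.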
  1. My jar file contains the Hadoop 2.6 binaries
  2. The path to the Spark executor/binary is an hdfs:// link (a sketch of the full submit command follows this list)
  3. My jobs never show up in the Frameworks tab, but they do appear under Drivers with status "queued"; they just sit there until I shut down the spark-mesos-dispatcher.sh service.
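
For reference, a minimal sketch of the kind of cluster-mode submission described above. The dispatcher host, namenode address, executor tarball name, and main class are illustrative placeholders, not values taken from the question:

    # Submit through the MesosClusterDispatcher (the process started by
    # spark-mesos-dispatcher.sh); both the application jar and the Spark
    # executor package are fetched from HDFS by the Mesos agents.
    spark-submit \
      --class com.example.SimpleEventCounter \
      --master mesos://dispatcher-host:7077 \
      --deploy-mode cluster \
      --conf spark.executor.uri=hdfs://namenode:54310/sources/spark/spark-1.6.1-bin-hadoop2.6.tgz \
      hdfs://namenode:54310/sources/spark/SimpleEventCounter.jar

Because both URIs are hdfs:// links, every agent that runs the driver or an executor has to be able to reach HDFS through the hadoop client, which is where the fetcher error above comes from.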

1 answer:

Answer 0 (score: 0)

I saw a very similar error, and I found my problem was that hadoop_home was not set on the Mesos agent. I added the following line to /etc/default/mesos-slave (your install path may vary) on every mesos-slave: MESOS_hadoop_home="/path/to/my/hadoop/install/folder/"

EDIT: Hadoop has to be installed on every slave; /path/to/my/hadoop/install/folder is a local path.
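
A minimal sketch of the change described above, assuming Hadoop is installed under /usr/local/hadoop (an illustrative path) and that the agent reads its flags from /etc/default/mesos-slave:

    # /etc/default/mesos-slave  -- on every Mesos slave
    # Point the agent at the local Hadoop install so the fetcher can run
    # 'hadoop version' and fetch hdfs:// URIs.
    MESOS_hadoop_home="/usr/local/hadoop"

    # Restart the agent afterwards so it picks up the new environment, e.g.:
    # sudo service mesos-slave restart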