On Mac OS X, I compiled Spark from source using the following command:
jacek:~/oss/spark
$ SPARK_HADOOP_VERSION=2.4.0 SPARK_YARN=true SPARK_HIVE=true SPARK_GANGLIA_LGPL=true xsbt
...
[info] Set current project to root (in build file:/Users/jacek/oss/spark/)
> ; clean ; assembly
...
[info] Packaging /Users/jacek/oss/spark/examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop2.4.0.jar ...
[info] Done packaging.
[info] Done packaging.
[success] Total time: 1964 s, completed May 9, 2014 5:07:45 AM
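When I started ./bin/spark-shell, I noticed the following WARN message:
WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
What could be the issue?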
jacek:~/oss/spark
$ ./bin/spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/05/09 21:11:17 INFO SecurityManager: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/05/09 21:11:17 INFO SecurityManager: Changing view acls to: jacek
14/05/09 21:11:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jacek)
14/05/09 21:11:17 INFO HttpServer: Starting HTTP Server
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.0.0-SNAPSHOT
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0)
Type in expressions to have them evaluated.
Type :help for more information.
...
14/05/09 21:11:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
...
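Answer 0 (score: 19)
The native hadoop library is supported on *nix platforms only. The library does not work with Cygwin or the Mac OS X platform.
The native hadoop library is mainly used on the GNU/Linux platform and has been tested on these distributions:
- RHEL4/Fedora
- Ubuntu
- Gentoo
On all of the above distributions, a 32/64-bit native hadoop library will work with the corresponding 32/64-bit JVM.
On Mac OS X it seems the WARN message can simply be ignored, since the native library does not exist for that platform.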
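If the message is only noise, one possible way to hide it is to raise the log level for the NativeCodeLoader logger in Spark's log4j configuration. This is a minimal sketch and not part of the original answer. The function name silence_native_warn is hypothetical, and it assumes the configuration lives in conf/log4j.properties inside the Spark checkout.

```shell
#!/bin/sh
# Sketch (assumption, not from the answer): hide the NativeCodeLoader WARN
# by setting that logger to ERROR in conf/log4j.properties.
silence_native_warn() {
  conf_dir="$1"                       # e.g. ~/oss/spark/conf
  props="$conf_dir/log4j.properties"
  # Start from the template in the conf directory if no properties file exists yet.
  if [ ! -f "$props" ] && [ -f "$props.template" ]; then
    cp "$props.template" "$props"
  fi
  line='log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR'
  # Append only once so that repeated runs do not duplicate the line.
  grep -qxF "$line" "$props" 2>/dev/null || echo "$line" >> "$props"
}
```

Calling silence_native_warn ./conf from the Spark directory and restarting ./bin/spark-shell should then drop the warning. The library is still not loaded; only the log line is suppressed.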
Answer 1 (score: 4)
In my experience, if you cd into /sparkDir/conf, rename spark-env.sh.template to spark-env.sh, and then set JAVA_OPTS and hadoop_DIR, it works.
You will also have to edit the corresponding line in /etc/profile.
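The steps in this answer could be scripted roughly as follows. This is only a sketch: the answer does not say what values to use, so the value written for JAVA_OPTS (a java.library.path pointing at lib/native under the Hadoop directory) is an assumption, as are the function name setup_spark_env and the example paths. It only makes sense on a platform where the native library actually exists.

```shell
#!/bin/sh
# Sketch of the steps from the answer. The values written for JAVA_OPTS and
# hadoop_DIR are assumptions; the answer does not state them.
setup_spark_env() {
  spark_dir="$1"    # Spark directory, e.g. ~/oss/spark
  hadoop_dir="$2"   # hypothetical Hadoop install directory
  conf="$spark_dir/conf"
  # The answer renames the template; copying keeps the original around.
  cp "$conf/spark-env.sh.template" "$conf/spark-env.sh" || return 1
  {
    echo "export hadoop_DIR=\"$hadoop_dir\""
    # Assumed value: point the JVM at Hadoop's native library directory.
    echo "export JAVA_OPTS=\"-Djava.library.path=$hadoop_dir/lib/native\""
  } >> "$conf/spark-env.sh"
}
```

For example, setup_spark_env ~/oss/spark /opt/hadoop would append the two export lines to conf/spark-env.sh; /opt/hadoop is a placeholder path.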