我正在尝试使用Maven构建Spark 1.2。我的目标是在Hadoop 2.2上使用PySpark和YARN。
我看到这只能通过使用Maven构建Spark来实现。首先,这是真的吗?
如果是,那么下面的日志中有什么问题?我该如何纠正?
C:\Spark\spark-1.2.0>mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests
clean package
Picked up _JAVA_OPTIONS: -Djava.net.preferIPv4Stack=true
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] Spark Project Parent POM
[INFO] Spark Project Networking
[INFO] Spark Project Shuffle Streaming Service
[INFO] Spark Project Core
[INFO] Spark Project Bagel
[INFO] Spark Project GraphX
[INFO] Spark Project Streaming
[INFO] Spark Project Catalyst
[INFO] Spark Project SQL
[INFO] Spark Project ML Library
[INFO] Spark Project Tools
[INFO] Spark Project Hive
[INFO] Spark Project REPL
[INFO] Spark Project YARN Parent POM
[INFO] Spark Project YARN Stable API
[INFO] Spark Project Assembly
[INFO] Spark Project External Twitter
[INFO] Spark Project External Flume Sink
[INFO] Spark Project External Flume
[INFO] Spark Project External MQTT
[INFO] Spark Project External ZeroMQ
[INFO] Spark Project External Kafka
[INFO] Spark Project Examples
[INFO] Spark Project YARN Shuffle Service
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Spark Project Parent POM 1.2.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ spark-parent ---
[INFO] Deleting C:\Spark\spark-1.2.0\target
[INFO]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-versions) @ spark-parent
---
[INFO]
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-scala-sources) @ spark-
parent ---
[INFO] Source directory: C:\Spark\spark-1.2.0\src\main\scala added.
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ spark-parent --
-
[INFO]
[INFO] --- scala-maven-plugin:3.2.0:compile (scala-compile-first) @ spark-parent
---
[INFO] No sources to compile
[INFO]
[INFO] --- build-helper-maven-plugin:1.8:add-test-source (add-scala-test-sources
) @ spark-parent ---
[INFO] Test Source directory: C:\Spark\spark-1.2.0\src\test\scala added.
[INFO]
[INFO] --- scala-maven-plugin:3.2.0:testCompile (scala-test-compile-first) @ spa
rk-parent ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-dependency-plugin:2.9:build-classpath (default) @ spark-parent
---
[INFO] Wrote classpath file 'C:\Spark\spark-1.2.0\target\spark-test-classpath.tx
t'.
[INFO]
[INFO] --- gmavenplus-plugin:1.2:execute (default) @ spark-parent ---
[INFO] Using Groovy 2.3.7 to perform execute.
[INFO]
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-p
arent ---
[INFO]
[INFO] --- maven-shade-plugin:2.2:shade (default) @ spark-parent ---
[INFO] Including org.spark-project.spark:unused:jar:1.0.0 in the shaded jar.
[INFO] Replacing original artifact with shaded artifact.
[INFO]
[INFO] --- maven-source-plugin:2.2.1:jar-no-fork (create-source-jar) @ spark-par
ent ---
[INFO]
[INFO] --- scalastyle-maven-plugin:0.4.0:check (default) @ spark-parent ---
[WARNING] sourceDirectory is not specified or does not exist value=C:\Spark\spar
k-1.2.0\src\main\scala
Saving to outputFile=C:\Spark\spark-1.2.0\scalastyle-output.xml
Processed 0 file(s)
Found 0 errors
Found 0 warnings
Found 0 infos
Finished in 32 ms
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Spark Project Networking 1.2.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ spark-network-common_2
.10 ---
[INFO] Deleting C:\Spark\spark-1.2.0\network\common\target
[INFO]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-versions) @ spark-networ
k-common_2.10 ---
[INFO]
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-scala-sources) @ spark-
network-common_2.10 ---
[INFO] Source directory: C:\Spark\spark-1.2.0\network\common\src\main\scala adde
d.
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ spark-network-c
ommon_2.10 ---
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ spark-netw
ork-common_2.10 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory C:\Spark\spark-1.2.0\network\common\s
rc\main\resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- scala-maven-plugin:3.2.0:compile (scala-compile-first) @ spark-networ
k-common_2.10 ---
[WARNING] Zinc server is not available at port 3030 - reverting to normal increm
ental compile
[INFO] Using incremental compilation
[INFO] compiler plugin: BasicArtifact(org.scalamacros,paradise_2.10.4,2.0.1,null
)
[INFO] Compiling 42 Java sources to C:\Spark\spark-1.2.0\network\common\target\s
cala-2.10\classes...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM ........................... SUCCESS [ 5.267 s]
[INFO] Spark Project Networking ........................... FAILURE [ 1.922 s]
[INFO] Spark Project Shuffle Streaming Service ............ SKIPPED
[INFO] Spark Project Core ................................. SKIPPED
[INFO] Spark Project Bagel ................................ SKIPPED
[INFO] Spark Project GraphX ............................... SKIPPED
[INFO] Spark Project Streaming ............................ SKIPPED
[INFO] Spark Project Catalyst ............................. SKIPPED
[INFO] Spark Project SQL .................................. SKIPPED
[INFO] Spark Project ML Library ........................... SKIPPED
[INFO] Spark Project Tools ................................ SKIPPED
[INFO] Spark Project Hive ................................. SKIPPED
[INFO] Spark Project REPL ................................. SKIPPED
[INFO] Spark Project YARN Parent POM ...................... SKIPPED
[INFO] Spark Project YARN Stable API ...................... SKIPPED
[INFO] Spark Project Assembly ............................. SKIPPED
[INFO] Spark Project External Twitter ..................... SKIPPED
[INFO] Spark Project External Flume Sink .................. SKIPPED
[INFO] Spark Project External Flume ....................... SKIPPED
[INFO] Spark Project External MQTT ........................ SKIPPED
[INFO] Spark Project External ZeroMQ ...................... SKIPPED
[INFO] Spark Project External Kafka ....................... SKIPPED
[INFO] Spark Project Examples ............................. SKIPPED
[INFO] Spark Project YARN Shuffle Service ................. SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.048 s
[INFO] Finished at: 2015-02-09T10:17:47+08:00
[INFO] Final Memory: 49M/331M
[INFO] ------------------------------------------------------------------------
[**ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.0:compi
le (scala-compile-first) on project spark-network-common_2.10: wrap: java.io.IOE
xception: Cannot run program "javac": CreateProcess error=2, The system cannot f
ind the file specified -> [Help 1]**
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e swit
ch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please rea
d the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionE
xception
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :spark-network-common_2.10
答案 0 :(得分:1)
我首先安装了JRE而不是JDK。我的环境变量仍引用了JRE文件夹,因此无法找到javac.exe二进制文件。
答案 1 :(得分:1)
使用Spark构建的“怪癖”是,如果确定需要,它可以下载自己的Maven版本。
当您运行./build/mvn clean package时,不是直接运行Maven,而是运行Spark专有脚本。脚本要做的第一件事是检查您的mvn --version是否足够新,是否适合项目确定所需的版本(已在pom.xml文件中设置)。
这很重要,因为如果您运行的是旧版本的maven,Spark可能会下载其他maven版本并安装并使用它。
一些关键的事情:
谢谢
答案 2 :(得分:0)
对于此问题,您需要在.bashrc
文件中正确设置Java环境路径。然后,您需要在该集合maven路径上构建maven,以进行该检查mvn -version
。
然后它将自动构建而不会出错。