I am trying to modify the Apache Spark source code. I created a new method and added it to the RDD.scala file in the Spark source I downloaded. After making the change to RDD.scala, I built Spark with

mvn -Dhadoop.version=2.2.0 -DskipTests clean package

I then created a sample Scala Spark application as mentioned here, tried to use the new function I had created, and got a compilation error when building the jar for the application with sbt. How do I compile Spark with my modification and attach the modified jar to my project? The file I modified is RDD.scala in the core project. I run sbt package from the root directory of my Spark application project.
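For reference, a minimal sketch of what such an added method might look like inside RDD.scala; the question does not show the real body of reducePrime, so the delegation to reduce below is purely an assumption for illustration.

// Hypothetical addition inside org.apache.spark.rdd.RDD[T] (RDD.scala).
// The actual implementation of reducePrime is not shown in the question;
// this sketch simply delegates to the existing reduce method.
def reducePrime(f: (T, T) => T): T = {
  reduce(f)
}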
Here is the sbt file:
name := "N Spark"
version := "1.0"
scalaVersion := "2.11.6"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.3.0"
Here is the error:
sbt package
[info] Loading global plugins from /Users/Raggy/.sbt/0.13/plugins
[info] Set current project to Noah Spark (in build file:/Users/r/Downloads/spark-proj/n-spark/)
[info] Updating {file:/Users/r/Downloads/spark-proj/n-spark/}n-spark...
[info] Resolving jline#jline;2.12.1 ...
[info] Done updating.
[info] Compiling 1 Scala source to /Users/r/Downloads/spark-proj/n-spark/target/scala-2.11/classes...
[error] /Users/r/Downloads/spark-proj/n-spark/src/main/scala/SimpleApp.scala:11: value reducePrime is not a member of org.apache.spark.rdd.RDD[Int]
[error] logData.reducePrime(_+_);
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 24 s, completed Apr 11, 2015 2:24:03 AM
UPDATE
Here is the updated sbt file:
name := "N Spark"
version := "1.0"
scalaVersion := "2.10"
libraryDependencies += "org.apache.spark" % "1.3.0"
This file gives the following error:
[info] Loading global plugins from /Users/Raggy/.sbt/0.13/plugins
/Users/Raggy/Downloads/spark-proj/noah-spark/simple.sbt:7: error: No implicit for Append.Value[Seq[sbt.ModuleID], sbt.impl.GroupArtifactID] found,
so sbt.impl.GroupArtifactID cannot be appended to Seq[sbt.ModuleID]
libraryDependencies += "org.apache.spark" % "1.3.0"
Answer 0 (score: 0)
Remove the libraryDependencies line from build.sbt and simply copy your custom-built Spark jar into the lib directory of your application project.
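A minimal sketch of what the application's build.sbt could look like after that change, assuming the jar built from the modified Spark source (for example the assembly jar under assembly/target/scala-2.10/ in the Spark build tree; the exact path and file name depend on your build) has been copied into a lib/ directory at the project root, which sbt treats as unmanaged dependencies by default:

// build.sbt -- no Spark entry in libraryDependencies; the custom-built jar
// sits in lib/ (e.g. lib/spark-assembly-1.3.0-hadoop2.2.0.jar, name assumed)
// and is put on the classpath by sbt as an unmanaged dependency.
name := "N Spark"
version := "1.0"
// Assumes Spark was built against Scala 2.10, the default for the Maven
// command shown above; adjust if you built against a different Scala version.
scalaVersion := "2.10.4"

Running sbt package from the project root should then compile the sample application against the modified RDD class.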