Modifying and building Spark core

Asked: 2015-04-11 06:40:27

Tags: scala sbt apache-spark rdd

I am trying to modify the Apache Spark source code. I created a new method and added it to the RDD.scala file in the Spark source I downloaded. After making the change to RDD.scala, I built Spark with
mvn -Dhadoop.version=2.2.0 -DskipTests clean package

I then created a sample Scala Spark application as described here. When I tried to use the new function I created, I got a compilation error while building the jar with sbt. How can I compile Spark with my modification and attach the modified jar to my project? The file I modified is RDD.scala in the core project. I run sbt package from the root directory of my Spark application project.
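(For context, a new method added to RDD[T] in core/src/main/scala/org/apache/spark/rdd/RDD.scala would look roughly like the sketch below; the actual body of reducePrime is not shown in the question, so delegating to the existing reduce here is purely an illustrative assumption.)

def reducePrime(f: (T, T) => T): T = {
  // Illustrative placeholder only; the real implementation is not part of this question.
  // Delegates to the existing RDD.reduce so the sketch compiles inside the surrounding class.
  reduce(f)
}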

Here is the sbt file:

name := "N Spark"

version := "1.0"

scalaVersion := "2.11.6"

libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.3.0"

Here is the error:

sbt package
[info] Loading global plugins from /Users/Raggy/.sbt/0.13/plugins
[info] Set current project to Noah Spark (in build file:/Users/r/Downloads/spark-proj/n-spark/)
[info] Updating {file:/Users/r/Downloads/spark-proj/n-spark/}n-spark...
[info] Resolving jline#jline;2.12.1 ...
[info] Done updating.
[info] Compiling 1 Scala source to /Users/r/Downloads/spark-proj/n-spark/target/scala-2.11/classes...
[error] /Users/r/Downloads/spark-proj/n-spark/src/main/scala/SimpleApp.scala:11: value reducePrime is not a member of org.apache.spark.rdd.RDD[Int]
[error]     logData.reducePrime(_+_);
[error]             ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 24 s, completed Apr 11, 2015 2:24:03 AM

UPDATE: Here is the updated sbt file:

name := "N Spark"

version := "1.0"

scalaVersion := "2.10"

libraryDependencies += "org.apache.spark" % "1.3.0"

This file gives the following error:

[info] Loading global plugins from /Users/Raggy/.sbt/0.13/plugins
/Users/Raggy/Downloads/spark-proj/noah-spark/simple.sbt:7: error: No implicit for Append.Value[Seq[sbt.ModuleID], sbt.impl.GroupArtifactID] found,
  so sbt.impl.GroupArtifactID cannot be appended to Seq[sbt.ModuleID]
libraryDependencies += "org.apache.spark" % "1.3.0"
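(For reference, this error appears because "org.apache.spark" % "1.3.0" only forms a groupID % artifactID pair rather than a complete module ID, and scalaVersion must be a full version number. A complete coordinate has the shape groupID %% artifactID % version, for example:)

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"

(Note, however, that the stock artifact from Maven Central does not contain the new reducePrime method, which is why the answer below takes a different approach.)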

1 Answer:

Answer 0 (score: 0):

Delete libraryDependencies from build.sbt and just copy your custom-built Spark jar to the lib directory in your application project. The compile error happens because sbt resolves the stock spark-core artifact from the repository, which does not contain your new method; the jar you built yourself does.
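A minimal sketch of that setup, assuming the Maven build above produced an assembly jar somewhere like assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.2.0.jar (the exact path and name depend on your build options): copy that jar into /Users/r/Downloads/spark-proj/n-spark/lib/, which sbt uses as the default unmanaged dependency directory, and trim the build file down to

name := "N Spark"

version := "1.0"

// Must match the Scala version the custom Spark jar was built with; a default Spark 1.3.0 build uses Scala 2.10.
scalaVersion := "2.10.4"

// No spark-core entry here: the custom-built jar in lib/ is picked up automatically as an unmanaged dependency.

Then run sbt package again from the application's root directory; the compiler now resolves reducePrime against the modified jar instead of the published spark-core artifact.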