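I am new to Scala and SBT build files. According to the introductory tutorials, adding Spark dependencies to a Scala project should be straightforward via the sbt-spark-package plugin, but I get the following error: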
[error] (run-main-0) java.lang.NoClassDefFoundError: org/apache/spark/SparkContext
Please point me to resources on what could be causing this error, as I would like to understand the process more thoroughly.
Code:
import org.apache.spark.sql.SparkSession

trait SparkSessionWrapper {
  // Lazily create (or reuse) a local SparkSession
  lazy val spark: SparkSession = {
    SparkSession
      .builder()
      .master("local")
      .appName("spark citation graph")
      .getOrCreate()
  }
  val sc = spark.sparkContext
}
import org.apache.spark.graphx.GraphLoader

object Test extends SparkSessionWrapper {
  def main(args: Array[String]): Unit = {
    println("Testing, testing, testing, testing...")
    val filePath = "Desktop/citations.txt"
    // Load the citation graph from an edge-list file
    val citeGraph = GraphLoader.edgeListFile(sc, filePath)
    println(citeGraph.vertices.take(1))
  }
}
plugins.sbt
resolvers += "bintray-spark-packages" at "https://dl.bintray.com/spark-packages/maven/"
addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.6")
build.sbt - working. Why does it run once libraryDependencies is added?
spName := "yewno/citation_graph"
version := "0.1"
scalaVersion := "2.11.12"
sparkVersion := "2.2.0"
sparkComponents ++= Seq("core", "sql", "graphx")
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.0",
  "org.apache.spark" %% "spark-sql" % "2.2.0",
  "org.apache.spark" %% "spark-graphx" % "2.2.0"
)
build.sbt - not working. I would expect it to compile and run correctly.
spName := "yewno/citation_graph"
version := "0.1"
scalaVersion := "2.11.12"
sparkVersion := "2.2.0"
sparkComponents ++= Seq("core", "sql", "graphx")
Bonus for an explanation plus links to resources on the SBT build process, jar files, and anything else that can help me get up to speed!
Answer 0 (score: 1)
The sbt-spark-package plugin adds the Spark dependencies in the "provided" scope:
sparkComponentSet.map { component =>
  "org.apache.spark" %% s"spark-$component" % sparkVersion.value % "provided"
}.toSeq
We can confirm this by running show libraryDependencies from sbt:
[info] * org.scala-lang:scala-library:2.11.12
[info] * org.apache.spark:spark-core:2.2.0:provided
[info] * org.apache.spark:spark-sql:2.2.0:provided
[info] * org.apache.spark:spark-graphx:2.2.0:provided
The "provided" scope means:
The dependency will be part of compilation and test, but excluded from the runtime.
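That is why sbt run throws java.lang.NoClassDefFoundError: org/apache/spark/SparkContext. Dependencies listed explicitly in libraryDependencies without a scope default to the compile scope, which is also on the runtime classpath, so the first build.sbt works.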
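To see the effect of the scope directly, one option is a small diagnostic that asks the runtime classloader whether a class can be found. This is only a sketch: the object name ClasspathCheck and the helper isOnClasspath are hypothetical names introduced for illustration and are not part of Spark or sbt.

```scala
object ClasspathCheck {
  /** Returns true if the named class can be loaded from the current runtime classpath.
    * (Hypothetical helper, introduced only for this illustration.) */
  def isOnClasspath(className: String): Boolean =
    try {
      // initialize = false: only locate the class, do not run its static initializers
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit = {
    val name = "org.apache.spark.SparkContext"
    println(s"$name on runtime classpath: ${isOnClasspath(name)}")
  }
}
```

Running this object with sbt run under the plugin-only build.sbt would be expected to report that SparkContext is missing, while the build with explicit libraryDependencies should find it.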
If we really want to include "provided" dependencies on the run classpath, then @douglaz suggests:
run in Compile := Defaults.runTask(fullClasspath in Compile, mainClass in (Compile, run), runner in (Compile, run)).evaluated
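For context, here is how that setting might sit in the plugin-only build.sbt from the question. This is a sketch that assumes the same plugin and versions as above; the only addition is the run override, which uses the Compile classpath (where "provided" dependencies are still present) instead of the Runtime one.

```scala
// build.sbt: plugin-only build from the question plus the run override
spName := "yewno/citation_graph"
version := "0.1"
scalaVersion := "2.11.12"
sparkVersion := "2.2.0"
sparkComponents ++= Seq("core", "sql", "graphx")

// Make `sbt run` use the Compile classpath, which still contains
// the "provided" Spark dependencies added by sbt-spark-package
run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated
```

Newer sbt releases (1.1 and later) also accept the slash spelling, for example Compile / run and Compile / fullClasspath. Whether this plugin version works on those sbt releases is not confirmed here, so the sketch keeps the "in" syntax from the original answer.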