I have a Spark application with an sbt build file, shown below. It works on my local machine, but when I submit it to EMR running Spark 1.6.1, the following error occurs:
java.lang.NoClassDefFoundError: net/liftweb/json/JsonAST$JValue
I am building the jar with sbt package.
Build.sbt:
organization := "com.foo"
name := "FooReport"
version := "1.0"
scalaVersion := "2.10.6"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.6.1"
,"net.liftweb" % "lift-json_2.10" % "2.6.3"
,"joda-time" % "joda-time" % "2.9.4"
)
Do you have any idea what is going on?
Answer 0 (score: 0)
I found a solution, and it works!
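The problem is that sbt package does not include the dependency jars in the output jar, so lift-json is missing at runtime on the cluster. To work around this I tried sbt-assembly, but I got a lot of "deduplicate" errors when I ran it.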
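As a quick way to confirm that the error really is a missing class on the runtime classpath, a small check like the one below can be run with the same jar. This is only a diagnostic sketch: the object name ClasspathCheck and the helper isOnClasspath are made up for illustration, and the class name is the one from the error message with '/' replaced by '.'.

```scala
// Diagnostic sketch (hypothetical helper, not part of the original project):
// reports whether a class can be loaded from the current classpath.
object ClasspathCheck {

  // Returns true if the named class can be loaded, false if it is missing.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit = {
    // Class name taken from the error message, with '/' replaced by '.'.
    val missing = "net.liftweb.json.JsonAST$JValue"
    println(missing + " on classpath: " + isOnClasspath(missing))
  }
}
```

With a jar built by sbt package this would be expected to report the class as missing on the cluster, since lift-json is not bundled; with a fat jar it should be found.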
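Eventually I came across this blog post, which explains everything clearly: http://queirozf.com/entries/creating-scala-fat-jars-for-spark-on-sbt-with-sbt-assembly-plugin
To submit a Spark job to a Spark cluster (via spark-submit), you need to include all dependencies (other than Spark itself) in the jar, otherwise you won't be able to use them in your job.
The addSbtPlugin line goes in a file under the project directory (for example project/assembly.sbt); the assemblyMergeStrategy block goes in build.sbt: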
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
case PathList("org", "apache", xs @ _*) => MergeStrategy.last
case PathList("com", "google", xs @ _*) => MergeStrategy.last
case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
case "about.html" => MergeStrategy.rename
case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
case "META-INF/mailcap" => MergeStrategy.last
case "META-INF/mimetypes.default" => MergeStrategy.last
case "plugin.properties" => MergeStrategy.last
case "log4j.properties" => MergeStrategy.last
case x =>
  val oldStrategy = (assemblyMergeStrategy in assembly).value
  oldStrategy(x)
}
Then run sbt assembly.
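As far as I know, sbt-assembly writes the fat jar under target/scala-2.10/ with a default name of the form <name>-assembly-<version>.jar. If a fixed name or an explicit entry point is wanted, settings along these lines can be added to build.sbt. This is a sketch, not part of the original build: the main class com.foo.FooReportApp and the jar name are hypothetical placeholders.

```scala
// build.sbt (sketch): optional sbt-assembly settings

// Hypothetical entry point; replace with the application's real main class.
mainClass in assembly := Some("com.foo.FooReportApp")

// Fixed output name instead of the default <name>-assembly-<version>.jar.
assemblyJarName in assembly := "FooReport-assembly.jar"
```

The resulting jar is the one to pass to spark-submit.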
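Now you have one big fat jar containing all the dependencies. It can be hundreds of MB, depending on the libraries you depend on. In my case I am using AWS EMR, which already has Spark 1.6.1 installed. To exclude the spark-core library from the jar, you can mark the dependency as "provided":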
"org.apache.spark" %% "spark-core" % "1.6.1" % "provided"
Here is the final build.sbt file:
organization := "com.foo"
name := "FooReport"
version := "1.0"
scalaVersion := "2.10.6"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.6.1" % "provided"
,"net.liftweb" % "lift-json_2.10" % "2.6.3"
,"joda-time" % "joda-time" % "2.9.4"
)
assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
case PathList("org", "apache", xs @ _*) => MergeStrategy.last
case PathList("com", "google", xs @ _*) => MergeStrategy.last
case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
case "about.html" => MergeStrategy.rename
case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
case "META-INF/mailcap" => MergeStrategy.last
case "META-INF/mimetypes.default" => MergeStrategy.last
case "plugin.properties" => MergeStrategy.last
case "log4j.properties" => MergeStrategy.last
case x =>
  val oldStrategy = (assemblyMergeStrategy in assembly).value
  oldStrategy(x)
}
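One side effect of marking spark-core as "provided" is that it is no longer on the classpath used by sbt run, so running the job locally through sbt may fail with missing Spark classes. The sbt-assembly documentation suggests a setting like the following for sbt 0.13.x to put provided dependencies back on the run classpath. Treat it as a sketch: it assumes the job is started locally with sbt run rather than spark-submit, and it is not part of the original answer.

```scala
// build.sbt (sketch): make `sbt run` use the full compile classpath,
// which still contains "provided" dependencies such as spark-core.
run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated
```

This only affects sbt run; the jar produced by sbt assembly still leaves spark-core out.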