spark-core 1.6.1 & lift-json 2.6.3 java.lang.NoClassDefFoundError

Date: 2016-07-12 17:14:18

Tags: scala apache-spark sbt lift-json

I have a Spark application with the sbt build file shown below. It works fine on my local machine, but when I submit it to EMR running Spark 1.6.1, it fails with the following error:

java.lang.NoClassDefFoundError: net/liftweb/json/JsonAST$JValue

I am building the jar with "sbt package".

build.sbt:

organization := "com.foo"
name := "FooReport"

version := "1.0"

scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1"
  ,"net.liftweb" % "lift-json_2.10" % "2.6.3"
  ,"joda-time" % "joda-time" % "2.9.4"
)

Do you have any idea what might be going on?

1 Answer:

Answer 0 (score: 0):

I've found a solution, and it works!

The problem was that sbt package does not include the dependency jars in the output jar. To solve this I first tried sbt-assembly, but I ran into a lot of "deduplicate" errors when running it.

Eventually I came across this blog post, which explains everything clearly: http://queirozf.com/entries/creating-scala-fat-jars-for-spark-on-sbt-with-sbt-assembly-plugin

  

In order to submit Spark jobs to a Spark cluster (via spark-submit), you need to include all the dependencies (other than Spark itself) in the jar, otherwise you won't be able to use them in your job.

  1. Create an "assembly.sbt" file under the /project folder.
  2. Add this line to it: addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
  3. Then paste the assemblyMergeStrategy block shown in the final build.sbt below into your build.sbt.

  4. Run sbt assembly.
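
    Optionally, sbt-assembly also lets you pin the output jar's file name, which makes the spark-submit step at the end easier to script. A minimal sketch (the jar name below is only an example, pick whatever you like):

    // optional sbt-assembly setting in build.sbt; the file name is just an example
    assemblyJarName in assembly := "fooreport-assembly-1.0.jar"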

    Now you have one big fat jar containing all of your dependencies. Depending on the libraries you use, it can be hundreds of MB. In my case I am running on AWS EMR, which already has Spark 1.6.1 installed, so to exclude the spark-core lib from the jar you can use the "provided" keyword:

    "org.apache.spark" %% "spark-core" % "1.6.1" % "provided"
    

    Here is the final build.sbt file:

    organization := "com.foo"
    name := "FooReport"
    
    version := "1.0"
    
    scalaVersion := "2.10.6"
    
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.6.1" % "provided"
      ,"net.liftweb" % "lift-json_2.10" % "2.6.3"
      ,"joda-time" % "joda-time" % "2.9.4"
    )
    
    assemblyMergeStrategy in assembly := {
      case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
      case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
      case PathList("org", "apache", xs @ _*) => MergeStrategy.last
      case PathList("com", "google", xs @ _*) => MergeStrategy.last
      case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
      case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
      case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
      case "about.html" => MergeStrategy.rename
      case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
      case "META-INF/mailcap" => MergeStrategy.last
      case "META-INF/mimetypes.default" => MergeStrategy.last
      case "plugin.properties" => MergeStrategy.last
      case "log4j.properties" => MergeStrategy.last
      case x =>
        val oldStrategy = (assemblyMergeStrategy in assembly).value
        oldStrategy(x)
    }
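
    With the assembly built, submit the fat jar (instead of the plain sbt package jar) to the cluster with spark-submit. The main class and jar path below are placeholders for whatever your own build produces:

    # on EMR the master is typically yarn; the class name and jar path here are examples
    spark-submit \
      --class com.foo.FooReport \
      --master yarn \
      target/scala-2.10/fooreport-assembly-1.0.jar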