sbt assembly fails with error: object spark is not a member of package org.apache, even though the spark-core and spark-sql libraries are included

Asked: 2018-10-26 09:56:12

Tags: scala sbt sbt-assembly

I am trying to use sbt assembly on a Spark project. sbt compile and sbt package both work, but when I try sbt assembly I get the following error:

object spark is not a member of package org.apache

I have included the spark-core and spark-sql libraries, and sbt-assembly is in my plugins file. Why does assembly produce these errors?

build.sbt:

name := "redis-record-loader"

scalaVersion := "2.11.8"

val sparkVersion = "2.3.1"
val scalatestVersion = "3.0.3"
val scalatest = "org.scalatest" %% "scalatest" % scalatestVersion

libraryDependencies ++=
  Seq(
    "com.amazonaws" % "aws-java-sdk-s3" % "1.11.347",
    "com.typesafe" % "config" % "1.3.1",
    "net.debasishg" %% "redisclient" % "3.0",
    "org.slf4j" % "slf4j-log4j12" % "1.7.12",
    "org.apache.commons" % "commons-lang3" % "3.0" % "test,it",
    "org.apache.hadoop" % "hadoop-aws" % "2.8.1" % Provided,
    "org.apache.spark" %% "spark-core" % sparkVersion % Provided,
    "org.apache.spark" %% "spark-sql" % sparkVersion % Provided,
    "org.mockito" % "mockito-core" % "2.21.0" % Test,
    scalatest
)

val integrationTestsKey = "it"
val integrationTestLibs = scalatest % integrationTestsKey

lazy val IntegrationTestConfig = config(integrationTestsKey) extend Test

lazy val root = project.in(file("."))
  .configs(IntegrationTestConfig)
  .settings(inConfig(IntegrationTestConfig)(Defaults.testSettings): _*)
  .settings(libraryDependencies ++= Seq(integrationTestLibs))

test in assembly := Seq(
  (test in Test).value,
  (test in IntegrationTestConfig).value
)

assemblyMergeStrategy in assembly := {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case x => MergeStrategy.first
}

plugins.sbt:

logLevel := Level.Warn

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")

Full error message:

[error] /Users/jones8/Work/redis-record-loader/src/it/scala/com/elsevier/bos/RedisRecordLoaderIntegrationSpec.scala:11: object spark is not a member of package org.apache
[error] import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
[error]                   ^
[error] /Users/jones8/Work/redis-record-loader/src/it/scala/com/elsevier/bos/RedisRecordLoaderIntegrationSpec.scala:26: not found: type SparkSession
[error]   implicit val spark: SparkSession = SparkSession.builder
[error]                       ^
[error] /Users/jones8/Work/redis-record-loader/src/it/scala/com/elsevier/bos/RedisRecordLoaderIntegrationSpec.scala:26: not found: value SparkSession
[error]   implicit val spark: SparkSession = SparkSession.builder
[error]                                      ^
[error] /Users/jones8/Work/redis-record-loader/src/it/scala/com/elsevier/bos/RedisRecordLoaderIntegrationSpec.scala:51: not found: type DataFrame
[error]   val testDataframe0: DataFrame = testData0.toDF()
[error]                       ^
[error] /Users/jones8/Work/redis-record-loader/src/it/scala/com/elsevier/bos/RedisRecordLoaderIntegrationSpec.scala:51: value toDF is not a member of Seq[(String, String)]
[error]   val testDataframe0: DataFrame = testData0.toDF()
[error]                                             ^
[error] /Users/jones8/Work/redis-record-loader/src/it/scala/com/elsevier/bos/RedisRecordLoaderIntegrationSpec.scala:52: not found: type DataFrame
[error]   val testDataframe1: DataFrame = testData1.toDF()
[error]                       ^
[error] /Users/jones8/Work/redis-record-loader/src/it/scala/com/elsevier/bos/RedisRecordLoaderIntegrationSpec.scala:52: value toDF is not a member of Seq[(String, String)]
[error]   val testDataframe1: DataFrame = testData1.toDF()
[error]                                             ^
[error] missing or invalid dependency detected while loading class file 'RedisRecordLoader.class'.
[error] Could not access term spark in package org.apache,
[error] because it (or its dependencies) are missing. Check your build definition for
[error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
[error] A full rebuild may help if 'RedisRecordLoader.class' was compiled against an incompatible version of org.apache.
[error] missing or invalid dependency detected while loading class file 'RedisRecordLoader.class'.
[error] Could not access type SparkSession in value org.apache.sql,
[error] because it (or its dependencies) are missing. Check your build definition for
[error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
[error] A full rebuild may help if 'RedisRecordLoader.class' was compiled against an incompatible version of org.apache.sql.
[error] 9 errors found

1 Answer:

Answer 0 (score: 0)

I can't say much beyond "I doubt the AWS SDK and hadoop-aws versions will work together." You need the hadoop-aws version to exactly match the hadoop-common JAR on the classpath (after all, they come from a single project released in sync), and the AWS SDK version it was built against was 1.10.x. The AWS SDK has a habit of (a) breaking its API on point releases, (b) aggressively forcing new jackson versions downstream even when they are incompatible, and (c) causing regressions in the hadoop-aws code.
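A minimal sketch of what aligning those versions might look like in this build.sbt. The specific version numbers are assumptions, not verified values: 1.10.6 is, to my knowledge, the SDK version the hadoop-aws 2.8.x POM declares, and 2.6.7.1 the jackson-databind that Spark 2.3.x ships with; check both against your actual release.

val hadoopVersion = "2.8.1"
val awsSdkVersion = "1.10.6"  // assumption: the SDK version hadoop-aws 2.8.x was built against

libraryDependencies ++= Seq(
  // keep hadoop-aws and hadoop-common in lock-step: they release in sync
  "org.apache.hadoop" % "hadoop-aws"      % hadoopVersion % Provided,
  "org.apache.hadoop" % "hadoop-common"   % hadoopVersion % Provided,
  // pin the SDK instead of letting a newer 1.11.x drift onto the classpath
  "com.amazonaws"     % "aws-java-sdk-s3" % awsSdkVersion % Provided
)

// guard against the SDK forcing an incompatible jackson onto the classpath
// (assumption: 2.6.7.1 is the jackson-databind Spark 2.3.x expects)
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.7.1"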

If you really want to use S3A, your best bet is hadoop-2.9, which ships a shaded 1.11.x version of the SDK.
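A sketch of that route, assuming hadoop-aws 2.9.2 (the exact 2.9.x version is illustrative): it pulls in the shaded com.amazonaws:aws-java-sdk-bundle (a 1.11.x build) transitively, so the separate aws-java-sdk-s3 dependency can be dropped.

libraryDependencies ++= Seq(
  // hadoop-aws 2.9.x depends on the shaded aws-java-sdk-bundle (1.11.x),
  // so the SDK's own jackson no longer leaks onto the application classpath
  "org.apache.hadoop" % "hadoop-aws" % "2.9.2" % Provided
)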