When I call df.show() to print the rows of the DataFrame, I get this error:
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.8.9
at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:747)
at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
This is how I create df:
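import org.apache.spark.sql.SparkSession

object Test extends App {
  // Local Spark session configured for the elasticsearch-hadoop connector
  val spark = SparkSession.builder()
    .config("es.nodes", "XXX.XX.XX.XX")
    .config("es.port", "9200")
    .config("es.nodes.wan.only", "false")
    .config("es.resource", "myIndex")
    .appName("Test")
    .master("local[*]")
    .getOrCreate()

  // Read the index as a DataFrame through the Elasticsearch Spark SQL source
  val df_source = spark
    .read.format("org.elasticsearch.spark.sql")
    .option("pushdown", "true")
    .load("myIndex")

  df_source.show(5)
}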
I do not use the Jackson library directly in my build.sbt.
Update
import sbtassembly.AssemblyPlugin.autoImport.assemblyOption
name := "test"
lazy val spark = "org.apache.spark"
lazy val typesafe = "com.typesafe.akka"
val sparkVersion = "2.2.0"
val elasticSparkVersion = "6.2.4"
val scalaLoggingVersion = "3.7.2"
val slf4jVersion = "1.7.5"
val kafkaVersion = "0.8.0.0"
val akkaVersion = "2.5.9"
val playVersion = "2.6.8"
val sprayVersion = "1.3.2"
val opRabbitVersion = "2.1.0"
val orientdbVersion = "2.2.34"
val livyVersion = "0.5.0-incubating"
val scalaHttpVersion = "2.3.0"
val scoptVersion = "3.3.0"
resolvers ++= Seq(
  // repo for op-rabbit client
  "SpinGo OSS" at "http://spingo-oss.s3.amazonaws.com/repositories/releases",
  "SparkPackagesRepo" at "http://dl.bintray.com/spark-packages/maven",
  "cloudera.repo" at "https://repository.cloudera.com/artifactory/cloudera-repos"
)
lazy val commonSettings = Seq(
  organization := "org.test",
  version := "0.1",
  scalaVersion := "2.11.8",
  assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = true),
  assemblyMergeStrategy in assembly := {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case PathList("reference.conf") => MergeStrategy.concat
    case x => MergeStrategy.first
  }
)
val sparkSQL = spark %% "spark-sql" % sparkVersion
val sparkGraphx = spark %% "spark-graphx" % sparkVersion
val sparkMLLib = spark %% "spark-mllib" % sparkVersion
val elasticSpark = "org.elasticsearch" % "elasticsearch-hadoop" % elasticSparkVersion
val livyAPI = "org.apache.livy" % "livy-api" % livyVersion
val livyScalaAPI = "org.apache.livy" %% "livy-scala-api" % livyVersion
val livyClientHttp = "org.apache.livy" % "livy-client-http" % livyVersion
val spingoCore = "com.spingo" %% "op-rabbit-core" % opRabbitVersion
val spingoPlayJson = "com.spingo" %% "op-rabbit-play-json" % opRabbitVersion
val spingoJson4s = "com.spingo" %% "op-rabbit-json4s" % opRabbitVersion
val spingoAirbrake = "com.spingo" %% "op-rabbit-airbrake" % opRabbitVersion
val spingoAkkaStream = "com.spingo" %% "op-rabbit-akka-stream" % opRabbitVersion
val orientDB = "com.orientechnologies" % "orientdb-graphdb" % orientdbVersion excludeAll(
  ExclusionRule("commons-beanutils", "commons-beanutils-core"),
  ExclusionRule("commons-collections", "commons-collections"),
  ExclusionRule("commons-logging", "commons-logging"),
  ExclusionRule("stax", "stax-api")
)
val scopt = "com.github.scopt" %% "scopt" % scoptVersion
val spray = "io.spray" %% "spray-json" % sprayVersion
val scalaHttp = "org.scalaj" %% "scalaj-http" % scalaHttpVersion
lazy val graph = (project in file("./app"))
  .settings(
    commonSettings,
    libraryDependencies ++= Seq(sparkSQL, sparkGraphx, sparkMLLib, orientDB,
      livyAPI, livyScalaAPI, livyClientHttp, scopt,
      spingoCore, scalaHttp,
      spray, spingoCore, spingoPlayJson, spingoJson4s,
      spingoAirbrake, spingoAkkaStream, elasticSpark)
  )
dependencyOverrides += "com.typesafe.akka" %% "akka-stream" % akkaVersion
I tried adding the Jackson libraries that Spark uses, but that did not solve the problem:
val jacksonCore = "com.fasterxml.jackson.core" % "jackson-core" % "2.6.5"
val jacksonDatabind = "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.5"
val jacksonAnnotations = "com.fasterxml.jackson.core" %% "jackson-annotations" % "2.6.5"
val jacksonScala = "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.5"
In the end I did the following (for some reason, the last two dependencies could not be resolved):
dependencyOverrides += "com.typesafe.akka" %% "akka-stream" % akkaVersion
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % "2.8.9"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.8.9"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-annotations" % "2.8.9"
dependencyOverrides += "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.8.9"
dependencyOverrides += "com.fasterxml.jackson.module" % "jackson-module-paranamer" % "2.8.9"
But now I get this error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/DefaultScalaModule$
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.module.scala.DefaultScalaModule$
Answer 0 (score: 2)
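This build.sbt looks quite problematic, because it mixes many libraries whose Jackson versions (and other transitive dependencies) are probably not aligned.
For example, op-rabbit-json4s expects Jackson at 3.5.3 (this looks like a json4s version number rather than a Jackson one), while I think orientdb-graphdb expects a third version of Jackson (2.2.3).
In short, you need to align your dependencies as closely as possible so that there are no conflicts.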
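One way to align the versions, sketched under assumptions, is to pin every Jackson artifact to the single version Spark was built against. The fragment below assumes that version is 2.6.5 for Spark 2.2.0 (worth confirming in the dependency tree) and that the other libraries in the build tolerate the older Jackson; neither is guaranteed. The name jacksonOverrides is hypothetical. Note that jackson-core, jackson-databind, jackson-annotations and jackson-module-paranamer are plain Java artifacts and take a single %, while jackson-module-scala is cross-built per Scala version and takes %%.

```scala
// build.sbt fragment (a sketch, not a verified fix).
// Assumption: Spark 2.2.0 expects Jackson 2.6.5; confirm this in the dependency tree.
val jacksonVersion = "2.6.5"

// Hypothetical helper: all Jackson overrides grouped so they can be attached to a project.
lazy val jacksonOverrides = Seq(
  // Java artifacts: single %, no Scala suffix is appended to the artifact name.
  dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % jacksonVersion,
  dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % jacksonVersion,
  dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-annotations" % jacksonVersion,
  dependencyOverrides += "com.fasterxml.jackson.module" % "jackson-module-paranamer" % jacksonVersion,
  // Scala artifact: %% appends the Scala binary version (_2.11 for scalaVersion 2.11.8).
  dependencyOverrides += "com.fasterxml.jackson.module" %% "jackson-module-scala" % jacksonVersion
)

// Attach the overrides to the project that actually declares the dependencies, e.g.
//   lazy val graph = (project in file("./app"))
//     .settings(commonSettings, jacksonOverrides, libraryDependencies ++= Seq(...))
```

Settings written at the top level of build.sbt apply to the root project, so overrides declared there may not reach the graph subproject; placing them in its .settings(...) avoids that ambiguity.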
Here is a useful plugin for inspecting the dependencies: https://github.com/jrudolph/sbt-dependency-graph
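A minimal sketch of how the plugin could be wired in and used to find out which library pulls in Jackson 2.8.9. The plugin version 0.9.2 is an assumption; pick whichever release matches your sbt version. The project id graph comes from the build.sbt in the question.

```scala
// project/plugins.sbt
// Assumption: 0.9.2 is a plugin release compatible with the sbt version in use.
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.9.2")

// Then, from a terminal in the build's root directory:
//
//   sbt graph/dependencyTree
//     prints the full transitive dependency tree of the graph project
//
//   sbt "graph/whatDependsOn com.fasterxml.jackson.core jackson-databind 2.8.9"
//     prints the reverse tree: which dependencies request that Jackson version
//
//   sbt graph/evicted
//     built-in sbt task listing version conflicts and which version won
```

Once the library that requests the newer Jackson is identified, it can be excluded, upgraded, or overridden so that a single Jackson version ends up on the classpath.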