How to avoid conflicting dependencies when using Spark Streaming with Kafka?

Date: 2019-04-03 18:19:34

Tags: apache-kafka spark-streaming

I am trying to put together a working example of Kafka with Spark Streaming, and I ran into a problem when running the job.

This is the exception:

[error] Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.9.8

This is the build.sbt:

name := "SparkJobs"

version := "1.0"

scalaVersion := "2.11.6"

val sparkVersion = "2.4.1"

val flinkVersion = "1.7.2"

resolvers ++= Seq(
  "Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/",
  "apache snapshots" at "http://repository.apache.org/snapshots/",
  "confluent.io" at "http://packages.confluent.io/maven/",
  "Maven central" at "http://repo1.maven.org/maven2/"
)

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
  "org.apache.spark" %% "spark-hive" % sparkVersion,
  // "org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
  "org.apache.kafka" %% "kafka-streams-scala" % "2.2.0"
  // , "io.confluent" % "kafka-streams-avro-serde" % "5.2.1"
)

// excludeDependencies ++= Seq(
//   // commons-logging is replaced by jcl-over-slf4j
//   ExclusionRule("jackson-module-scala", "jackson-module-scala")
// )

Here is the code.

Running an sbt dependency tree, I can see that spark-core_2.11-2.4.1.jar depends on jackson-databind-2.6.7.1, and the tree tells me it is evicted by version 2.9.8, which points to a conflict between libraries. But spark-core_2.11-2.4.1.jar is not the only one involved: kafka-streams-scala_2.11:2.2.0 pulls in jackson-databind-2.9.8. So I don't know which library has to evict jackson-databind-2.9.8. spark-core? kafka-streams-scala? Or both?

How can I avoid the Jackson library version 2.9.8 so that this job starts up and runs?

I assume I need jackson-databind version 2.6.7...
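One way to express that assumption in sbt (a sketch, not taken from the original post) is to pin Jackson with `dependencyOverrides`, which forces the resolved version without touching the declared dependencies. The version numbers below are assumptions based on what spark-core 2.4.1 ships with:

```scala
// Hypothetical sketch: force a single Jackson version across the build so
// Spark's expectation (2.6.7.x) wins over transitive 2.9.8 pulls.
dependencyOverrides ++= Seq(
  "com.fasterxml.jackson.core"    %  "jackson-databind"     % "2.6.7.1",
  "com.fasterxml.jackson.core"    %  "jackson-core"         % "2.6.7",
  "com.fasterxml.jackson.module"  %% "jackson-module-scala" % "2.6.7.1"
)
```

Note that this only papers over the conflict at resolution time; if a library genuinely requires 2.9.x APIs at runtime, it can still fail.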

Updated with the suggestion. Still not working.

I have removed the kafka-streams-scala dependency, which pulls in Jackson 2.9.8, and am trying with this build.sbt:

name := "SparkJobs"

version := "1.0"

scalaVersion := "2.11.6"

val sparkVersion = "2.4.1"

val flinkVersion = "1.7.2"

val kafkaStreamScala = "2.2.0"

resolvers ++= Seq(
  "Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/",
  "apache snapshots" at "http://repository.apache.org/snapshots/",
  "confluent.io" at "http://packages.confluent.io/maven/",
  "Maven central" at "http://repo1.maven.org/maven2/"
)


libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
  "org.apache.spark" %% "spark-hive" % sparkVersion
)

But then I got a new exception.

Update 2

Never mind, I understand the second exception now: I had forgotten to call awaitTermination.

1 Answer:

Answer 0 (score: 1):

Kafka Streams includes Jackson 2.9.8

but it is not needed when using Spark Streaming's Kafka integration, so you should actually remove it.
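If for some reason kafka-streams-scala had to stay in the build, one alternative (a sketch, not part of the answer) would be to exclude the Jackson artifact it drags in, so that Spark's own Jackson 2.6.x is the only one resolved:

```scala
// Hypothetical sketch: keep kafka-streams-scala but exclude its
// transitive jackson-databind 2.9.8 so Spark's version wins.
libraryDependencies += ("org.apache.kafka" %% "kafka-streams-scala" % "2.2.0")
  .exclude("com.fasterxml.jackson.core", "jackson-databind")
```

But since the question only needs spark-streaming-kafka-0-10, removing the dependency entirely, as the answer says, is the cleaner fix.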

Similarly, kafka-streams-avro-serde is not something you want to use with Spark; instead, you may find AbsaOSS/ABRiS useful.