I'm trying to put together an example of how Kafka and Spark Streaming work, and I ran into a problem while running it.
This is the exception:
[error] Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.9.8
This is the build.sbt:
name := "SparkJobs"
version := "1.0"
scalaVersion := "2.11.6"
val sparkVersion = "2.4.1"
val flinkVersion = "1.7.2"
resolvers ++= Seq(
"Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/",
"apache snapshots" at "http://repository.apache.org/snapshots/",
"confluent.io" at "http://packages.confluent.io/maven/",
"Maven central" at "http://repo1.maven.org/maven2/"
)
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
"org.apache.spark" %% "spark-hive" % sparkVersion
// ,"org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion
, "org.apache.kafka" %% "kafka-streams-scala" % "2.2.0"
// , "io.confluent" % "kafka-streams-avro-serde" % "5.2.1"
)
//excludeDependencies ++= Seq(
//  // commons-logging is replaced by jcl-over-slf4j
//  ExclusionRule("jackson-module-scala", "jackson-module-scala")
//)
Here is the code.
Looking at the sbt dependency tree, I can see that spark-core_2.11-2.4.1.jar brings in jackson-databind 2.6.7.1 and that this version is evicted by 2.9.8, which indicates a conflict between libraries. But spark-core_2.11-2.4.1.jar is not the only one involved: kafka-streams-scala_2.11:2.2.0 uses jackson-databind 2.9.8. So I don't know from which library jackson-databind 2.9.8 has to be evicted: spark-core, kafka-streams-scala, or both?
How can I keep Jackson 2.9.8 out of the build so that this job starts and runs?
I assume I need the jackson-databind 2.6.7 version...
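One way to see which versions win and where they come from is sbt's built-in evicted task (sbt evicted). If both dependencies have to stay, a common workaround, sketched here as an assumption rather than something from the original post, is to pin Jackson with dependencyOverrides so everything resolves to the 2.6.x line that Spark 2.4.1 was built against (verify the exact versions against your own tree):

// build.sbt: force the Jackson version Spark 2.4.1 expects
dependencyOverrides ++= Seq(
  "com.fasterxml.jackson.core" % "jackson-core" % "2.6.7",
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.7.1",
  "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.7.1"
)

Keep in mind that Kafka Streams 2.2.0 may not behave correctly when forced down to Jackson 2.6.x, which is why the answer below suggests dropping it instead.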
Updated with the suggestion; still not working.
I removed the kafka-streams-scala dependency, which tries to pull in Jackson 2.9.8, and used this build.sbt:
name := "SparkJobs"
version := "1.0"
scalaVersion := "2.11.6"
val sparkVersion = "2.4.1"
val flinkVersion = "1.7.2"
val kafkaStreamScala = "2.2.0"
resolvers ++= Seq(
"Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/",
"apache snapshots" at "http://repository.apache.org/snapshots/",
"confluent.io" at "http://packages.confluent.io/maven/",
"Maven central" at "http://repo1.maven.org/maven2/"
)
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion ,
"org.apache.spark" %% "spark-sql" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
"org.apache.spark" %% "spark-hive" % sparkVersion
)
But then I got a new exception.
UPDATE 2
Never mind: I understand the second exception now; I had forgotten to call awaitTermination().
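For reference, a minimal consumer for the dependencies above looks roughly like the sketch below (my own illustration, not the code linked from the question; the topic name, bootstrap server, and group id are placeholders). The important part is the last two lines: without ssc.awaitTermination() the driver returns right after ssc.start() and no batches are ever processed.

import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object KafkaSparkStreamingExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkJobs").setMaster("local[*]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Placeholder connection settings; adjust to your environment.
    val kafkaParams = Map[String, Object](
      ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG -> "localhost:9092",
      ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
      ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
      ConsumerConfig.GROUP_ID_CONFIG -> "spark-jobs-example",
      ConsumerConfig.AUTO_OFFSET_RESET_CONFIG -> "earliest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("test-topic"), kafkaParams)
    )

    // Print the message values of each micro-batch.
    stream.map(_.value).print()

    ssc.start()
    // Block the driver until the streaming job is stopped; forgetting this
    // makes the application exit immediately.
    ssc.awaitTermination()
  }
}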
Answer (score: 1)
Kafka Streams includes Jackson 2.9.8, but you don't need Kafka Streams when using Spark Streaming's Kafka integration, so you should actually remove it.
Similarly, kafka-streams-avro-serde is not what you want to use with Spark; instead, you may find AbraOSS/ABRiS useful.
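If, despite this, Kafka Streams has to stay in the same project, one alternative (a sketch assuming sbt's standard exclusion mechanism, not something from this answer) is to keep the dependency but exclude its Jackson artifacts so that Spark's 2.6.x versions win at resolution time:

// build.sbt: keep kafka-streams-scala but drop the Jackson jars it pulls in
libraryDependencies += ("org.apache.kafka" %% "kafka-streams-scala" % "2.2.0")
  .excludeAll(
    ExclusionRule(organization = "com.fasterxml.jackson.core"),
    ExclusionRule(organization = "com.fasterxml.jackson.module"),
    ExclusionRule(organization = "com.fasterxml.jackson.datatype")
  )

Even then, removing it entirely, as suggested above, is the simpler and safer fix, since Kafka Streams itself may not run correctly against an older Jackson.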