I am trying to run a Spark application that uses Twitter streaming, but I keep running into dependency problems. When I use the org.apache.bahir spark-streaming-twitter dependency, I get this error:
module not found: org.apache.bahir#spark-streaming-twitter;2.0.0
Here is the corresponding build.sbt file:
version := "0.1"
scalaVersion := "2.11.12"
libraryDependencies ++= Seq(
"org.apache.bahir" %% "spark-streaming-twitter" % "2.0.0",
"org.apache.spark" %% "spark-core" % "2.3.0",
"org.apache.spark" % "spark-streaming_2.11" % "2.3.0",
"com.typesafe" % "config" % "1.3.0",
"org.twitter4j" % "twitter4j-stream" % "4.0.6"
)
But when I use the older streaming dependency instead, I get a ClassNotFoundException: org.apache.spark.Logging error.
Here is the corresponding build.sbt:
version := "0.1"
scalaVersion := "2.11.12"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "2.3.0",
"org.apache.spark" % "spark-streaming_2.11" % "2.3.0",
"com.typesafe" % "config" % "1.3.0",
"org.twitter4j" % "twitter4j-stream" % "4.0.6",
"org.apache.spark" %% "spark-streaming-twitter" % "1.6.3"
)
To build my application, I run the sbt clean and package commands.
So which dependencies should I use, and how should I configure them to run my application?
Answer 0 (score: 0)
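The Twitter backend was removed from Spark in version 2.0, which is why the 1.6.3 artifact, built against Spark 1.x, fails on Spark 2.3.0. The bahir version you declared does not match your Spark version either. Finally, the bahir Twitter connector already pulls in the twitter4j-stream dependency (4.0.4 at the time of writing), so you do not need to declare it yourself. Use: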
val sparkVersion = "2.3.0"
libraryDependencies ++= Seq(
"org.apache.bahir" %% "spark-streaming-twitter" % sparkVersion,
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion
)
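For reference, the fragment above could be folded into the build.sbt from the question roughly as follows. This is only a sketch under a few assumptions: the project name is hypothetical, the com.typesafe config dependency is carried over from the question unchanged, and a Bahir release whose version number equals sparkVersion is assumed to be published for Scala 2.11.

```scala
// build.sbt: a sketch, assuming Spark 2.3.0 on Scala 2.11
name := "twitter-streaming-app" // hypothetical project name, not from the question

version := "0.1"

scalaVersion := "2.11.12"

// Pin the Spark version in one place; the Bahir connector reuses it,
// which assumes Bahir publishes a release with the same version number.
val sparkVersion = "2.3.0"

libraryDependencies ++= Seq(
  // %% appends the Scala binary version (_2.11) to each artifact name
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  // Twitter connector now lives in Apache Bahir; it brings in
  // twitter4j-stream transitively, so that line is dropped
  "org.apache.bahir" %% "spark-streaming-twitter" % sparkVersion,
  // plain Java library from the question, hence a single %
  "com.typesafe" % "config" % "1.3.0"
)
```

Using %% for every Spark artifact, rather than hard-coding spark-streaming_2.11, keeps the Scala suffix in step with scalaVersion. Note that sbt package does not bundle dependencies into the jar, so the connector still has to be available at submit time, for example through spark-submit's --packages option.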