Spark Streaming + Kafka integration 0.8.2.1

Date: 2019-01-14 15:01:14

Tags: scala apache-spark apache-kafka spark-streaming

I'm having trouble integrating Spark with Kafka. I'm using spark-streaming-kafka-0-8 and compiling with SBT. Here is my code:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming._
    import org.apache.kafka.clients.consumer.ConsumerRecord
    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.streaming.kafka._

    object sparkKafka {

        def main(args: Array[String]) {

            val sparkConf = new SparkConf().setAppName("KafkaWordCount").setMaster("local[*]")

            val ssc = new StreamingContext(sparkConf, Seconds(2))

            // 0-8 receiver-based API: (ssc, zkQuorum, groupId, Map(topic -> numThreads))
            val kafkaStream = KafkaUtils.createStream(ssc,
                "localhost:2181", "spark stream", Map("customer" -> 2))

            kafkaStream.print()
            ssc.start()
            ssc.awaitTermination()
        }
    }

I get this error:

[info] Running sparkKafka
[error] (run-main-0) java.lang.NoClassDefFoundError: scala/Product$class
[error] java.lang.NoClassDefFoundError: scala/Product$class
[error]         at org.apache.spark.SparkConf$DeprecatedConfig.<init>(SparkConf.scala:723)
[error]         at org.apache.spark.SparkConf$.<init>(SparkConf.scala:571)
[error]         at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala)
[error]         at org.apache.spark.SparkConf.set(SparkConf.scala:92)
[error]         at org.apache.spark.SparkConf.set(SparkConf.scala:81)
[error]         at org.apache.spark.SparkConf.setAppName(SparkConf.scala:118)
[error]         at sparkKafka$.main(sparkKafka.scala:15)
[error]         at sparkKafka.main(sparkKafka.scala)
[error]         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error]         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[error]         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error]         at java.lang.reflect.Method.invoke(Method.java:498)
[error] Caused by: java.lang.ClassNotFoundException: scala.Product$class
[error]         at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
[error]         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
[error]         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
[error]         at org.apache.spark.SparkConf$DeprecatedConfig.<init>(SparkConf.scala:723)
[error]         at org.apache.spark.SparkConf$.<init>(SparkConf.scala:571)
[error]         at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala)
[error]         at org.apache.spark.SparkConf.set(SparkConf.scala:92)
[error]         at org.apache.spark.SparkConf.set(SparkConf.scala:81)
[error]         at org.apache.spark.SparkConf.setAppName(SparkConf.scala:118)
[error]         at sparkKafka$.main(sparkKafka.scala:15)
[error]         at sparkKafka.main(sparkKafka.scala)
[error]         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error]         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[error]         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error]         at java.lang.reflect.Method.invoke(Method.java:498)
[error] Nonzero exit code: 1
[error] (Compile / run) Nonzero exit code: 1
[error] Total time: 6 s, completed Jan 14, 2019 2:19:15 PM.

Here is my build.sbt file:

    libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.2.0"
    libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "2.2.0"
    libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "2.2.0"
    libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-8_2.11" % "2.2.0"

How can I get Spark Streaming working with Kafka? I ran into problems with spark-streaming-kafka-0-10 as well....

Thanks

1 answer:

Answer 0 (score: 1)

This is a Scala version problem. NoClassDefFoundError: scala/Product$class is the classic symptom of running on Scala 2.12 while depending on artifacts built for Scala 2.11 (the _2.11 suffix in your dependencies). Make sure your project is set to Scala 2.11 first.
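
For example, here is a minimal build.sbt sketch that pins everything consistently (assuming you stay on Spark 2.2.0; 2.11.12 is just one valid 2.11.x patch release):

    // build.sbt -- a sketch; adjust the Spark version to match your cluster
    scalaVersion := "2.11.12"  // Spark 2.2.0 artifacts are compiled for Scala 2.11

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"      % "2.2.0",
      "org.apache.spark" %% "spark-streaming" % "2.2.0",
      // use ONE Kafka connector, not both; 0-10 matches Kafka brokers 0.10+
      "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.2.0"
    )

Using %% makes sbt append the _2.11 suffix for you, so the dependencies can never drift out of sync with scalaVersion.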

If you are using Kafka 0.10 or later (which is likely if you installed Kafka recently and are only running it locally), then you should not use the kafka-0-8 package.

Do not mix spark-streaming-kafka-0-8 and spark-streaming-kafka-0-10.

So, if you want to use 0-10, as answered previously, the package must be org.apache.spark.streaming.kafka010, not org.apache.spark.streaming.kafka.

Also, note that 0-8 does use Zookeeper (e.g. localhost:2181), while 0-10 does not; it connects to the Kafka brokers directly.
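
Putting it together, here is a minimal sketch of your program ported to the 0-10 direct-stream API. The broker address localhost:9092 is an assumption (the default local broker port), and the topic name customer and group id are carried over from your Zookeeper-based call:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming._
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object sparkKafka {
      def main(args: Array[String]): Unit = {
        val sparkConf = new SparkConf().setAppName("KafkaWordCount").setMaster("local[*]")
        val ssc = new StreamingContext(sparkConf, Seconds(2))

        // 0-10 talks to the Kafka brokers directly, not to Zookeeper
        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "localhost:9092",  // assumption: default local broker
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "spark-stream",
          "auto.offset.reset" -> "latest",
          "enable.auto.commit" -> (false: java.lang.Boolean)
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          PreferConsistent,  // distribute partitions evenly across executors
          Subscribe[String, String](Array("customer"), kafkaParams)
        )

        // print the (key, value) pairs of each micro-batch
        stream.map(record => (record.key, record.value)).print()
        ssc.start()
        ssc.awaitTermination()
      }
    }

Note that the consumer is configured with the standard Kafka consumer property names, and bootstrap.servers points at a broker (port 9092 by default), not at Zookeeper (2181).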