I am trying to develop a Spark application with Kafka, but I get a NoClassDefFoundError

Asked: 2018-10-23 05:56:35

Tags: apache-spark apache-kafka

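I am learning Spark Streaming with Kafka (using the Scala API), and I am trying to develop a simple application that connects a Kafka application to a streaming application. Compiling with sbt works fine, but whenever I run spark-submit I get the same error: java.lang.NoClassDefFoundError.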
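To narrow down which class is actually missing at runtime, a small check like the one below could be submitted with the same spark-submit command. This is only a sketch: `ClasspathCheck` and `isLoadable` are hypothetical names, and the candidate class names are simply taken from the imports in the source file, not from the error message.

```scala
object ClasspathCheck {
  // Returns true if the class can be located on the current classpath.
  // initialize = false avoids running static initializers during the check.
  def isLoadable(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit = {
    // Assumption: these are the classes most likely to be missing,
    // since they come from the Kafka integration and the Kafka client.
    val candidates = Seq(
      "org.apache.spark.streaming.kafka010.KafkaUtils",
      "org.apache.kafka.common.serialization.StringDeserializer"
    )
    candidates.foreach { name =>
      val status = if (isLoadable(name)) "found" else "MISSING"
      println(s"$status: $name")
    }
  }
}
```

Any class reported as MISSING is not on the classpath that spark-submit gives the driver.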

The Spark version I have installed is 2.3.0.

My build.sbt file is:

name := "Linking" 
mainClass := Some("Main")
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0" % "provided" 
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.3.0" % "provided"
libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "2.2.0"

and the main source file is:

import org.apache.spark._
import org.apache.spark.streaming._

import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe


object KafkaStreaming {
  def main(args: Array[String]): Unit = {
    println("Hello, world!")

    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(1))
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092,anotherhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "use_a_separate_group_id_for_each_stream",
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val topics = Array("topicA", "topicB")
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](topics, kafkaParams)
    )

    stream.map(record => (record.key, record.value))

  }
}

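I did some research and found two suggested solutions: fixing a mismatch between library versions, and passing --jars spark-streaming-kafka-0-10-assembly to spark-submit. Neither of them worked.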
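For reference, the first suggestion (aligning library versions) might look like the build.sbt sketch below. It assumes the Kafka integration should match the installed Spark 2.3.0 rather than 2.2.0, and that the entry point is the `KafkaStreaming` object rather than `Main`. The `sparkVersion` value is a name introduced here for illustration. It is not a confirmed fix.

```scala
// build.sbt (sketch): every Spark artifact pinned to the same version
name := "Linking"
scalaVersion := "2.11.8"

// Assumption: the object containing main() is KafkaStreaming, as in the source file
mainClass := Some("KafkaStreaming")

// Assumption: match the locally installed Spark release
val sparkVersion = "2.3.0"

libraryDependencies ++= Seq(
  // Supplied by the Spark installation at runtime, hence "provided"
  "org.apache.spark" %% "spark-core"      % sparkVersion % "provided",
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
  // Not part of the Spark distribution, so it is not marked "provided";
  // %% appends the Scala binary suffix (_2.11) automatically
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion
)
```

Even with matching versions, `sbt package` builds a jar containing only the project's own classes, so the Kafka integration and its transitive dependencies still have to reach the runtime classpath. One way is to pass `--packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.3.0` to spark-submit. Another is to build an assembly (fat) jar.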

Thanks in advance.

0 answers:

No answers yet.