How do I run a Spark Streaming application with Kafka Direct Stream in IntelliJ IDEA?

Date: 2018-12-30 20:19:33

Tags: scala apache-spark apache-kafka spark-streaming

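I am running a Spark Streaming program with Kafka and getting an error. All imports are in place and appear to resolve without any problems.

I wrote only a small amount of code in IntelliJ IDEA and hit the error on the very first run. I am new to Java but come from a C# background, so I cannot make sense of the problem. The ZooKeeper service is up, the Kafka server is up, and a topic named topicA has been created. The producer is also ready to stream data, but I cannot get the code that listens to the queue to run in IntelliJ.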

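import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{HasOffsetRanges, KafkaUtils}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object sparkStreamClass {
  def main(args: Array[String]): Unit = {
    // Consumer settings for the Kafka 0.10 direct stream
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "0",
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )
    // Assumption: fall back to a local master when launched from the IDE
    // (a master passed by spark-submit is left untouched).
    val conf = new SparkConf()
      .setAppName("Simple Streaming Application")
      .setIfMissing("spark.master", "local[*]")
    val ssc = new StreamingContext(conf, Seconds(5))
    val topics = Array("topicA")
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](topics, kafkaParams)
    )

    stream.foreachRDD { rdd =>
      // Get the offset ranges in the RDD
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      for (o <- offsetRanges) {
        println(s"${o.topic} ${o.partition} offsets: ${o.fromOffset} to ${o.untilOffset}")
      }
    }

    ssc.start()

    // Block here so the job keeps printing topic details every 5 seconds
    // until you stop it; calling ssc.stop right after start would end the stream at once.
    ssc.awaitTermination()
  }
}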

The exception raised is:

Exception in thread "main" java.lang.VerifyError: class scala.collection.mutable.WrappedArray overrides final method toBuffer.()Lscala/collection/mutable/Buffer;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:75)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:70)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:57)
at sparkStreamClass$.main(sparkStreamClass.scala:20)
at sparkStreamClass.main(sparkStreamClass.scala)

Here is my pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.learnStreaming</groupId>
    <artifactId>sparkProjectArtifact</artifactId>
    <version>1.0-SNAPSHOT</version>

<dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.3.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.3.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.3.1</version>
        <scope>provided</scope>
    </dependency>

</dependencies>
</project>

3 answers:

Answer 0 (score: 1)

I modified the pom.xml as follows and it worked for me!

<properties>
    <spark.version>2.1.0</spark.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
</dependencies>

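Answer 1 (score: 0)

I ran into this problem too; it was caused by using the wrong Scala version. For example, I had declared Scala 2.11.8 in my Maven build plugin, but the Scala SDK I was actually using was 2.13. So check the Scala version in your pom build section against the Scala version configured in IntelliJ IDEA.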
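
As a rough way to see which Scala version the program actually runs on, the sketch below prints the runtime Scala version and compares its binary version with the `_2.11` suffix of the Spark artifacts. The object name `ScalaVersionCheck` and both helper functions are hypothetical and only serve as illustration; `scala.util.Properties.versionNumberString` is part of the standard library. On a badly mixed classpath the check may itself fail to load, so treat it as a diagnostic aid rather than a guarantee.

```scala
import scala.util.Properties

object ScalaVersionCheck {
  // Hypothetical helper: reduces a full version such as "2.11.8" to its binary version "2.11".
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  // Hypothetical helper: returns the text after the last underscore of an artifact id,
  // which for Spark artifacts is the Scala binary version they were built for.
  def artifactScalaVersion(artifactId: String): Option[String] = {
    val idx = artifactId.lastIndexOf('_')
    if (idx >= 0) Some(artifactId.substring(idx + 1)) else None
  }

  def main(args: Array[String]): Unit = {
    val runtime = binaryVersion(Properties.versionNumberString)
    val expected = artifactScalaVersion("spark-core_2.11")
    println(s"Scala at runtime: ${Properties.versionNumberString} (binary $runtime)")
    // Where the scala-library classes were loaded from, if the class loader reports it.
    val source = Option(classOf[Option[_]].getProtectionDomain.getCodeSource).map(_.getLocation)
    println(s"scala-library loaded from: ${source.getOrElse("unknown")}")
    if (!expected.contains(runtime))
      println(s"Mismatch: artifacts expect Scala ${expected.getOrElse("?")}, runtime is $runtime")
  }
}
```

If the two versions differ, either the Scala SDK attached to the IntelliJ module or the artifact suffixes in the pom need to change so that both agree.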

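Answer 2 (score: 0)

We hit this problem because of a version incompatibility between the Scala and Spark libraries.

I got a similar error with the following configuration: spark-core_2.11, spark-sql_2.11, Scala 2.13

I installed Scala 2.11 on my system and added it to the program module as a new Scala SDK library, and it worked.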
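
A mismatch like this can sometimes be spotted by listing the Scala binary suffixes that appear in the jar names on the classpath. The following is a minimal sketch under that assumption; the object name `ClasspathScalaVersions` and the helper `suffixVersions` are hypothetical, and the pattern only recognises Scala 2 style suffixes such as `_2.11` followed by a version.

```scala
object ClasspathScalaVersions {
  // Matches the Scala binary suffix in jar names such as spark-core_2.11-2.3.1.jar.
  private val Suffix = """_(2\.\d+)-""".r

  // Hypothetical helper: collects the distinct Scala binary versions found in the jar names.
  def suffixVersions(entries: Seq[String]): Set[String] =
    entries.flatMap(e => Suffix.findFirstMatchIn(e).map(_.group(1))).toSet

  def main(args: Array[String]): Unit = {
    // The JVM exposes the effective classpath through the java.class.path system property.
    val entries = sys.props.getOrElse("java.class.path", "")
      .split(java.io.File.pathSeparator).toSeq
    val versions = suffixVersions(entries)
    println(s"Scala binary versions among dependency jars: ${versions.mkString(", ")}")
    // More than one scala-library entry, or one that disagrees with the suffixes, hints at a mix.
    entries.filter(_.contains("scala-library"))
      .foreach(e => println(s"scala-library entry: $e"))
  }
}
```

Running this from the same IntelliJ run configuration as the streaming job shows the classpath that job would see; every suffix and the scala-library entry should point to the same Scala binary version, 2.11 in the setup described here.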