Bad symbolic reference to org.apache.spark.internal encountered in class file 'KafkaUtils.class'

Asked: 2017-11-14 16:31:33

Tags: maven apache-spark intellij-idea apache-kafka

I have been writing code in IntelliJ to run "DirectKafkaWordCount.scala". The source comes from apache/spark on GitHub (spark/examples/src/main/scala/org/apache/spark/examples/streaming/DirectKafkaWordCount.scala) and is shown below:

import org.apache.spark.SparkConf
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka010._

object KafkaWordCountTest {

  def main(args: Array[String]) {

    StreamingExamples.setStreamingLogLevels()

    val brokers = "localhost:2181"
    val topics = "test1"

    // Create context with 2 second batch interval
    val sparkConf = new SparkConf().setAppName("KafkaWordCountTest")
    val ssc = new StreamingContext(sparkConf, Seconds(2))

    // Create direct kafka stream with brokers and topics
    val topicsSet = topics.split(",").toSet
    val kafkaParams = Map[String, String]("metadata.broker.list" -> brokers)
    val messages = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](topicsSet, kafkaParams))

    // Get the lines, split them into words, count the words and print
    val lines = messages.map(_.value)
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(x => (x, 1L)).reduceByKey(_ + _)
    wordCounts.print()

    // Start the computation
    ssc.start()
    ssc.awaitTermination()
  }
}
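
(As a side note, I am not sure the kafkaParams above are even the right shape for the kafka-0-10 API; as far as I can tell from the Spark 2.x "Streaming + Kafka 0.10" integration guide, it takes a Map[String, Object] keyed by bootstrap.servers, which should point at the Kafka broker rather than ZooKeeper, plus explicit deserializers. Roughly like the sketch below, where the broker address and group id are only placeholders:)

import org.apache.kafka.common.serialization.StringDeserializer

// Sketch of the parameter map the 0-10 integration expects
// ("localhost:9092" assumes a default Kafka broker; the group id is a placeholder)
val kafkaParams010 = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",              // Kafka broker list, not the ZooKeeper port
  "key.deserializer" -> classOf[StringDeserializer],    // how to deserialize record keys
  "value.deserializer" -> classOf[StringDeserializer],  // how to deserialize record values
  "group.id" -> "kafka-word-count-test",                // consumer group id
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)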

I also added the dependencies to pom.xml, as shown below:

<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.11.8</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>1.5.2</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>1.5.2</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-twitter_2.11</artifactId>
        <version>1.6.2</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.0.2</version>
    </dependency>
</dependencies>

However, I got an error message:

Error:(43, 41) bad symbolic reference to org.apache.spark.internal encountered in class file 'KafkaUtils.class'. Cannot access term internal in package org.apache.spark. The current classpath may be missing a definition for org.apache.spark.internal, or KafkaUtils.class may have been compiled against a version incompatible with the one found on the current classpath. val messages = KafkaUtils.createDirectStream[String, String](

I unpacked both "spark-1.5.2-bin-hadoop2.4" and "spark-2.2.0-bin-hadoop2.7" in my Desktop folder, and each one runs fine in the terminal with the spark-shell command. I originally installed Spark 2.2.0, and then, in order to run the Kafka direct example, I also installed Spark 1.5.2. Could this be the problem?
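
If the mixed versions in my pom.xml are indeed the cause, I assume every Spark artifact would have to come from one and the same release. A sketch of what I think an aligned dependency block would look like (Spark 2.2.0 assumed throughout, the Twitter dependency left out since this example does not use it; I have not verified that this fixes the error):

<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.11.8</version>
    </dependency>

    <!-- all Spark artifacts pinned to the same release (2.2.0 assumed) -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
</dependencies>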

How can I fix this?

Thanks!

0 Answers:

No answers yet