NoSuchMethodError with Spark Streaming 2.2.0 and Kafka 0.8

Time: 2018-01-25 15:45:17

Tags: apache-spark apache-kafka spark-streaming

I am trying to use Spark Streaming 2.2.0 with Kafka 0.8.

I followed this documentation: https://spark.apache.org/docs/latest/streaming-kafka-0-8-integration.html

But I get the following error:

[WARN ] 2018-01-25 14:54:01,332 org.apache.spark.scheduler.TaskSetManager - Lost task 3.0 in stage 0.0 (TID 3, ip-10-0-155-42.eu-west-1.compute.internal, executor 8): java.lang.NoSuchMethodError: net.jpountz.util.Utils.checkRange([BII)V
    at org.apache.kafka.common.message.KafkaLZ4BlockInputStream.read(KafkaLZ4BlockInputStream.java:176)

Looking at the dependency graph, it seems that spark-core pulls in lz4 1.3.0:

org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.0
  org.apache.spark:spark-streaming_2.11:2.2.0
    org.apache.spark:spark-core_2.11:2.2.0
      net.jpountz.lz4:lz4:1.3.0

Kafka, however, needs lz4:1.2.0.
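
To confirm which lz4 jar actually wins at runtime, a small reflection check can help. The following is only a minimal sketch: the class `Lz4ClasspathCheck` and its helpers `describe` and `hasMethod` are hypothetical names made up for illustration, and the two `net.jpountz` class names are simply copied from the stack traces in this post. It prints where each class is loaded from and whether `Utils.checkRange(byte[], int, int)` (the method behind the `([BII)V` descriptor) is present.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Spark, Kafka, or lz4.
class Lz4ClasspathCheck {

    // Loads a class without running its static initializers.
    private static Class<?> load(String className) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        return Class.forName(className, false, loader);
    }

    // Returns "<class> -> <jar location>", or "<class> -> NOT FOUND" if it cannot be loaded.
    public static String describe(String className) {
        try {
            CodeSource src = load(className).getProtectionDomain().getCodeSource();
            return className + " -> " + (src != null ? src.getLocation() : "bootstrap/unknown");
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> NOT FOUND";
        }
    }

    // True if the class declares a method with this name and parameter types.
    public static boolean hasMethod(String className, String name, Class<?>... params) {
        try {
            load(className).getDeclaredMethod(name, params);
            return true;
        } catch (ReflectiveOperationException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the two stack traces in this post.
        String utils = "net.jpountz.util.Utils";
        String safeUtils = "net.jpountz.util.SafeUtils";

        System.out.println(describe(utils));
        System.out.println(describe(safeUtils));

        // ([BII)V is the JVM descriptor for a void method taking (byte[], int, int).
        System.out.println("Utils.checkRange(byte[], int, int) present: "
                + hasMethod(utils, "checkRange", byte[].class, int.class, int.class));
    }
}
```

For the result to be meaningful, this has to run with the same classpath as the failing code, for example by calling `describe` from inside a task on an executor rather than only on the driver.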

[Update] If I force the lz4 version to 1.2.0, I get another error:

Caused by: java.lang.NoClassDefFoundError: net/jpountz/util/SafeUtils
    at org.apache.spark.io.LZ4BlockInputStream.read(LZ4BlockInputStream.java:124)
    at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2606)
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2622)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3099)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.<init>(JavaSerializer.scala:63)
    at org.apache.spark.serializer.JavaDeserializationStream.<init>(JavaSerializer.scala:63)
    at org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:122)
    at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:291)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:226)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
    at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
    at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
    at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
    at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
    at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:81)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)

How can I solve this?

0 answers:

No answers