DSE Spark Streaming + Kafka NoSuchMethodError

Asked: 2015-07-25 05:55:44

Tags: apache-kafka kafka-consumer-api cassandra-2.0 spark-streaming-kafka

I am trying to submit a Spark Streaming + Kafka job that just reads lines of strings from a Kafka topic. However, I keep getting the following exception:

    15/07/24 22:39:45 ERROR TaskSetManager: Task 0 in stage 2.0 failed 4 times; aborting job
    Exception in thread "Thread-49" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 73, 10.11.112.93): java.lang.NoSuchMethodException: kafka.serializer.StringDecoder.<init>(kafka.utils.VerifiableProperties)
            java.lang.Class.getConstructor0(Class.java:2892)
            java.lang.Class.getConstructor(Class.java:1723)
            org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:106)
            org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
            org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
            org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:264)
            org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
            org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
            org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
            org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
            org.apache.spark.scheduler.Task.run(Task.scala:54)
            org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
            java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            java.lang.Thread.run(Thread.java:745)

When I checked the Spark jar files that DSE uses, I found that the bundled kafka_2.10-0.8.0.jar does have this constructor. I am not sure what is causing the error. Here is my consumer code:

    import org.apache.spark.SparkContext
    import org.apache.spark.streaming.StreamingContext
    import org.apache.spark.streaming.kafka.KafkaUtils

    val sc = new SparkContext(sparkConf)
    val streamingContext = new StreamingContext(sc, SLIDE_INTERVAL)

    // One receiver thread per comma-separated topic name
    val topicMap = kafkaTopics.split(",").map((_, numThreads.toInt)).toMap
    val accessLogsStream = KafkaUtils.createStream(streamingContext, zooKeeper, "AccessLogsKafkaAnalyzer", topicMap)

    // Keep only the message values, parse each log line, and cache the result
    val accessLogs = accessLogsStream.map(_._2).map(log => ApacheAccessLog.parseLogLine(log)).cache()
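
As a side note, Spark's Kafka integration also has an explicitly-typed createStream overload that names the key/value types and decoder classes at the call site. A minimal sketch, assuming Spark 1.x and reusing streamingContext, zooKeeper, and topicMap from the snippet above (the kafkaParams keys are the standard Kafka 0.8 high-level consumer properties):

    import kafka.serializer.StringDecoder
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.kafka.KafkaUtils

    // Standard Kafka 0.8 consumer properties: where ZooKeeper lives
    // and which consumer group this job belongs to.
    val kafkaParams = Map(
      "zookeeper.connect" -> zooKeeper,
      "group.id" -> "AccessLogsKafkaAnalyzer"
    )

    // The type parameters pin the key/value types and their decoder
    // classes explicitly instead of defaulting to StringDecoder.
    val typedStream = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
      streamingContext, kafkaParams, topicMap, StorageLevel.MEMORY_AND_DISK_SER_2)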

UPDATE: This exception only seems to happen when I submit the job. If I run it from the spark shell by pasting in the code, it works fine.
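
That symptom (works in the shell, fails under spark-submit) usually points at a classpath difference: the shell runs against the jars DSE ships, while a submitted assembly can drag in a second, conflicting copy of Kafka or Spark. A sketch of one way to rule that out, assuming an sbt build (the version numbers are illustrative and should be matched to what the DSE release actually ships):

    // build.sbt -- illustrative versions; align them with the jars DSE ships.
    // Marking spark-core, spark-streaming, and kafka as "provided" keeps them
    // out of the assembly so the cluster's own copies are used at runtime.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"            % "1.1.0" % "provided",
      "org.apache.spark" %% "spark-streaming"       % "1.1.0" % "provided",
      "org.apache.spark" %% "spark-streaming-kafka" % "1.1.0",
      "org.apache.kafka" %% "kafka"                 % "0.8.0" % "provided"
    )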

1 Answer:

Answer 0 (score: 1):

I ran into the same problem with my custom decoder. I added the following constructor, and it solved the issue.

    public YourDecoder(VerifiableProperties verifiableProperties)
    {
        // No-op: Kafka only needs this constructor to exist so the decoder
        // can be instantiated reflectively with the consumer properties.
    }
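
For reference, the same pattern in Scala, modeled loosely on how Kafka 0.8's own StringDecoder is written (the class name LogLineDecoder and the encoding handling are an illustrative sketch, not a verbatim copy of the Kafka source):

    import kafka.serializer.Decoder
    import kafka.utils.VerifiableProperties

    // Kafka instantiates decoders reflectively through the
    // (VerifiableProperties) constructor, which is why its absence
    // surfaces as NoSuchMethodException at receiver start-up.
    class LogLineDecoder(props: VerifiableProperties = null) extends Decoder[String] {
      private val encoding =
        if (props == null) "UTF-8"
        else props.getString("serializer.encoding", "UTF-8")

      override def fromBytes(bytes: Array[Byte]): String = new String(bytes, encoding)
    }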