Exception when starting a Java Spark Streaming application

Asked: 2014-11-07 10:43:35

Tags: java apache-spark apache-kafka spark-streaming

The application throws java.lang.NoSuchMethodException when it starts.

Stack trace:

DAGScheduler: Failed to run runJob at ReceiverTracker.scala:275
Exception in thread "Thread-33" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 6.0 failed 4 times, most recent failure: Lost task 0.3 in stage 6.0 (TID 77, 172.20.7.60): java.lang.NoSuchMethodException: org.apache.spark.examples.streaming.KafkaKeyDecoder.<init>(kafka.utils.VerifiableProperties)
        java.lang.Class.getConstructor0(Class.java:2810)
        java.lang.Class.getConstructor(Class.java:1718)
        org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:106)
        org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
        org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
        org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:264)
        org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)

According to this link, the issue appears to have been fixed in Spark 1.1.0.

Spark: 1.1.0
Kafka: 0.8.1.1

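2 answers:

Answer 0 (score: 1)

In my case, as explained in the comments, removing the library conflicts let me consume data from Kafka correctly and store it in Cassandra, with the job deployed to the DataStax Analytics solution. I found that, unlike the open-source distribution, the streaming_kafka jar and all the Scala libraries were already included on the executor classpath.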

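So I suggest the following:

  1. Make sure you compile with the same Scala version that Spark uses.
  2. Make sure the kafka and streaming_kafka jars are compiled for that same version.
  3. Check whether the Scala libraries are already on the executor classpath, and if they are, do not include them in your package.
  4. I am assuming you are building an uber jar that you are trying to deploy.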
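
One way to act on suggestion 3 is to print where a class was actually loaded from. The following is only a minimal sketch: the class name `ClassOrigin` and the helper `describe` are hypothetical, and the three class names checked in `main` are examples (`scala.Predef` from the Scala library, the other two taken from the stack trace in the question).

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Describes where the named class is loaded from, without initializing it. */
    public static String describe(String className) {
        try {
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader usually report no code source.
            return source == null ? "bootstrap/unknown" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Example class names only; adjust to the libraries you suspect of conflicting.
        String[] names = {
            "scala.Predef",
            "kafka.utils.VerifiableProperties",
            "org.apache.spark.streaming.kafka.KafkaReceiver"
        };
        for (String name : names) {
            System.out.println(name + " -> " + describe(name));
        }
    }
}
```

Run from the driver, this reports the driver's classpath only. To see what the executors load, the same `describe` call would have to run inside a task. If a class resolves to a jar inside your own uber jar while the cluster also ships its own copy, that is a candidate for the kind of conflict described above.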

Answer 1 (score: 0)

You are missing the Kafka jar that contains that method.
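
For reference, the trace in the question shows `KafkaReceiver.onStart` calling `Class.getConstructor` on `KafkaKeyDecoder` and failing to find a public constructor that takes `kafka.utils.VerifiableProperties`. The sketch below reproduces that kind of reflective lookup in isolation. It is an illustration, not Spark's code: `FakeVerifiableProperties` is a hypothetical stand-in so the example runs without Kafka on the classpath, and both decoder classes are invented for the demonstration.

```java
import java.util.Properties;

public class DecoderConstructorCheck {

    // Hypothetical stand-in for kafka.utils.VerifiableProperties.
    public static class FakeVerifiableProperties {
        final Properties props;
        public FakeVerifiableProperties(Properties props) { this.props = props; }
    }

    // Only a no-arg constructor: the reflective lookup below will not find a match.
    public static class DecoderWithoutPropsConstructor {
        public DecoderWithoutPropsConstructor() {}
    }

    // Public constructor taking the properties type: the lookup succeeds.
    public static class DecoderWithPropsConstructor {
        public DecoderWithPropsConstructor(FakeVerifiableProperties props) {}
    }

    /** Returns true if the class has a public constructor taking FakeVerifiableProperties. */
    public static boolean hasPropsConstructor(Class<?> decoderClass) {
        try {
            // getConstructor only sees public constructors with exactly these parameter types.
            decoderClass.getConstructor(FakeVerifiableProperties.class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasPropsConstructor(DecoderWithoutPropsConstructor.class)); // false
        System.out.println(hasPropsConstructor(DecoderWithPropsConstructor.class));    // true
    }
}
```

The same exception can also appear when the parameter type is loaded from a different jar or class loader than the one the decoder was compiled against, which is consistent with the library-conflict explanation in the first answer.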