Application throws java.lang.NoSuchMethodException
Stack trace:
DAGScheduler: Failed to run runJob at ReceiverTracker.scala:275
Exception in thread "Thread-33" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 6.0 failed 4 times, most recent failure: Lost task 0.3 in stage 6.0 (TID 77, 172.20.7.60): java.lang.NoSuchMethodException: org.apache.spark.examples.streaming.KafkaKeyDecoder.<init>(kafka.utils.VerifiableProperties)
java.lang.Class.getConstructor0(Class.java:2810)
java.lang.Class.getConstructor(Class.java:1718)
org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:106)
org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:264)
org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
According to this link, the issue appears to have been fixed in Spark 1.1.0.
Spark: 1.1.0
Kafka: 0.8.1.1
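Answer 0 (score: 1)
In my case, as explained in the comments, removing the library conflicts let me consume data from Kafka correctly and store it in Cassandra, with the job deployed to the Datastax Analytics Solution. What I found differs from the open-source distribution is that the streaming_kafka jar and all the Scala libraries are already included on the executor classpath.
So I suggest the following:
I assume you are building an uber jar that you are trying to deploy.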
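To check for this kind of library conflict, it can help to print where a class was actually loaded from on the machine that runs the task. The following is only a minimal sketch: the class name WhereLoadedFrom and the helper locationOf are made up for illustration, and you would pass whichever class name you suspect is duplicated between your uber jar and the executor classpath.

```java
import java.security.CodeSource;

public class WhereLoadedFrom {

    // Returns the jar or directory a class was loaded from, or a placeholder
    // when the JVM does not expose one (for example, for core JDK classes).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage: pass a fully qualified class name as the first argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

If the printed location is not the jar you expected, two copies of the library are probably competing on the classpath.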
Answer 1 (score: 0)
You are missing the Kafka jar that contains the method.
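The stack trace in the question shows Class.getConstructor being called from KafkaReceiver.onStart, and the member it fails to find is KafkaKeyDecoder.<init>(kafka.utils.VerifiableProperties). So besides the classpath, it may be worth confirming that the decoder class loaded on the executor has a public constructor taking that argument type. The sketch below reproduces the same reflective lookup in isolation. FakeProps is a hypothetical stand-in for kafka.utils.VerifiableProperties so that the example compiles without Kafka, and the two decoder classes are made up for illustration.

```java
import java.lang.reflect.Constructor;

public class DecoderConstructorCheck {

    // Hypothetical stand-in for kafka.utils.VerifiableProperties, so the
    // sketch compiles without Kafka on the classpath.
    static class FakeProps {}

    // Declares the one-argument public constructor that a reflective lookup can find.
    static class DecoderWithPropsCtor {
        public DecoderWithPropsCtor(FakeProps props) {}
    }

    // Declares only a no-argument constructor.
    static class DecoderWithoutPropsCtor {
        public DecoderWithoutPropsCtor() {}
    }

    // Returns true if cls has a public constructor with exactly these parameter types.
    static boolean hasPublicConstructor(Class<?> cls, Class<?>... paramTypes) {
        try {
            Constructor<?> ctor = cls.getConstructor(paramTypes);
            return ctor != null;
        } catch (NoSuchMethodException e) {
            // Same exception type as in the stack trace above.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasPublicConstructor(DecoderWithPropsCtor.class, FakeProps.class));    // prints "true"
        System.out.println(hasPublicConstructor(DecoderWithoutPropsCtor.class, FakeProps.class)); // prints "false"
    }
}
```

Running the same check against the real decoder class and kafka.utils.VerifiableProperties on the executor classpath would show whether the constructor is genuinely absent or whether a different copy of the class is being loaded.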