Question

我正在使用结构化流媒体功能和Kafka进行简单的Spark程序。由于卡夫卡是来源，有2个汇：

接收器1-控制台接收器 - 在所有情况下都能正常工作
Sink 2＆amp; 3-H2和Ignite Foreach下沉

对于第一次运行代码运行正常，但是当我使用检查点位置杀死并重新启动程序时，我收到以下错误

17/07/12 07:11:48 ERROR StreamExecution: Query h2Out [id = 22ce7168-6f12-4220-8f28-f9eaaaba9c6a, runId = 39ecb40a-5b54-4b36-a0da-6e3057d66b2e] terminated with error
java.lang.NoSuchMethodError: org.apache.spark.sql.kafka010.KafkaSource$$anon$1.parseVersion(Ljava/lang/String;I)I
    at org.apache.spark.sql.kafka010.KafkaSource$$anon$1.deserialize(KafkaSource.scala:116)
    at org.apache.spark.sql.kafka010.KafkaSource$$anon$1.deserialize(KafkaSource.scala:99)
    at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.get(HDFSMetadataLog.scala:237)
    at org.apache.spark.sql.kafka010.KafkaSource.initialPartitionOffsets$lzycompute(KafkaSource.scala:129)
    at org.apache.spark.sql.kafka010.KafkaSource.initialPartitionOffsets(KafkaSource.scala:97)
    at org.apache.spark.sql.kafka010.KafkaSource.getBatch(KafkaSource.scala:222)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2$$anonfun$apply$6.apply(StreamExecution.scala:452)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2$$anonfun$apply$6.apply(StreamExecution.scala:448)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at org.apache.spark.sql.execution.streaming.StreamProgress.foreach(StreamProgress.scala:25)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at org.apache.spark.sql.execution.streaming.StreamProgress.flatMap(StreamProgress.scala:25)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2.apply(StreamExecution.scala:448)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2.apply(StreamExecution.scala:448)
    at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:262)
    at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:46)
    at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch(StreamExecution.scala:447)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply$mcV$sp(StreamExecution.scala:255)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply(StreamExecution.scala:244)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$1.apply(StreamExecution.scala:244)
    at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:262)
    at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:46)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1.apply$mcZ$sp(StreamExecution.scala:244)
    at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:43)
    at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:239)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:177)

我检查了KafkaSource源代码，parseFunction方法可以通过org.apache.spark.sql.execution.streaming.HDFSMetadataLog获得，我希望jar（spark-sql_2.11-2.1.1.jar）在类路径中可用。有关信息，我正在使用Kafka 0.10.2.1 maven依赖项。

Answer 1

此错误表示您的Spark版本早于2.1.1。 HDFSMetadataLog.parseVersion在Spark 2.1.1中添加，spark-sql-kafka-0-10_2.11-2.1.1.jar调用它。如果您的Spark版本早于2.1.1，您将看到此NoSuchMethodError。

您可以致电SparkSession.version查看您的Spark版本。（例如，只需在Spark shell中键入spark.version）。

火花结构流无法从检查点位置开始

1 个答案: