当我运行包含Spark的可执行jar并建立与事件中心的连接以从那里接收事件时,我遇到了这个问题。
我注意到长时间运行后收到此错误。我正在使用spark 2.2.1
Exception in thread "main" org.apache.spark.sql.streaming.StreamingQueryException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): com.microsoft.azure.eventhubs.EventHubException: com.microsoft.azure.eventhubs.impl.AmqpException: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 Reference:9d500822-8b98-4fdf-9c51-df0d34a8bc77, TrackingId:dd7a2bc0000024de000e68a85b6c20e4_G26_B28, SystemTracker:edgeeventhub:eventhub:edgeeventhub1~16383|$default, Timestamp:8/9/2018 11:09:25 AM, errorContext[NS: edgeeventhub.servicebus.windows.net, PATH: edgeeventhub1/ConsumerGroups/$Default/Partitions/0, REFERENCE_ID: 532b70_763_G26_1533812965268, PREFETCH_COUNT: 10, LINK_CREDIT: 10, PREFETCH_Q_LEN: 0]
at com.microsoft.azure.eventhubs.impl.ExceptionUtil.toException(ExceptionUtil.java:39)
at com.microsoft.azure.eventhubs.impl.MessageReceiver.onClose(MessageReceiver.java:606)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.processOnClose(BaseLinkHandler.java:50)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.handleRemoteLinkClosed(BaseLinkHandler.java:70)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:36)
at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176)
at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108)
at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:324)
at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:291)
at com.microsoft.azure.eventhubs.impl.MessagingFactory$RunReactor.run(MessagingFactory.java:462)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: com.microsoft.azure.eventhubs.impl.AmqpException: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 Reference:9d500822-8b98-4fdf-9c51-df0d34a8bc77, TrackingId:dd7a2bc0000024de000e68a85b6c20e4_G26_B28, SystemTracker:edgeeventhub:eventhub:edgeeventhub1~16383|$default, Timestamp:8/9/2018 11:09:25 AM
... 13 more
Driver stacktrace:
=== Streaming Query ===
Identifier: [id = 0bad359d-1988-41bd-b30a-9ef010597683, runId = f9c042ef-efe9-457c-9235-49644f076786]
Current Committed Offsets: {}
Current Available Offsets: {org.apache.spark.sql.eventhubs.EventHubsSource@3a969af2: {"edgeeventhub1":{"1":9110,"0":9105}}}
Current State: ACTIVE
Thread State: RUNNABLE
Logical Plan:
Project [Offset#13L, Time (readable)#21, Timestamp#30L, Body#40]
+- Project [cast(body#0 as string) AS Body#40, Offset#13L, sequenceNumber#2L, enqueuedTime#3, publisher#4, partitionKey#5, Time (readable)#21, Timestamp#30L]
+- Project [body#0, Offset#13L, sequenceNumber#2L, enqueuedTime#3, publisher#4, partitionKey#5, Time (readable)#21, cast(enqueuedTime#3 as bigint) AS Timestamp#30L]
+- Project [body#0, Offset#13L, sequenceNumber#2L, enqueuedTime#3, publisher#4, partitionKey#5, cast(enqueuedTime#3 as timestamp) AS Time (readable)#21]
+- Project [body#0, cast(offset#1 as bigint) AS Offset#13L, sequenceNumber#2L, enqueuedTime#3, publisher#4, partitionKey#5]
+- StreamingExecutionRelation org.apache.spark.sql.eventhubs.EventHubsSource@3a969af2, [body#0, offset#1, sequenceNumber#2L, enqueuedTime#3, publisher#4, partitionKey#5]
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:343)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:206)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): com.microsoft.azure.eventhubs.EventHubException: com.microsoft.azure.eventhubs.impl.AmqpException: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 Reference:9d500822-8b98-4fdf-9c51-df0d34a8bc77, TrackingId:dd7a2bc0000024de000e68a85b6c20e4_G26_B28, SystemTracker:edgeeventhub:eventhub:edgeeventhub1~16383|$default, Timestamp:8/9/2018 11:09:25 AM, errorContext[NS: edgeeventhub.servicebus.windows.net, PATH: edgeeventhub1/ConsumerGroups/$Default/Partitions/0, REFERENCE_ID: 532b70_763_G26_1533812965268, PREFETCH_COUNT: 10, LINK_CREDIT: 10, PREFETCH_Q_LEN: 0]
at com.microsoft.azure.eventhubs.impl.ExceptionUtil.toException(ExceptionUtil.java:39)
at com.microsoft.azure.eventhubs.impl.MessageReceiver.onClose(MessageReceiver.java:606)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.processOnClose(BaseLinkHandler.java:50)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.handleRemoteLinkClosed(BaseLinkHandler.java:70)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:36)
at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176)
at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108)
at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:324)
at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:291)
at com.microsoft.azure.eventhubs.impl.MessagingFactory$RunReactor.run(MessagingFactory.java:462)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: com.microsoft.azure.eventhubs.impl.AmqpException: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 Reference:9d500822-8b98-4fdf-9c51-df0d34a8bc77, TrackingId:dd7a2bc0000024de000e68a85b6c20e4_G26_B28, SystemTracker:edgeeventhub:eventhub:edgeeventhub1~16383|$default, Timestamp:8/9/2018 11:09:25 AM
... 13 more
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:926)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:924)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:924)
at org.apache.spark.sql.execution.streaming.ForeachSink.addBatch(ForeachSink.scala:49)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$1.apply$mcV$sp(StreamExecution.scala:658)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$1.apply(StreamExecution.scala:658)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$1.apply(StreamExecution.scala:658)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:279)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch(StreamExecution.scala:657)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(StreamExecution.scala:306)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:279)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1.apply$mcZ$sp(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:290)
... 1 more
答案 0 :(得分:0)
我通过在天蓝色的事件中心中增加分区来解决了这个问题