Dataflow error: Broken pipe and 503 Service Unavailable errors

Time: 2016-07-30 01:05:03

Tags: google-cloud-storage google-cloud-platform google-cloud-dataflow

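I hit the following errors after my Dataflow pipeline had been running for a few hours, and I think they caused data loss. I have basically lost a few months of data, even though the pipeline completed successfully. It also took more than 4 hours to complete (12 hours this time).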

(c1e6e4a686086ce4): java.io.IOException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 503 Service Unavailable
	at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:431)
	at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:289)
	at com.google.cloud.dataflow.sdk.runners.worker.TextSink$TextFileWriter.close(TextSink.java:243)
	at com.google.cloud.dataflow.sdk.util.common.worker.WriteOperation.finish(WriteOperation.java:100)
	at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:254)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:191)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:144)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:180)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:161)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:148)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 503 Service Unavailable
	at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:145)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
	at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:357)
	... 4 more

(327e81fa21383d97): java.io.IOException: java.io.IOException: Pipe broken
	at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:431)
	at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:289)
	at com.google.cloud.dataflow.sdk.runners.worker.TextSink$TextFileWriter.close(TextSink.java:243)
	at com.google.cloud.dataflow.sdk.util.common.worker.WriteOperation.finish(WriteOperation.java:100)
	at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:254)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:191)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:144)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:180)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:161)
	at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:148)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Pipe broken
	at java.io.PipedInputStream.read(PipedInputStream.java:321)
	at java.io.PipedInputStream.read(PipedInputStream.java:377)
	at com.google.api.client.util.ByteStreams.read(ByteStreams.java:181)
	at com.google.api.client.googleapis.media.MediaHttpUploader.setContentAndHeadersOnCurrentRequest(MediaHttpUploader.java:629)
	at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:409)
	at [unrecoverable frame(s)] (...:427)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
	at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:357)
	... 4 more

1 Answer:

Answer 0 (score: 0)

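The problem here came from some old records in Storage that carried an old schemaVersion. They had never been cleaned up, so the pipeline failed whenever it ran over a range that included the old schema version.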
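One way to guard against this, sketched in plain Java rather than against the Dataflow SDK, is to drop records with a stale schemaVersion before they reach the rest of the pipeline. Everything named here is an assumption for illustration: the StoredRecord class, its id field, the keepSupported helper, and the cutoff version of 2 do not come from the original pipeline, which only tells us that a schemaVersion field exists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SchemaVersionFilter {

    // Hypothetical record shape; the thread only mentions a schemaVersion field.
    static final class StoredRecord {
        final String id;
        final int schemaVersion;

        StoredRecord(String id, int schemaVersion) {
            this.id = id;
            this.schemaVersion = schemaVersion;
        }
    }

    // Returns only the records whose schemaVersion is at least minVersion,
    // preserving their original order.
    static List<StoredRecord> keepSupported(List<StoredRecord> records, int minVersion) {
        List<StoredRecord> kept = new ArrayList<>();
        for (StoredRecord r : records) {
            if (r.schemaVersion >= minVersion) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<StoredRecord> records = Arrays.asList(
                new StoredRecord("r1", 1),
                new StoredRecord("r2", 2),
                new StoredRecord("r3", 3));
        // Assumed cutoff: versions below 2 are treated as stale and skipped.
        for (StoredRecord r : keepSupported(records, 2)) {
            System.out.println(r.id + " (schemaVersion " + r.schemaVersion + ")");
        }
    }
}
```

In a real pipeline the same check would more likely live in a filtering step right after the read, or the stale records would be cleaned out of Storage once, which is what the answer implies was missing.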