NPE in Dataflow pipeline from SourceOperationExecutor.isSplitOperationTooLargeForDataflowService

Asked: 2016-07-04 16:41:02

Tags: google-cloud-dataflow

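My Dataflow pipeline worked fine up to and including its previous run. Today, when I ran it on a new dataset, I started getting a NullPointerException. The problem is that the exception does not seem to originate from my code (none of my classes appear anywhere in the stack trace), as shown below.

Is this a bug in the Dataflow framework, or (since the exception seems to occur in isSplitOperationTooLargeForDataflowService) is this dataset, or more precisely a split of it, too large for Dataflow?

Any help or insight is much appreciated!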

2016-07-04T16:27:00.044Z: Error:   (fb0b4effcb8800a6):    
java.lang.NullPointerException
at com.google.cloud.dataflow.sdk.runners.worker.SourceOperationExecutor.isSplitOperationTooLargeForDataflowService(SourceOperationExecutor.java:100)
at com.google.cloud.dataflow.sdk.runners.worker.SourceOperationExecutor.isSplitResponseTooLarge(SourceOperationExecutor.java:92)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:227)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:146)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:164)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:145)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:132)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

1 answer:

Answer 0 (score: 1)

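This is a bug that was fixed in version 1.4.0 of the Dataflow SDK. At the time of writing, the latest version of the SDK is 1.6.0.

If version 1.2.1 is being reported to you as the "latest", you are probably running into an issue with the Eclipse plugin. Manually updating your pom.xml to use SDK version 1.6.0 should resolve the problem.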
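
For reference, here is a minimal sketch of what the manual pom.xml change might look like. It assumes the project pulls in the SDK through the usual `google-cloud-dataflow-java-sdk-all` Maven artifact; the coordinates in your own pom.xml may differ, and only the version number comes from the answer itself.

```xml
<!-- Assumed dependency entry: keep whatever groupId/artifactId your pom.xml
     already declares for the Dataflow SDK and change only the version. -->
<dependency>
  <groupId>com.google.cloud.dataflow</groupId>
  <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
  <version>1.6.0</version>
</dependency>
```

After editing the file, refresh the Maven project in Eclipse (or rebuild from the command line) so the newer SDK jar is the one actually used when the pipeline is submitted.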