My Dataflow job fails, I believe in View.asSingleton(). The stage failed 4 times, which failed the whole job:
(d373a0bb7c7bad6f): java.lang.IllegalArgumentException: IsmSinkWriter expects keys to be written in strictly increasing order but was given RandomAccessData{buffer=[], size=0} as the previous key and RandomAccessData{buffer=[], size=0} as the current key. Expected 0 <= 0 at position 1. at com.google.cloud.dataflow.sdk.runners.worker.IsmSink$IsmSinkWriter.commonPrefixLengthWithOrderCheck(IsmSink.java:209) at com.google.cloud.dataflow.sdk.runners.worker.IsmSink$IsmSinkWriter.add(IsmSink.java:166) at com.google.cloud.dataflow.sdk.runners.worker.IsmSink$IsmSinkWriter.add(IsmSink.java:85) at com.google.cloud.dataflow.sdk.util.common.worker.WriteOperation.process(WriteOperation.java:90) at com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52) at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:161) at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:288) at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext.output(DoFnRunnerBase.java:450) at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner$BatchViewAsSingleton$IsmRecordForSingularValuePerWindowDoFn.processElement(DataflowPipelineRunner.java:825)
I am trying to create a PCollectionView from a PCollection[CMS[String]]. The collection contains only a single element, and its size is about 3.75 MiB.
Any help?
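Update 1: The application still fails when I reduce the size of the view's single element to 1.88 MB, but succeeds at 255.29 KB (and smaller). This smells like some (un)documented limit that I missed, or a bug?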
Answer 0 (score: 0)
This has now been fixed for versions 1.5.0 and 1.5.1.
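Globally windowed singletons in batch mode in 1.5.0 and 1.5.1 are affected by a bug that prevents them from materializing singletons larger than 1 MB. Users are advised to use View.asIterable() or View.asList() as a workaround, since these are not affected.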
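With the workaround, the side input arrives as an Iterable (or List) rather than as a single value, so the one element has to be unwrapped by hand. Below is a minimal sketch of such an unwrapping step in plain Java. The class SideInputs and the method onlyElement are hypothetical names introduced for illustration and are not part of the Dataflow SDK.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of the Dataflow SDK: extracts the single
// element from an Iterable, mirroring what View.asSingleton() would return.
final class SideInputs {
    private SideInputs() {}

    static <T> T onlyElement(Iterable<T> values) {
        Iterator<T> it = values.iterator();
        if (!it.hasNext()) {
            // An empty side input has no singleton value to return.
            throw new NoSuchElementException("side input is empty");
        }
        T first = it.next();
        if (it.hasNext()) {
            // More than one element means the view is not a singleton.
            throw new IllegalArgumentException("side input has more than one element");
        }
        return first;
    }
}
```

Assuming the view was created with View.asIterable(), a DoFn could then call something like SideInputs.onlyElement(c.sideInput(view)) in place of reading the singleton directly.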