I'm getting the following error in Google Cloud Dataflow:
java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.coders.CoderException: cannot encode a null String
    at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:162)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:287)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext.output(DoFnRunnerBase.java:449)
    at reports.transforms.JsonToObject.processElement(JsonToObject.java:35)
Caused by: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.coders.CoderException: cannot encode a null String
    at com.google.cloud.dataflow.sdk.util.UserCodeException.wrap(UserCodeException.java:35)
    at com.google.cloud.dataflow.sdk.util.UserCodeException.wrapIf(UserCodeException.java:40)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.wrapUserCodeException(DoFnRunnerBase.java:368)
    at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:51)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
    at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:190)
    at com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
    at com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
    at com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
    at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:160)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:287)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext.output(DoFnRunnerBase.java:449)
    at reports.transforms.JsonToObject.processElement(JsonToObject.java:35)
    at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
In my class (JsonToObject), I do the following:
if (obj != null) {
    processContext.output(obj);
}
That is where the exception is thrown.
Any idea why this happens?
Answer 0 (score: 1)
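NullableCoder is a composite coder: it has to be given another coder to wrap. @DefaultCoder does not work with composite coders (KvCoder, IterableCoder, ...) because they must be parameterized by another coder. One way to work around the problem is to set the coder manually on every PCollection that may contain nullable types. For example: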
// Wrap the String coder so that null elements can be encoded.
PCollection<String> pc = pipeline.apply(... transform that produces nulls ...);
pc.setCoder(NullableCoder.of(StringUtf8Coder.of()));
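To illustrate why a nullable coder needs an inner coder, here is a minimal plain-Java sketch of the idea. It does not use the Dataflow SDK: `SimpleCoder`, `StrictStringCoder`, `NullableWrapper`, and `roundTrip` are hypothetical names invented for this example, and the presence-byte layout is an assumption made for illustration, not a statement about the SDK's wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Simplified stand-ins for illustration only; these are NOT the Dataflow SDK classes.
class NullableCoderSketch {

    /** Hypothetical minimal coder interface. */
    interface SimpleCoder<T> {
        void encode(T value, DataOutputStream out) throws IOException;
        T decode(DataInputStream in) throws IOException;
    }

    /** Rejects null values, similar in spirit to the error in the stack trace. */
    static class StrictStringCoder implements SimpleCoder<String> {
        public void encode(String value, DataOutputStream out) throws IOException {
            if (value == null) {
                throw new IOException("cannot encode a null String");
            }
            out.writeUTF(value);
        }

        public String decode(DataInputStream in) throws IOException {
            return in.readUTF();
        }
    }

    /** Composite coder: writes a presence byte, then delegates to the inner coder. */
    static class NullableWrapper<T> implements SimpleCoder<T> {
        private final SimpleCoder<T> inner;

        NullableWrapper(SimpleCoder<T> inner) {
            this.inner = inner;
        }

        public void encode(T value, DataOutputStream out) throws IOException {
            if (value == null) {
                out.writeByte(0); // marker: no value follows
            } else {
                out.writeByte(1); // marker: a value follows
                inner.encode(value, out);
            }
        }

        public T decode(DataInputStream in) throws IOException {
            return in.readByte() == 0 ? null : inner.decode(in);
        }
    }

    /** Encodes a value to bytes and decodes it again with the same coder. */
    static <T> T roundTrip(SimpleCoder<T> coder, T value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        coder.encode(value, new DataOutputStream(bytes));
        return coder.decode(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    }

    public static void main(String[] args) throws IOException {
        SimpleCoder<String> nullable = new NullableWrapper<>(new StrictStringCoder());
        System.out.println(roundTrip(nullable, "hello"));
        System.out.println(roundTrip(nullable, null));
    }
}
```

The wrapper cannot encode anything on its own. It only adds the null handling and relies on the inner coder for the actual value, which is why a composite coder like this has to be constructed with the coder it wraps.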