Beam TextIO write fails with NullPointerException because destination is null

Asked: 2018-02-27 13:00:34

Tags: java apache-beam

I am using the Apache Beam DirectRunner, and I have defined a pipeline as follows:

val p = Pipeline.create(options)
p.apply(Create.of("/tmp/dc/foo.txt"))
        .apply(FileLoader())
        .apply(SaveLineToRedis())
        .apply(AddToRedisIndex())
        .apply(MatchTransform())
        .apply(GroupByKey.create())
        .apply(TextIO.writeCustomType<KV<String, Iterable<SimpleMatcherResult>>>().to("/tmp/bar"))

The write step fails:

13:45:47.247 [direct-runner-worker] INFO  org.apache.beam.sdk.io.WriteFiles - Opening writer 83c36e3f-7e1f-406c-a9c6-f3ab4bac1cb7 for window org.apache.beam.sdk.transforms.windowing.GlobalWindow@29caf222 pane PaneInfo{isFirst=true, isLast=true, timing=ON_TIME, index=0, onTimeIndex=0} destination null
Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.NullPointerException
        at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:342)
        at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:312)
        at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:206)
        at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:62)
        at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
        at de.techmatrix.dc.matcher.MainKt.main(Main.kt:12)
Caused by: java.lang.NullPointerException
        at org.apache.beam.sdk.io.DynamicFileDestinations$ConstantFilenamePolicy.formatRecord(DynamicFileDestinations.java:49)
        at org.apache.beam.sdk.io.WriteFiles$WriteShardsIntoTempFilesFn.processElement(WriteFiles.java:718)

The null destination is returned by DynamicFileDestinations:

public Void getDestination(UserT element) {
  return (Void) null;
}

Update: this works with FileIO:

        .apply(FileIO.writeDynamic<String, KV<String, Iterable<SimpleMatcherResult>>>()
                .by { it.value.first().matchedKey }.withDestinationCoder(StringUtf8Coder.of())
                .via(Contextful.fn({ mapper.writeValueAsString(it) }), TextIO.sink())
                .to("/tmp/bar")
                .withNaming{ _ -> defaultNaming("matches", "txt")})

Can someone explain why?

2 answers:

Answer 0 (score: 0)

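TextIO.writeCustomType() implicitly uses DynamicDestinations. Calling TextIO.writeCustomType().to("/tmp/bar") only sets the filename prefix for the files to be written. See the definition of TextIO.TypedWrite.to(). Do you really need to write to dynamic destinations?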

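You can simply use the TextIO.Write transform to write text files. See the TextIO definition for reference and examples. You will need an extra transform step to convert KV<String, Iterable<SimpleMatcherResult>> to String.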
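A minimal sketch of that approach, assuming the grouped values can be rendered as plain strings. Here String stands in for SimpleMatcherResult, and FormatGroupFn and the sample input are hypothetical names made up for the illustration, not part of the original pipeline:

```kotlin
import org.apache.beam.sdk.Pipeline
import org.apache.beam.sdk.io.TextIO
import org.apache.beam.sdk.options.PipelineOptionsFactory
import org.apache.beam.sdk.transforms.Create
import org.apache.beam.sdk.transforms.GroupByKey
import org.apache.beam.sdk.transforms.MapElements
import org.apache.beam.sdk.transforms.SimpleFunction
import org.apache.beam.sdk.values.KV

// Hypothetical formatter: turns one grouped element into one line of text.
// String is a stand-in for SimpleMatcherResult (an assumption for this sketch).
class FormatGroupFn : SimpleFunction<KV<String, Iterable<String>>, String>() {
    override fun apply(input: KV<String, Iterable<String>>): String =
        "${input.key}: ${input.value.joinToString(",")}"
}

fun main() {
    val p = Pipeline.create(PipelineOptionsFactory.create())
    p.apply(Create.of(KV.of("a", "1"), KV.of("a", "2"), KV.of("b", "3")))
        .apply(GroupByKey.create<String, String>())
        // Extra step: convert each KV<String, Iterable<...>> to a String first...
        .apply(MapElements.via(FormatGroupFn()))
        // ...so that the plain TextIO.write() transform can be used.
        .apply(TextIO.write().to("/tmp/bar"))
    p.run().waitUntilFinish()
}
```

In the real pipeline the formatter would take KV<String, Iterable<SimpleMatcherResult>> and could serialize it however is convenient, for example with the same mapper.writeValueAsString(it) call used in the FileIO version of the question.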

Answer 1 (score: 0)

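In the first snippet, you did not specify withFormatFunction() (and Beam fails to validate this and give a better error message). The NPE comes from the line that calls the (missing) format function, which is DynamicFileDestinations.java:49 in the stack trace above.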

In the second snippet it is specified, via Contextful.fn({ mapper.writeValueAsString(it) }), so it works.
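For comparison, here is a rough sketch of the first snippet with a format function supplied. It assumes String in place of SimpleMatcherResult, and the sample input and the formatting logic are made up for the illustration. withFormatFunction() is the TextIO.TypedWrite method mentioned above; as far as I know, later Beam releases mark it deprecated in favor of FileIO with TextIO.sink(), so treat this as a sketch rather than a recommendation:

```kotlin
import org.apache.beam.sdk.Pipeline
import org.apache.beam.sdk.io.TextIO
import org.apache.beam.sdk.options.PipelineOptionsFactory
import org.apache.beam.sdk.transforms.Create
import org.apache.beam.sdk.transforms.GroupByKey
import org.apache.beam.sdk.transforms.SerializableFunction
import org.apache.beam.sdk.values.KV

fun main() {
    val p = Pipeline.create(PipelineOptionsFactory.create())
    p.apply(Create.of(KV.of("a", "1"), KV.of("a", "2"), KV.of("b", "3")))
        .apply(GroupByKey.create<String, String>())
        .apply(
            TextIO.writeCustomType<KV<String, Iterable<String>>>()
                // The format function tells TextIO how to turn each element
                // into a line of text; the original pipeline left it unset.
                .withFormatFunction(
                    SerializableFunction<KV<String, Iterable<String>>, String> { kv ->
                        "${kv.key}: ${kv.value.joinToString(",")}"
                    }
                )
                // Sets the filename prefix, as in the original pipeline.
                .to("/tmp/bar")
        )
    p.run().waitUntilFinish()
}
```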