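I have a use case based on session windows, with a session gap of 10 minutes and a trigger that fires on every element added to the session window, in accumulating mode.
Dataflow version: 2.1.0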
sessionCollection.apply("GroupSessionsByKey",
Window.<KV<String, Event>>into(Sessions.withGapDuration(Duration.standardMinutes(10)))
.triggering(
AfterPane.elementCountAtLeast(1)).accumulatingFiredPanes());
When I try to run the pipeline locally with the DirectRunner, I get the following error:
Exception in thread "main" java.lang.NullPointerException: Outputs for non-root node GroupSessionsByKey are null
at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:864)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:606)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:594)
at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$500(TransformHierarchy.java:276)
at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:210)
at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:440)
at org.apache.beam.sdk.Pipeline.validate(Pipeline.java:552)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:296)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:283)
at com.homedepot.personalization.orangestream.pipelines.OrangeStreamPipeline.execute(OrangeStreamPipeline.java:56)
at com.homedepot.personalization.orangestream.main.OrangeStreamMain.main(OrangeStreamMain.java:27)
=========================================================================
Here is the complete set of transforms. It still gives me the NPE:
PCollection<KV<String, Event>> sessionCollection = inputCollection.apply("Process Incoming Events", ParDo.of(new SomeDoFn()));
// Session based windowing
PCollection<KV<String, Event>> sessionBasedPCollection = sessionCollection.apply("GroupSessionsByKey",
    Window.<KV<String, Event>>into(Sessions.withGapDuration(Duration.standardMinutes(10)))
        .triggering(AfterPane.elementCountAtLeast(1))
        .accumulatingFiredPanes());
// Group by key
PCollection<KV<String, Iterable<Event>>> groupByKeyCollection = sessionBasedPCollection.apply("GroupingElementsByKey", GroupByKey.create());
// Process and extract elements and persist to bigtable
PCollection<String> elCollectionPerSession = groupByKeyCollection.apply("AccumulateElementsPerSession", ParDo.of(new AccumulateElementsPerSession()));
// Write to DB to verify
elCollectionPerSession.apply("WriteToBigTable", ParDo.of(new WriteSessionBasedWindowingElementsToBigTable("project", "instance_name", "table")));
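
To spell out the semantics this windowing is meant to have, here is a minimal plain-Java sketch of per-key session windows with a 10-minute gap, where a pane fires on every element and each pane accumulates everything seen so far in the session. It is only an illustrative model, not Beam code and not how any runner implements it: the class `SessionWindowSketch` and the method `accumulatingPanes` are hypothetical names, timestamps are plain milliseconds, and elements are assumed to arrive in timestamp order for a single key.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of session windows with per-element, accumulating panes.
public class SessionWindowSketch {
    // Session gap of 10 minutes, expressed in milliseconds.
    static final long GAP_MILLIS = 10 * 60 * 1000L;

    /**
     * Returns one pane per input element. Each pane contains every timestamp
     * seen so far in the element's session (accumulating mode).
     * Assumes the timestamps belong to one key and are sorted ascending.
     */
    static List<List<Long>> accumulatingPanes(List<Long> sortedTimestamps) {
        List<List<Long>> panes = new ArrayList<>();
        List<Long> session = new ArrayList<>();
        for (long ts : sortedTimestamps) {
            // An element that arrives a full gap (or more) after the previous
            // one does not overlap the current session, so a new one starts.
            if (!session.isEmpty()
                    && ts - session.get(session.size() - 1) >= GAP_MILLIS) {
                session = new ArrayList<>();
            }
            session.add(ts);
            // "Fire" after every element, emitting a copy of the whole session.
            panes.add(new ArrayList<>(session));
        }
        return panes;
    }

    public static void main(String[] args) {
        // Events at 0 min, 1 min, and 15 min for the same key.
        List<Long> events = List.of(0L, 60_000L, 900_000L);
        for (List<Long> pane : accumulatingPanes(events)) {
            System.out.println(pane);
        }
        // Prints:
        // [0]
        // [0, 60000]
        // [900000]
    }
}
```

In this model the first two events share a session, so the second pane repeats the first event, while the event at 15 minutes opens a new session. The real pipeline would additionally have to handle out-of-order data and window merging, which this sketch leaves out.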