NPE when using session-based windowing with a trigger

Time: 2017-12-07 16:08:12

Tags: google-cloud-dataflow apache-beam

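I have a use case based on session windows, with a session gap of 10 minutes and a trigger that fires, in accumulating mode, for every element added to the session window.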

Dataflow version: 2.1.0

sessionCollection.apply("GroupSessionsByKey", 
         Window.<KV<String, Event>>into(Sessions.withGapDuration(Duration.standardMinutes(10)))
               .triggering(                              
                     AfterPane.elementCountAtLeast(1)).accumulatingFiredPanes());
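
For illustration only, the intended behavior can be sketched in plain Java without any Beam dependency. This is a simplified model, not Beam's implementation: it assumes a single key and timestamps that arrive in order, and the names `SessionSketch` and `panes` are hypothetical. It emits one pane per element, each holding everything seen so far in the current session, and starts a new session once the gap reaches 10 minutes.

```java
import java.util.ArrayList;
import java.util.List;

public class SessionSketch {
    // Gap that closes a session: 10 minutes, in milliseconds.
    static final long GAP_MILLIS = 10 * 60 * 1000L;

    // Returns one pane per input element. Each pane contains every
    // element seen so far in the current session (accumulating mode).
    // Assumes timestamps for a single key, sorted in ascending order.
    static List<List<Long>> panes(long[] sortedTimestamps) {
        List<List<Long>> out = new ArrayList<>();
        List<Long> session = new ArrayList<>();
        for (long ts : sortedTimestamps) {
            // A gap of 10 minutes or more since the last element
            // starts a new session.
            if (!session.isEmpty()
                    && ts - session.get(session.size() - 1) >= GAP_MILLIS) {
                session = new ArrayList<>();
            }
            session.add(ts);
            // Fire after every element with a copy of the session so far.
            out.add(new ArrayList<>(session));
        }
        return out;
    }

    public static void main(String[] args) {
        long minute = 60 * 1000L;
        // Two events 2 minutes apart, then one 13 minutes later.
        System.out.println(panes(new long[] {0, 2 * minute, 15 * minute}));
    }
}
```

With the three sample timestamps, the first two events share a session and the third opens a new one, so three panes are produced in total.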

When I try to run the pipeline locally with the DirectRunner, I get the following error.

Exception in thread "main" java.lang.NullPointerException: Outputs for non-root node GroupSessionsByKey are null
    at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:864)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:606)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:594)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$500(TransformHierarchy.java:276)
    at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:210)
    at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:440)
    at org.apache.beam.sdk.Pipeline.validate(Pipeline.java:552)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:296)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:283)
    at com.homedepot.personalization.orangestream.pipelines.OrangeStreamPipeline.execute(OrangeStreamPipeline.java:56)
    at com.homedepot.personalization.orangestream.main.OrangeStreamMain.main(OrangeStreamMain.java:27)

=========================================================================

Here is the complete set of transforms; it still gives me the NPE.

PCollection<KV<String, Event>> sessionCollection = inputCollection.apply("Process Incoming Events", ParDo.of(new SomeDoFn()));

// Session based windowing
PCollection<KV<String, Event>> sessionBasedPCollection = sessionCollection.apply("GroupSessionsByKey",
        Window.<KV<String, Event>>into(Sessions.withGapDuration(Duration.standardMinutes(10)))
              // Fire a pane for every element, accumulating across firings
              .triggering(AfterPane.elementCountAtLeast(1))
              .accumulatingFiredPanes());

// Group by key
PCollection<KV<String, Iterable<Event>>> groupByKeyCollection = sessionBasedPCollection.apply("GroupingElementsByKey", GroupByKey.create());

// Process and extract elements and persist to bigtable
PCollection<String> elCollectionPerSession = groupByKeyCollection.apply("AccumulateElementsPerSession", ParDo.of(new AccumulateElementsPerSession()));

// Write to DB to verify
elCollectionPerSession.apply("WriteToBigTable", ParDo.of(new WriteSessionBasedWindowingElementsToBigTable("project", "instance_name", "table")));

0 answers:

No answers yet