Question

I am using the following pipeline:

inputStream.keyBy(<keyMapper>).
connect(configurationBroadcastStream).
process(new KeyedBroadcastProcessFunction<...>() {
     processBroadcastElement(...){...}
     processElement(...){...}
     }).
keyBy(<keyMapper>). // have to key output of process() again
window(DynamicEventTimeSessionWindow.withDynamicGap(...)).
trigger(new CustomTrigger()).
process(new CustomProcessWindowFn())

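In CustomTrigger(), I register an eventTimeTimer() that should fire to mark the end of the window. The problem is that the onEventTime() method is never called, even though:

I have made sure to set env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
Using an ascendingTimestampExtractor(), I sent an event that definitely pushes the watermark far enough that the eventTimeTimer() should fire.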

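What am I missing? Is it related to the missing watermarks and the onTimer() method of the KeyedBroadcastProcessFunction? I suspect so because of David Anderson's comment in this answer: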

adding a special fake watermark for the non-broadcast stream (set to Watermark.MAX_WATERMARK)

I have not implemented the onTimer() method. But even if that is the cause, I do not see how it relates to the downstream trigger. Thanks.

Edit: a complete example of this scenario is here.

Answer 1

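Yes, the problem is that the broadcast stream has no watermarks. (But no, it does not matter whether the KeyedBroadcastProcessFunction has an onTimer method. Once the watermarks are flowing, they will flow through to the window.)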

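Whenever an operator has two or more inputs (in your case, where inputStream and configurationBroadcastStream are connected), the watermark at that operator is the minimum of the watermarks of its inputs. Since the broadcast stream has no watermarks, it holds back the watermarks provided by inputStream.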
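To make the "minimum of the inputs" rule concrete, here is a small plain-Java sketch. It is only a simplified model, not Flink's actual implementation: the class WatermarkMinDemo and the method combinedWatermark are hypothetical names made up for illustration. The only facts it relies on are that an input with no watermarks sits at Long.MIN_VALUE and that Watermark.MAX_WATERMARK carries Long.MAX_VALUE.

```java
public class WatermarkMinDemo {
    // Watermark of an input that has not emitted any watermark yet.
    static final long NO_WATERMARK = Long.MIN_VALUE;
    // Timestamp carried by Watermark.MAX_WATERMARK.
    static final long MAX_WATERMARK = Long.MAX_VALUE;

    // Simplified model (hypothetical helper): the event time clock of a
    // two-input operator is the minimum of its two input watermarks.
    static long combinedWatermark(long input1Watermark, long input2Watermark) {
        return Math.min(input1Watermark, input2Watermark);
    }

    public static void main(String[] args) {
        long inputStreamWatermark = 1_000L;

        // Broadcast stream without watermarks: the operator's clock never advances.
        long stuck = combinedWatermark(inputStreamWatermark, NO_WATERMARK);
        System.out.println(stuck == Long.MIN_VALUE);

        // Broadcast stream at MAX_WATERMARK: inputStream alone drives the clock.
        long advancing = combinedWatermark(inputStreamWatermark, MAX_WATERMARK);
        System.out.println(advancing == inputStreamWatermark);
    }
}
```

In the first case every event time timer downstream stays pending, which matches the symptom in the question. In the second case the broadcast side no longer constrains the minimum.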

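I have an example that shows how you might handle this. Assuming your broadcast stream does not need any timing information, you can implement a timestamp extractor and watermark assigner that effectively hands control of the watermarks over to the other stream. Something like this: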

// Once the two streams are connected, the Watermark of the KeyedBroadcastProcessFunction operator
// will be the minimum of the Watermarks of the two connected streams. Our config stream has a default
// Watermark at Long.MIN_VALUE, and this will hold back the event time clock of the
// KeyedBroadcastProcessFunction, unless we do something about it.

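import javax.annotation.Nullable;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

public static class ConfigStreamAssigner implements AssignerWithPeriodicWatermarks<String> {
    // Always report the maximum watermark, so this stream never holds back event time.
    @Nullable
    @Override
    public Watermark getCurrentWatermark() {
        return Watermark.MAX_WATERMARK;
    }

    // The config elements carry no meaningful event time, so any constant timestamp will do.
    @Override
    public long extractTimestamp(String element, long previousElementTimestamp) {
        return 0;
    }
}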
