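I just ran into a very strange problem: when using EventTime with a timestamp and watermark assigner, I cannot get any results from a stream window join.
I am using Kafka as my data stream source and tried both AscendingTimestampExtractor and a custom assigner that implements AssignerWithPeriodicWatermarks, as described in the Flink documentation. In my tests, no watermarks are emitted and no join results are generated. If I switch to ProcessingTime and TumblingProcessingTimeWindows without any timestamp assigner, I get the correct results.
The code for the custom timestamp and watermark assigner is as follows: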
FlinkKafkaConsumer09<String> myConsumer1 =
    new FlinkKafkaConsumer09<>(myTopic1, new SimpleStringSchema(), props);
myConsumer1.assignTimestampsAndWatermarks(new MyTimestampsAndWatermarks());
FlinkKafkaConsumer09<String> myConsumer2 =
    new FlinkKafkaConsumer09<>(myTopic2, new SimpleStringSchema(), props);
myConsumer2.assignTimestampsAndWatermarks(new MyTimestampsAndWatermarks());
...
public static class MyTimestampsAndWatermarks implements AssignerWithPeriodicWatermarks<String> {
    private long currentMaxTimestamp;

    @Override
    public long extractTimestamp(String element, long previousElementTimestamp) {
        long timestamp = myFunctionToGetMillisFromString(element);
        currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        return timestamp;
    }

    @Override
    public Watermark getCurrentWatermark() {
        return new Watermark(currentMaxTimestamp - 1L);
    }
}
...
DataStream<myPOJO1> stream1 = env.addSource(myConsumer1).map(new MyMapper1());
DataStream<myPOJO2> stream2 = env.addSource(myConsumer2).map(new MyMapper2());
stream1.join(stream2)
    .where(new KeySelector1())
    .equalTo(new KeySelector2())
    .window(TumblingEventTimeWindows.of(Time.seconds(windowSize)))
    .apply(new JoinFunction<AdClick, GameCreate, TransferResult>() {...});
My AscendingTimestampExtractor code is as follows:
FlinkKafkaConsumer09<String> myConsumer1 =
    new FlinkKafkaConsumer09<>(myTopic1, new SimpleStringSchema(), props);
myConsumer1.assignTimestampsAndWatermarks(new AscendingTimestampExtractor<String>() {
    @Override
    public long extractAscendingTimestamp(String element) {
        return myFunctionToGetMillisFromString(element);
    }
});
FlinkKafkaConsumer09<String> myConsumer2 =
    new FlinkKafkaConsumer09<>(myTopic2, new SimpleStringSchema(), props);
myConsumer2.assignTimestampsAndWatermarks(new AscendingTimestampExtractor<String>() {
    @Override
    public long extractAscendingTimestamp(String element) {
        return myFunctionToGetMillisFromString(element);
    }
});
...
DataStream<myPOJO1> stream1 = env.addSource(myConsumer1).map(new MyMapper1());
DataStream<myPOJO2> stream2 = env.addSource(myConsumer2).map(new MyMapper2());
stream1.join(stream2)
    .where(new KeySelector1())
    .equalTo(new KeySelector2())
    .window(TumblingEventTimeWindows.of(Time.seconds(windowSize)))
    .apply(new JoinFunction<AdClick, GameCreate, TransferResult>() {...});
Thanks for your help!
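For readers following the question, here is a minimal plain-Java sketch of the event-time rules that decide whether a tumbling window join emits anything. It does not use Flink: `WatermarkSketch` and `PeriodicAssignerModel` are hypothetical names, and the 5000 ms window size and the sample timestamps are made up for illustration. It assumes the usual behavior that a window [start, start + size) fires once the watermark reaches its last millisecond, and that a two-input operator such as a join follows the smaller of its two input watermarks.

```java
// Plain-Java model of the event-time rules; none of these classes are Flink types.
final class WatermarkSketch {
    // Start of the tumbling window containing a non-negative timestamp (no offset).
    static long windowStart(long timestamp, long sizeMillis) {
        return timestamp - (timestamp % sizeMillis);
    }

    // A two-input operator advances its event time to the smaller input watermark.
    static long combinedWatermark(long leftWatermark, long rightWatermark) {
        return Math.min(leftWatermark, rightWatermark);
    }

    // A window [start, start + size) fires once the watermark reaches start + size - 1.
    static boolean windowFires(long start, long sizeMillis, long watermark) {
        return watermark >= start + sizeMillis - 1L;
    }

    public static void main(String[] args) {
        long size = 5000L; // hypothetical window size in milliseconds
        PeriodicAssignerModel left = new PeriodicAssignerModel();
        PeriodicAssignerModel right = new PeriodicAssignerModel();

        left.extractTimestamp(1000L);
        left.extractTimestamp(6000L);  // left watermark is now 5999
        right.extractTimestamp(2000L); // right watermark is only 1999

        long start = windowStart(1000L, size); // window [0, 5000)
        long watermark = combinedWatermark(left.currentWatermark(), right.currentWatermark());
        System.out.println(windowFires(start, size, watermark)); // false: the right side lags

        right.extractTimestamp(5500L); // right watermark is now 5499
        watermark = combinedWatermark(left.currentWatermark(), right.currentWatermark());
        System.out.println(windowFires(start, size, watermark)); // true
    }
}

// Mirrors the arithmetic of the question's MyTimestampsAndWatermarks, on raw timestamps.
final class PeriodicAssignerModel {
    private long currentMaxTimestamp; // starts at 0, as in the question

    long extractTimestamp(long timestamp) {
        currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        return timestamp;
    }

    long currentWatermark() {
        return currentMaxTimestamp - 1L;
    }
}
```

In this model the join produces nothing until both inputs have advanced past the end of the window, so a source whose watermark never advances is enough to suppress all output.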
Answer 0 (score: 0)
myConsumer3 = myConsumer1.assign ***
myConsumer4 = myConsumer2.assign ***
Then use myConsumer3 / myConsumer4, and it will work correctly.
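Answer 1 (score: 0)
I ran into the same problem. It was a very silly mistake, and this is the fix I found:
When you write:
myConsumer1.assignTimestampsAndWatermarks(new MyTimestampsAndWatermarks());
it creates a new data stream rather than modifying the existing one, and you are not storing the result in a variable. So the bottom line is:
Store the result in a new data stream and apply the join to that stream, which has the timestamps and watermarks assigned.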
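The pitfall this answer describes can be shown without Flink. The sketch below uses a hypothetical immutable `SketchStream` class (not a Flink type) whose `assignTimestamps` method returns a new object instead of changing the receiver, which is how `DataStream.assignTimestampsAndWatermarks` behaves. Whether the Kafka consumer's method of the same name behaves this way in the asker's Flink version is an assumption that is not checked here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class DiscardedResultSketch {
    public static void main(String[] args) {
        SketchStream stream = SketchStream.source("topic1");

        // Result discarded: the original object is unchanged.
        stream.assignTimestamps();
        System.out.println(stream.hasTimestamps()); // false

        // Result kept: downstream operations should use this new object.
        SketchStream withTimestamps = stream.assignTimestamps();
        System.out.println(withTimestamps.hasTimestamps()); // true
    }
}

// Hypothetical stand-in for an immutable stream; NOT a Flink class.
final class SketchStream {
    private final List<String> steps;

    private SketchStream(List<String> steps) {
        this.steps = Collections.unmodifiableList(new ArrayList<>(steps));
    }

    static SketchStream source(String name) {
        return new SketchStream(Collections.singletonList("source:" + name));
    }

    // Returns a NEW stream with the extra step; the receiver is left untouched.
    SketchStream assignTimestamps() {
        List<String> next = new ArrayList<>(steps);
        next.add("timestamps");
        return new SketchStream(next);
    }

    boolean hasTimestamps() {
        return steps.contains("timestamps");
    }
}
```

Applied to the question, this means building the join on whatever object the assignment call returns, not on the one it was called on.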