我试图建立一个首先执行countWindow的流。 countWindow发出的结果需要传递到另一个timeWindow。问题在于timeWindow没有发出结果。
我想出了一个非常简单的代码来演示问题:
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime)
env
.addSource(new RichSourceFunction[Int] {
override def cancel(): Unit = {}
override def run(ctx: SourceFunction.SourceContext[Int]): Unit = {
var i = 0
while (true) {
println(s"Source emits element ${i}")
ctx.collect(i)
i = i + 1
Thread.sleep(1000)
}
}
})
.keyBy(new KeySelector[Int, String] {
override def getKey(value: Int): String = {
println("getKey 1")
"KEY1"
}
})
.countWindow(2, 1)
.reduce(new ReduceFunction[Int] {
override def reduce(value1: Int, value2: Int): Int = {
println("reduce 1")
value1
}
})
.keyBy(new KeySelector[Int, String] {
override def getKey(value: Int): String = {
println("getKey 2")
"KEY2"
}
})
.timeWindow(Time.seconds(5))
.reduce(new ReduceFunction[Int] {
override def reduce(value1: Int, value2: Int): Int = {
println("reduce 2")
value1
}
})
.print()
使用上面的代码,我希望每5秒输出一个元素。但是,事实并非如此。实际输出显示“打印”功能仅达到一次:
Source emits element 0
getKey 1
getKey 2
getKey 2
1> 0
Source emits element 1
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 2
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 3
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 4
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 5
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 6
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 7
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 8
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 9
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 10
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 11
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
Source emits element 12
getKey 1
getKey 1
reduce 1
getKey 2
getKey 2
答案 0 :(得分:0)
有趣的例子。如果您从IngestionTime更改为ProcessingTime,则示例将正确运行。
环顾四周,在调试器中,我看到的是IngestionTime,由CountWindow生成的StreamRecords不再具有有效的时间戳,因此TimeWindow无法正常运行。
要解决此问题,您需要在CountWindow之后重新建立时间戳和水印,如下所示:
...
.countWindow(2, 1)
.reduce(new ReduceFunction[Int] {
override def reduce(value1: Int, value2: Int): Int = {
println("reduce 1")
value1
}
})
.assignTimestampsAndWatermarks(new IngestionTimeExtractor[Int]())
.keyBy(new KeySelector[Int, String] {
override def getKey(value: Int): String = {
println("getKey 2")
"KEY2"
}
})
...
类似的技术也可以处理事件时间。