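I have a stream of device state changes, for example: case class DeviceState(ts: Long, state: Int). A device sends its state only when the state changes, so the stream might look like this: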
ts | state
----------
0 | ONLINE
3 | OFFLINE
11 | ONLINE
19 | OFFLINE
(In the real code, ts is Unix time in milliseconds; I have simplified it for the example.)
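I would like to split this stream into tumbling windows of 10 ticks and calculate the total duration of each state per window. For instance, if the watermark (punctuation) is emitted at tick 45, the result should look like this: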
window | state | duration
-----------------------------
0 - 10 | ONLINE | 3
0 - 10 | OFFLINE | 7
10 - 20 | OFFLINE | 2
10 - 20 | ONLINE | 8
20 - 30 | OFFLINE | 10
30 - 40 | OFFLINE | 10
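To pin down the arithmetic behind this table, here is a minimal plain-Scala sketch (no Flink involved) that cuts each state interval at the window boundaries and sums the pieces. The object name, the durations function, and the Int encoding of ONLINE/OFFLINE are assumptions made up for illustration; it also assumes the changes arrive sorted by ts.

```scala
object ExpectedDurations {
  case class DeviceState(ts: Long, state: Int)

  // Hypothetical encoding: the question does not say which Int means which state.
  val Offline = 0
  val Online = 1

  /** Returns (windowStart, state, duration) for every complete window before `watermark`.
    * Assumes `changes` is sorted by ts and timestamps are non-negative. */
  def durations(changes: Seq[DeviceState], windowSize: Long, watermark: Long): Seq[(Long, Int, Long)] = {
    // End of the last window that the watermark has fully passed.
    val horizon = (watermark / windowSize) * windowSize
    // Each state lasts until the next change; the last one lasts until the horizon.
    val ends = changes.drop(1).map(_.ts) :+ horizon

    val pieces = for {
      (change, end) <- changes.zip(ends)
      stop = math.min(end, horizon)
      // Every window that this state interval overlaps.
      windowStart <- (change.ts / windowSize * windowSize) until stop by windowSize
      from = math.max(change.ts, windowStart)
      to = math.min(stop, windowStart + windowSize)
      if to > from
    } yield ((windowStart, change.state), to - from)

    pieces
      .groupBy(_._1)
      .map { case ((w, s), ps) => (w, s, ps.map(_._2).sum) }
      .toSeq
      .sorted
  }
}
```

Calling ExpectedDurations.durations with the four changes above, a window size of 10, and a watermark of 45 yields the six rows of the table, with OFFLINE carried into windows 20 - 30 and 30 - 40 (rows come out sorted by window start, then by state code).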
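Is it possible to do this kind of duration calculation in Flink? I think it could be done with a custom reduce function, but I cannot figure out how to emit the last state so that it appears in every window (in the example above, the last state arrives at tick 19, but it should still be present in windows 20 - 30, 30 - 40, and so on).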
Answer 0 (score: 0)
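With Flink's window API, a window doesn't exist until an event is assigned to it, so windows that receive no state change (20 - 30 and 30 - 40 in your example) would never fire on their own. That makes what you are trying to do more difficult.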
One solution might be to use a ProcessFunction with a timer to mix into your stream a third type of event that's only used to trigger the windows that would otherwise be empty.
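As a rough illustration of that idea, the plain-Scala sketch below merges synthetic marker events into the list of state changes, one per window boundary the watermark has reached, so that every window has at least one event. The WindowTick type and the withMarkers helper are hypothetical names; in a real job the markers would be emitted from timers in the ProcessFunction rather than computed up front like this.

```scala
object BoundaryMarkers {
  sealed trait Event { def ts: Long }
  case class DeviceState(ts: Long, state: Int) extends Event
  // Hypothetical marker type: carries no state, it only exists to populate a window.
  case class WindowTick(ts: Long) extends Event

  /** Interleaves one WindowTick per window start in [0, watermark) with the state changes. */
  def withMarkers(changes: Seq[DeviceState], windowSize: Long, watermark: Long): Seq[Event] = {
    val markers: Seq[Event] = (0L until watermark by windowSize).map(WindowTick(_))
    val all: Seq[Event] = changes ++ markers
    // sortBy is stable, so a state change sharing a timestamp with a marker stays first.
    all.sortBy(_.ts)
  }
}
```

With the example stream and a watermark of 45, windows 20 - 30 and 30 - 40 each receive a WindowTick even though no device event falls into them. The window function would still need access to the last known state to attribute the duration.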
Another solution would be to do all the work of computing the analytics with a ProcessFunction (with some state and timers), rather than windows.
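The logic of this second approach can be sketched in plain Scala, without any Flink dependency. The Tracker class below is a hypothetical stand-in: its fields play the role of keyed state, onEvent corresponds to what processElement would do, and advanceTo corresponds to what onTimer would do when a timer registered for a window end fires. It assumes events arrive in timestamp order and timestamps are non-negative.

```scala
object DurationTracker {
  case class DeviceState(ts: Long, state: Int)
  case class Result(windowStart: Long, state: Int, duration: Long)

  // Hypothetical stand-in for the state and timers a ProcessFunction would manage.
  class Tracker(windowSize: Long) {
    private var current: Option[DeviceState] = None // last state, with ts = start of its unaccounted time
    private var windowStart = 0L                    // start of the window being accumulated
    private var totals = Map.empty[Int, Long]       // per-state duration within that window

    // Credit the current state with the time elapsed up to `upTo`.
    private def accrue(upTo: Long): Unit = current.foreach { c =>
      if (upTo > c.ts) {
        totals = totals.updated(c.state, totals.getOrElse(c.state, 0L) + (upTo - c.ts))
        current = Some(c.copy(ts = upTo))
      }
    }

    /** Timer role: close and emit every window that ends at or before `time`. */
    def advanceTo(time: Long): Seq[Result] = {
      val out = Seq.newBuilder[Result]
      while (current.isDefined && windowStart + windowSize <= time) {
        val end = windowStart + windowSize
        accrue(end) // the last state is carried up to the window end
        totals.toSeq.sortBy(_._1).foreach { case (s, d) => out += Result(windowStart, s, d) }
        totals = Map.empty
        windowStart = end
      }
      out.result()
    }

    /** Element role: close any finished windows, then switch to the new state. */
    def onEvent(e: DeviceState): Seq[Result] = {
      if (current.isEmpty) windowStart = e.ts / windowSize * windowSize
      val closed = advanceTo(e.ts)
      accrue(e.ts)
      current = Some(e)
      closed
    }
  }
}
```

Because the last state is kept between calls, advancing time past tick 19 keeps producing results for windows 20 - 30 and 30 - 40 with no new events. In an actual Flink job the fields would live in keyed state and advanceTo would be driven by event-time timers; handling out-of-order events is left out of this sketch.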