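I am trying to produce a deduplicated event stream without specifying any windowing policy beyond the deduplication itself. Using the output first every
clause in my queries seems to have the desired effect, but not when those queries insert directly into a stream.
For the example below, suppose I am trying to detect only the first honk from each car within a 4-hour window.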
(define-event-type! "CarEvent"
{:license_plate java.lang.String})
(define-event-type! "HonkEvent"
{:volume java.lang.Integer}
:supertypes #{"CarEvent"})
(define-variant! "HonkEventDeduplicated" "HonkEvent")
(define-statement! "context-IndividualCarContext"
"create context IndividualCarContext partition by license_plate from CarEvent")
(define-statement! "populate-HonkEventDeduplicated"
"context IndividualCarContext
insert into HonkEventDeduplicated
select * from HonkEvent
group by license_plate
output first every 4 hours")
However, select * from HonkEventDeduplicated
fires on every honk event, even when the same car honks twice in a row.
Answer 0 (score: 2)
Instead of filtering with the output first every
clause, you can accomplish this with the std:firstunique
view:
(define-statement!
"populate-HonkEventDeduplicated"
"insert into HonkEventDeduplicated
select * from HonkEvent.win:time(4 hours).std:firstunique(license_plate)")
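
To make the intended behaviour concrete, here is a rough Python simulation of what the statement above is meant to achieve: a honk passes only if no honk from the same license plate has passed in the preceding 4 hours. This is only a sketch of the semantics as I understand them, not Esper code. The function name deduplicate, the (timestamp, license_plate) tuple layout, and the handling of a honk arriving exactly 4 hours after the previous one are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumed window length, matching the 4 hours used in the EPL statement.
WINDOW = timedelta(hours=4)


def deduplicate(honks, window=WINDOW):
    """Yield only the honks that would reach the deduplicated stream.

    `honks` is an iterable of (timestamp, license_plate) tuples in
    timestamp order (a hypothetical layout chosen for this sketch).
    """
    last_passed = {}  # license_plate -> timestamp of the last honk let through
    for ts, plate in honks:
        previous = last_passed.get(plate)
        # Let the honk through if this plate has not been seen yet, or if
        # the previously passed honk has already left the window.
        # Treating "exactly 4 hours later" as expired is an assumption.
        if previous is None or ts - previous >= window:
            last_passed[plate] = ts
            yield ts, plate


if __name__ == "__main__":
    t0 = datetime(2020, 1, 1, 8, 0)
    events = [
        (t0, "ABC"),
        (t0 + timedelta(hours=1), "ABC"),  # same car, inside the window
        (t0 + timedelta(hours=2), "XYZ"),  # different car
        (t0 + timedelta(hours=5), "ABC"),  # same car, window has elapsed
    ]
    for ts, plate in deduplicate(events):
        print(ts.isoformat(), plate)
```

In this model the second honk from ABC is dropped, while the honk from XYZ and the later honk from ABC pass, which is the result the question is asking for.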