Apache Flink - Using TumblingProcessingTimeWindows with TimeCharacteristic.EventTime

Date: 2018-06-18 17:21:02

Tags: java apache-flink

It looks like TumblingProcessingTimeWindows always uses "ingestion time". Is there a way to force windowing on event time?

My use case is very simple: I receive events that contain an "event timestamp" and want to aggregate them based on event time.

For example, in the code below I expect 2 outputs:

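import java.text.ParseException;
import java.text.SimpleDateFormat;

import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowExample {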

private static final SimpleDateFormat FORMAT = new SimpleDateFormat("HH:mm:ss");

public static void main(String[] args) throws Exception {
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    DataStreamSource<Bean> beans = env.fromElements(
        new Bean(1, 1, "12:00:00"),
        new Bean(1, 2, "12:00:03"),
        new Bean(1, 1, "12:00:04"),  //window of 3 sec trigger here
        new Bean(1, 2, "12:00:05"),
        new Bean(1, 3, "12:00:06"),
        new Bean(1, 3, "12:00:07")   //window of 3 sec trigger here
    );

    beans.assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Bean>() {
        @Override public long extractAscendingTimestamp(Bean element) {
            return element.getTs();
        }
    })
        .keyBy("id")
        .window(TumblingProcessingTimeWindows.of(Time.seconds(3)))
        .max("value")
        .addSink(new SinkFunction<Bean>() {

            @Override public void invoke(Bean value, Context context) {
                System.out.println("Sync on: "+value);
            }
        });
    env.execute("Windowing test");
}

public static class Bean {

    private int id;
    private int value;
    private long ts;

    public Bean() {
    }

    Bean(int id, int value, String time) throws ParseException {
        this.id = id;
        this.value = value;
        this.ts = FORMAT.parse(time).toInstant().toEpochMilli();
    }

    long getTs() {
        return ts;
    }
    // other getters and setters
}

}

1 Answer:

Answer 0 (score: 1)

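Flink allows processing-time windows to be used with event-time streams, because there are legitimate use cases for that. But if you really want event-time windows, you need to ask for them. In this case, you should use TumblingEventTimeWindows.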
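
In the question's code, that would mean passing TumblingEventTimeWindows.of(Time.seconds(3)) to the .window(...) call instead of the processing-time assigner. To get a feel for how the sample events would then be grouped, here is a small dependency-free sketch. It is not Flink code: the class EventTimeWindowSketch and its methods are hypothetical, and it assumes tumbling windows aligned to the epoch with zero offset and non-negative timestamps, so that a window's start is the timestamp minus the timestamp modulo the window size.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this does not use or reproduce the Flink API.
public class EventTimeWindowSketch {

    // Start of the tumbling window containing the timestamp
    // (assumes zero offset and a non-negative timestamp).
    public static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - (timestampMillis % sizeMillis);
    }

    // Groups {timestampMillis, value} pairs by window start and keeps
    // the maximum value seen in each window.
    public static Map<Long, Integer> maxPerWindow(long[][] events, long sizeMillis) {
        Map<Long, Integer> result = new TreeMap<>();
        for (long[] event : events) {
            result.merge(windowStart(event[0], sizeMillis), (int) event[1], Integer::max);
        }
        return result;
    }

    public static void main(String[] args) {
        // 12:00:00 expressed as milliseconds since midnight, so the
        // sketch does not depend on the default time zone.
        long noon = 12 * 3600 * 1000L;
        long[][] events = {
            {noon, 1}, {noon + 3000, 2}, {noon + 4000, 1},
            {noon + 5000, 2}, {noon + 6000, 3}, {noon + 7000, 3}
        };
        maxPerWindow(events, 3000).forEach((start, max) ->
            System.out.println("window +" + (start - noon) / 1000 + "s max=" + max));
    }
}
```

Under these assumptions the six sample events fall into three windows (starting at 12:00:00, 12:00:03 and 12:00:06) rather than two, because the windows are aligned to the clock and not to the first event.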