Window does not last for the full window length

Time: 2017-01-17 14:13:18

Tags: java apache-flink flink-streaming

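I have been experimenting with the Flink window examples. To check the window timing, I added a timestamp to the events in the stream. I found that the window lasts for less than the configured window length. Also, if I use a sliding window and modify an event, the modified event shows up in the next window.

When I specify a window length, does Flink wait for the full length to elapse before the window completes? And do the overlapping events shared by sliding windows refer to the same instance? (I know that streams are immutable structures.)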

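import java.util.Iterator;
import java.util.Properties;
import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.util.Collector;

// Alarm, AlarmSchema and PropertyLoader are the asker's own classes (not shown).
public class WindowDemo {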

public static void main(String[] args) {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);

    Properties prop=PropertyLoader.loadPropertiesForConsumer("WC",0);
    FlinkKafkaConsumer09<Alarm> consumer= new FlinkKafkaConsumer09<Alarm>("topic_smartEmse", new AlarmSchema(), prop);
    DataStream<Alarm> inputStream= env.addSource(consumer);

    inputStream= inputStream.flatMap(new FlatMapFunction<Alarm, Alarm>() {

        @Override
        public void flatMap(Alarm value, Collector<Alarm> out)
                throws Exception {
            System.out.println("flatMap Started at "+System.currentTimeMillis());
            value.setUserDefined10("IN TIME "+System.currentTimeMillis());
            out.collect(value);
            System.out.println("flatMap Ended at "+System.currentTimeMillis());
        }
    });

    KeyedStream<Alarm, String> keyedStream= inputStream.keyBy(new KeySelector<Alarm, String>(){

        @Override
        public String getKey(Alarm value) throws Exception {
            System.out.println("getKey Started at "+System.currentTimeMillis());
            return "XX";
        }});

    DataStream<Alarm> dataStream= keyedStream.timeWindow(Time.of(90, TimeUnit.SECONDS)).apply(new WindowFunction<Alarm, Alarm, String, TimeWindow>() {

        @Override
        public void apply(String key, TimeWindow window,
                Iterable<Alarm> input, Collector<Alarm> out)
                throws Exception {
            System.out.println("timeWindow Started at "+System.currentTimeMillis());
            int count=0;
            System.out.println("Key : "+key);
            System.out.println("Values : "+input);
            Iterator<Alarm> itr= input.iterator();
            while (itr.hasNext()){
                Alarm alarm= itr.next();
                alarm.setUserDefined1(""+count++);

                out.collect(alarm);
            }
            System.out.println("timeWindow ended at "+System.currentTimeMillis());

        }
    });

    dataStream= dataStream.flatMap(new FlatMapFunction<Alarm, Alarm>() {

        @Override
        public void flatMap(Alarm value, Collector<Alarm> out)
                throws Exception {
            value.setUserDefined11("OUT TIME "+System.currentTimeMillis());
            out.collect(value);
        }
    });
    dataStream.printToErr();
    try {
        env.execute();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
}

1 Answer:

Answer 0 (score: 1)

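If I understand you correctly, your concern is that the window is evaluated (apply) before the given time span has elapsed. I noticed the same effect for the first evaluation of the window. It seems that the time periods are aligned in some way. I started processing at 19:09:13 and the window was first evaluated at 19:10:30, that is, after 77 seconds. After that first invocation, the window closes not exactly, but very close to, every 90 seconds.

For TumblingProcessingTimeWindows (which you are using), this appears to be the relevant code: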

public class TumblingProcessingTimeWindows extends WindowAssigner<Object, TimeWindow> {

    private long size;

    private TumblingProcessingTimeWindows(long size) {
        this.size = size;
    }

    @Override
    public Collection<TimeWindow> assignWindows(Object element, long timestamp, WindowAssignerContext context) {

        final long now = context.getCurrentProcessingTime();
        // Alignment: round the current processing time down to a multiple of the window size
        long start = now - (now % size);
        return Collections.singletonList(new TimeWindow(start, start + size));
    }

    // ... remaining members omitted
}
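
To make the alignment concrete, here is a minimal sketch in plain Java (no Flink dependency) that applies the same `now - (now % size)` arithmetic to the times quoted above. The class and method names are made up for illustration. As a simplification, the timestamp is given in milliseconds since midnight rather than since the epoch. That gives the same remainder here only because a day is a whole number of 90-second intervals.

```java
class WindowAlignmentSketch {

    // Same arithmetic as the assignWindows excerpt: round down to a multiple of the size.
    public static long windowStart(long nowMillis, long sizeMillis) {
        return nowMillis - (nowMillis % sizeMillis);
    }

    // Time remaining until the window that contains nowMillis ends.
    public static long millisUntilWindowEnd(long nowMillis, long sizeMillis) {
        return windowStart(nowMillis, sizeMillis) + sizeMillis - nowMillis;
    }

    public static void main(String[] args) {
        long size = 90_000L; // 90-second tumbling window

        // 19:09:13 expressed as milliseconds since midnight (illustrative assumption).
        long now = ((19 * 60 + 9) * 60 + 13) * 1000L;

        long start = windowStart(now, size);
        long remaining = millisUntilWindowEnd(now, size);

        System.out.println("window start (ms since midnight): " + start);
        System.out.println("window end   (ms since midnight): " + (start + size));
        System.out.println("seconds until first evaluation:   " + remaining / 1000);
    }
}
```

With these inputs the window starts at 19:09:00 and ends at 19:10:30, so the first evaluation comes 77 seconds after processing began. Every later window then spans the full 90 seconds.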

Does that make sense to you?
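
On the second part of the question (modified events showing up in the next sliding window), the following plain-Java sketch shows the effect in isolation. It assumes that overlapping windows hold references to the same event object, which would match the behaviour described in the question but is not confirmed here for every Flink configuration. The `Alarm` class below is a hypothetical stand-in for the asker's type, and the two lists merely imitate two overlapping window buffers.

```java
import java.util.ArrayList;
import java.util.List;

class SharedEventSketch {

    // Hypothetical stand-in for the Alarm class used in the question.
    public static class Alarm {
        public String userDefined1 = "";

        public Alarm copy() {
            Alarm a = new Alarm();
            a.userDefined1 = this.userDefined1;
            return a;
        }
    }

    // Mutates the event through the first buffer and reports what the second buffer sees.
    public static String editInPlace() {
        Alarm event = new Alarm();
        List<Alarm> firstWindow = new ArrayList<>();
        List<Alarm> secondWindow = new ArrayList<>();
        firstWindow.add(event);   // both buffers hold the same reference
        secondWindow.add(event);

        firstWindow.get(0).userDefined1 = "0";
        return secondWindow.get(0).userDefined1;
    }

    // Mutates a copy instead, so the event held by the second buffer stays untouched.
    public static String editCopy() {
        Alarm event = new Alarm();
        List<Alarm> firstWindow = new ArrayList<>();
        List<Alarm> secondWindow = new ArrayList<>();
        firstWindow.add(event);
        secondWindow.add(event);

        Alarm copy = firstWindow.get(0).copy();
        copy.userDefined1 = "0";
        return secondWindow.get(0).userDefined1;
    }

    public static void main(String[] args) {
        System.out.println("after in-place edit: '" + editInPlace() + "'");
        System.out.println("after editing a copy: '" + editCopy() + "'");
    }
}
```

If the shared-reference assumption holds for your setup, emitting a copy from the window function instead of mutating the input element would avoid the change leaking into the next window.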