Beam / Dataflow: maxTimestamp for a topology with no window defined

Time: 2019-02-07 03:00:21

Tags: google-cloud-dataflow apache-beam

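What is the expected behavior of maxTimestamp for the global window? I have a topology with an unbounded source that does not specify a windowing strategy. When I read maxTimestamp from the BoundedWindow, I get a timestamp in the future. Is this the expected behavior?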

1 Answer:

Answer 0: (score: 3)

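Yes, this is the expected behavior. The end of the global window must be slightly smaller than the maximum possible timestamp value in Beam, which in practice is often referred to as +infinity.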

From the source code of GlobalWindow.java:


 // Triggers use maxTimestamp to set timers' timestamp. Timers fires when
 // the watermark passes their timestamps. So, the maxTimestamp needs to be
 // smaller than the TIMESTAMP_MAX_VALUE.
 // One standard day is subtracted from TIMESTAMP_MAX_VALUE to make sure
 // the maxTimestamp is smaller than TIMESTAMP_MAX_VALUE even after rounding up
 // to seconds or minutes.
 private static final Instant END_OF_GLOBAL_WINDOW = extractMaxTimestampFromProto();
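To make the arithmetic in that comment concrete, here is a minimal stand-alone sketch that does not use the Beam SDK. It assumes that TIMESTAMP_MAX_VALUE corresponds to Long.MAX_VALUE microseconds truncated to milliseconds. That is an assumption on my part, since the real code reads the value from a proto via extractMaxTimestampFromProto(). The class name GlobalWindowBounds and the helper roundUp are hypothetical and exist only for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of Apache Beam.
final class GlobalWindowBounds {
    // Assumption: Beam's maximum timestamp is Long.MAX_VALUE microseconds,
    // expressed here in milliseconds since the epoch.
    static final long TIMESTAMP_MAX_VALUE_MILLIS =
            TimeUnit.MICROSECONDS.toMillis(Long.MAX_VALUE);

    // Per the quoted comment: one standard day is subtracted from the maximum.
    static final long END_OF_GLOBAL_WINDOW_MILLIS =
            TIMESTAMP_MAX_VALUE_MILLIS - Duration.ofDays(1).toMillis();

    // Rounds a non-negative millisecond value up to the next multiple of unitMillis.
    static long roundUp(long millis, long unitMillis) {
        return (millis + unitMillis - 1) / unitMillis * unitMillis;
    }

    public static void main(String[] args) {
        Instant end = Instant.ofEpochMilli(END_OF_GLOBAL_WINDOW_MILLIS);
        System.out.println("end of global window (ms): " + END_OF_GLOBAL_WINDOW_MILLIS);
        System.out.println("in the future: " + end.isAfter(Instant.now()));

        // Even after rounding up to a whole minute, the value stays below the maximum,
        // so a timer set at that timestamp can still fire.
        long roundedToMinute = roundUp(END_OF_GLOBAL_WINDOW_MILLIS, 60_000L);
        System.out.println("still below max after rounding: "
                + (roundedToMinute < TIMESTAMP_MAX_VALUE_MILLIS));
    }
}
```

Under that assumption the end of the global window lies hundreds of thousands of years ahead, which is why maxTimestamp on an unwindowed pipeline looks like a timestamp in the future.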