Flink's CoProcessFunction does not trigger onTimer

Time: 2018-03-05 17:48:24

Tags: apache-flink flink-streaming

I am trying to aggregate two streams like this:

val joinedStream = finishResultStream.keyBy(_.searchId)
  .connect(startResultStream.keyBy(_.searchId))
  .process(new SomeCoProcessFunction)

and then process them in the SomeCoProcessFunction class:

class SomeCoProcessFunction extends CoProcessFunction[SearchFinished, SearchCreated, SearchAggregated] {

   override def processElement1(finished: SearchFinished, ctx: CoProcessFunction[SearchFinished, SearchCreated, SearchAggregated]#Context, out: Collector[SearchAggregated]): Unit = { 

       // aggregating some "finished" data ...

   }

   override def processElement2(created: SearchCreated, ctx: CoProcessFunction[SearchFinished, SearchCreated, SearchAggregated]#Context, out: Collector[SearchAggregated]): Unit = {

       val timerService = ctx.timerService()
       timerService.registerEventTimeTimer(System.currentTimeMillis + 5000)

       // aggregating some "created" data ...
   }

   override def onTimer(timestamp: Long, ctx: CoProcessFunction[SearchFinished, SearchCreated, SearchAggregated]#OnTimerContext, out: Collector[SearchAggregated]): Unit = {

       val watermark: Long = ctx.timerService().currentWatermark()
       println(s"watermark!!!! $watermark")

       // clean up the state

   }
}

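What I want is to clean up the state after a certain time (5000 ms), which is what onTimer is supposed to be used for. But since it never fires, I am asking myself what I am doing wrong here.

Thanks in advance for any hints.

Update

The solution was to register a processing-time timer on the timerService (thanks to fabian-hueske and Beckham):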

timerService.registerProcessingTimeTimer(timerService.currentProcessingTime() + 5000)

I still have not really figured out what timerService.registerEventTimeTimer does: the watermark ctx.timerService().currentWatermark() always shows -9223372036854775808, no matter how long ago the event-time timer was registered.

2 answers:

Answer 0 (score: 2)

I see that you are using System.currentTimeMillis, which might differ from the TimeCharacteristic (event time, processing time, ingestion time) that your Flink job is using.

Try getting the timestamp of the event with ctx.timestamp(), then add the 5000 ms on top of it.
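A minimal sketch of that computation in plain Scala, without the Flink dependency. The object and helper names (`TimerTimestamps`, `cleanupTimerTimestamp`) are hypothetical and not part of Flink. The `java.lang.Long` parameter stands in for the value returned by `ctx.timestamp()`, which can be null when the record carries no timestamp, so the sketch returns an `Option` rather than assuming a value is present.

```scala
// Hypothetical helper, not part of Flink: derives the event-time timer
// timestamp from the record's own timestamp instead of the wall clock.
object TimerTimestamps {
  // Delay taken from the question (5000 ms).
  val CleanupDelayMs: Long = 5000L

  // eventTimestamp models the value returned by ctx.timestamp();
  // it may be null when no timestamp has been assigned to the record.
  def cleanupTimerTimestamp(eventTimestamp: java.lang.Long): Option[Long] =
    Option(eventTimestamp).map(ts => ts.longValue() + CleanupDelayMs)
}
```

Inside processElement2 this could be used roughly as `TimerTimestamps.cleanupTimerTimestamp(ctx.timestamp()).foreach(ts => ctx.timerService().registerEventTimeTimer(ts))`, so that no timer is registered when the record has no timestamp.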

Answer 1 (score: 1)

The problem is that you are registering an event-time timer (timerService.registerEventTimeTimer) with a processing-time timestamp (System.currentTimeMillis + 5000).

System.currentTimeMillis returns the current machine time, but event time is not based on machine time; it is derived from watermarks.

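You should either register a processing-time timer, or register an event-time timer with an event-time timestamp. You can get the timestamp of the current watermark or of the current record from the Context object, which is passed as a parameter to processElement1() and processElement2().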
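To make the firing rule concrete, here is a toy model in plain Scala. It is not Flink's timer service; `ToyEventTimeTimers` and `ToyDemo` are hypothetical names, and the model only captures the assumption that an event-time timer fires once the watermark reaches its timestamp. With the watermark still at Long.MinValue (the -9223372036854775808 seen in the question), a timer registered at machine time plus 5000 has nothing to fire it.

```scala
import scala.collection.mutable

// Toy model only: this is NOT Flink's timer service. It models the rule
// "an event-time timer fires once the watermark reaches its timestamp".
class ToyEventTimeTimers {
  // Value of the watermark before any watermark has arrived.
  private var watermark: Long = Long.MinValue
  // Registered timer timestamps, kept sorted.
  private val pending = mutable.TreeSet.empty[Long]

  def currentWatermark: Long = watermark

  def registerEventTimeTimer(timestamp: Long): Unit = {
    pending += timestamp
  }

  // Advance the watermark (it never moves backwards) and return
  // the timers that fire, in timestamp order.
  def advanceWatermark(newWatermark: Long): List[Long] = {
    watermark = math.max(watermark, newWatermark)
    val fired = pending.iterator.takeWhile(_ <= watermark).toList
    fired.foreach(ts => pending -= ts)
    fired
  }
}

object ToyDemo {
  // Registers a timer at (record timestamp + 5000) and reports which
  // timers fire just before and exactly at that point in event time.
  def run(): (List[Long], List[Long]) = {
    val timers = new ToyEventTimeTimers
    val eventTimestamp = 1000L // hypothetical record timestamp
    timers.registerEventTimeTimer(eventTimestamp + 5000L)
    val firedEarly = timers.advanceWatermark(5999L)
    val firedLater = timers.advanceWatermark(6000L)
    (firedEarly, firedLater)
  }
}
```

In this model the timer stays pending until the watermark moves, regardless of how much wall-clock time passes. That is consistent with the behaviour in the question, where no watermarks appear to be generated, and it is why a processing-time timer is the simpler choice there.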