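Is it possible to configure a Flink application at runtime? For example, I have a streaming application that reads some input, applies a few transformations, and then filters out all elements below a certain threshold. I would like this threshold to be configurable at runtime, meaning I can change it without restarting the Flink job. Example code: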
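DataStream<MyModel> myModelDataStream = // get input ...
    // do some stuff ...
    .filter(new RichFilterFunction<MyModel>() {
        @Override
        public boolean filter(MyModel value) throws Exception {
            // pseudocode: someGlobalState is the shared state I would like to have
            return value.someValue() > someGlobalState.getThreshold();
        }
    })
    // write to some sink ...
DataStream<MyConfig> myConfigDataStream = // get input ...
    // ...
    .process(new ProcessFunction<MyConfig, MyConfig>() {
        @Override
        public void processElement(MyConfig config, Context ctx, Collector<MyConfig> out) {
            // pseudocode: update the shared threshold from the config stream
            someGlobalState.setThreshold(config.getThreshold());
        }
    })
    // ...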
Is there a way to achieve this? Something like a global state that can be changed through a configuration stream.
Answer 0 (score: 4)
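Yes, you can do this with a RichCoFlatMapFunction: broadcast the control stream, connect it to the data stream, and update the threshold whenever a control element arrives. Roughly like this: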
DataStream<MyModel> myModelDataStream = // get input ...
DataStream<Long> controlStream = // get input ...
DataStream<MyModel> result = controlStream
    .broadcast()
    .connect(myModelDataStream)
    .flatMap(new MyCoFlatMap());
// Every parallel instance receives each broadcast control element, so a plain
// field is enough here. Keyed state (ValueState) cannot be used because the
// connected streams are not keyed. Note that this field is not checkpointed.
public class MyCoFlatMap extends RichCoFlatMapFunction<Long, MyModel, MyModel> {
    private Long threshold; // null until the first control element arrives

    @Override
    public void flatMap1(Long newThreshold, Collector<MyModel> out) {
        // control stream: remember the latest threshold
        threshold = newThreshold;
    }

    @Override
    public void flatMap2(MyModel model, Collector<MyModel> out) {
        // data stream: pass everything through until a threshold has been set
        if (threshold == null || model.getData() > threshold) {
            out.collect(model);
        }
    }
}
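If it helps to check the filtering rule outside a running job, the decision can be pulled into a plain class. The following is only a sketch and not part of the answer above: `ThresholdGate`, `onControl`, and `accepts` are hypothetical names, and the `long` argument stands in for whatever `model.getData()` returns. Under that assumption, `flatMap1` would delegate to `onControl` and `flatMap2` would call `accepts` before collecting.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not a Flink class): holds the current threshold and
 * decides whether a value passes. It mirrors the rule used in the co-flat-map:
 * pass everything until a threshold is known, then pass only larger values.
 */
final class ThresholdGate {
    private Long threshold; // null until the first control value arrives

    /** Called for each element of the control stream. */
    void onControl(long newThreshold) {
        threshold = newThreshold;
    }

    /** Called for each element of the data stream. */
    boolean accepts(long value) {
        return threshold == null || value > threshold;
    }

    public static void main(String[] args) {
        ThresholdGate gate = new ThresholdGate();
        List<Long> passed = new ArrayList<>();

        // No threshold yet: the value passes.
        if (gate.accepts(5L)) passed.add(5L);

        // A control element sets the threshold to 10.
        gate.onControl(10L);
        for (long v : new long[] {12L, 8L, 20L}) {
            if (gate.accepts(v)) passed.add(v);
        }

        System.out.println(passed); // prints [5, 12, 20]
    }
}
```

Keeping the rule in one small class means the behaviour before the first control element arrives is spelled out in a single place and can be changed without touching the stream wiring.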