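I want FlinkCEP to perform only "lazy" pattern matching. How can I do that? For example, given the input stream ACABCABCB, I want to match on A followedBy C and get only 3 matches instead of 6.
I created the following example to illustrate my problem.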
import org.apache.flink.cep.scala.{CEP, PatternStream}
import org.apache.flink.cep.scala.pattern.Pattern
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.util.Collector

val env = StreamExecutionEnvironment.createLocalEnvironment(1)
env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime)

case class MyEvent(id: Int, kind: String, value: String)
case class MyAggregatedEvent(id: Int, concatenatedValue: String)

val eventStream = env.fromElements(
  MyEvent(1, "A", "1"), MyEvent(1, "C", "1"),
  MyEvent(1, "A", "2"), MyEvent(1, "B", "1"), MyEvent(1, "C", "2"),
  MyEvent(1, "A", "3"), MyEvent(1, "D", "2"), MyEvent(1, "C", "3"),
  MyEvent(1, "B", "3")
)

// An "A" immediately followed by a "C" (strict contiguity), within 5 seconds
val pattern: Pattern[MyEvent, _] = Pattern
  .begin[MyEvent]("pA").where(e => e.kind == "A")
  .next("pC").where(e => e.kind == "C")
  .within(Time.seconds(5))

val patternNextStream: PatternStream[MyEvent] = CEP.pattern(eventStream.keyBy(_.id), pattern)

// Emit one aggregated event per match, in the form "<A value>=><C value>"
val outNextStream: DataStream[MyAggregatedEvent] = patternNextStream.flatSelect {
  (pattern: scala.collection.mutable.Map[String, MyEvent], collector: Collector[MyAggregatedEvent]) =>
    val partA = pattern("pA")
    val partC = pattern("pC")
    collector.collect(MyAggregatedEvent(partA.id, partA.value + "=>" + partC.value))
}

outNextStream.print()
env.execute("Experiment")
This gives me the following output:
MyAggregatedEvent(1,1=>1)
When I change the pattern to:
val pattern: Pattern[MyEvent, _] = Pattern
  .begin[MyEvent]("pA").where(e => e.kind == "A")
  .followedBy("pC").where(e => e.kind == "C")
  .within(Time.seconds(5))
then the following is printed:
MyAggregatedEvent(1,1=>1)
MyAggregatedEvent(1,1=>2)
MyAggregatedEvent(1,2=>2)
MyAggregatedEvent(1,1=>3)
MyAggregatedEvent(1,2=>3)
MyAggregatedEvent(1,3=>3)
How can I create a pattern that matches each event only once, so that my output is:
MyAggregatedEvent(1,1=>1)
MyAggregatedEvent(1,2=>2)
MyAggregatedEvent(1,3=>3)
Answer 0 (score: 1)
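At the moment, Flink's CEP library does not support this; the matching semantics are not yet controllable. I think it would be good to add a MATCH_ALL and a MATCH_FIRST matching mode. MATCH_FIRST would discard all intermediate state as soon as a fully matching sequence has been seen. That should cover your use case.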
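To make the difference between the two proposed modes concrete, here is a rough plain-Scala simulation over the event sequence from the question. It does not use Flink at all: MatchSemanticsSketch, matchAll and matchFirst are hypothetical names chosen for illustration, the 5-second window is ignored, and keeping the earliest pending "A" when several arrive before a "C" is an assumption of this sketch rather than anything the CEP library defines.

```scala
// Plain-Scala simulation; no Flink dependency. All names here are illustrative.
case class MyEvent(id: Int, kind: String, value: String)

object MatchSemanticsSketch {
  // MATCH_ALL-style behaviour: every "C" is paired with every earlier "A".
  def matchAll(events: Seq[MyEvent]): Seq[String] =
    for {
      (c, i) <- events.zipWithIndex if c.kind == "C"
      a <- events.take(i) if a.kind == "A"
    } yield s"${a.value}=>${c.value}"

  // MATCH_FIRST-style behaviour: remember one pending "A", emit a match on the
  // next "C", then drop the pending state so no event is reused.
  def matchFirst(events: Seq[MyEvent]): Seq[String] = {
    val start = (Option.empty[MyEvent], Vector.empty[String])
    val (_, matches) = events.foldLeft(start) {
      case ((None, out), e) if e.kind == "A" => (Some(e), out)
      case ((Some(a), out), e) if e.kind == "C" => (None, out :+ s"${a.value}=>${e.value}")
      case (state, _) => state // unrelated event: keep the current state
    }
    matches
  }

  // Same events as in the question
  val events: Seq[MyEvent] = Seq(
    MyEvent(1, "A", "1"), MyEvent(1, "C", "1"),
    MyEvent(1, "A", "2"), MyEvent(1, "B", "1"), MyEvent(1, "C", "2"),
    MyEvent(1, "A", "3"), MyEvent(1, "D", "2"), MyEvent(1, "C", "3"),
    MyEvent(1, "B", "3")
  )

  def main(args: Array[String]): Unit = {
    println("matchAll:   " + matchAll(events).mkString(", "))
    println("matchFirst: " + matchFirst(events).mkString(", "))
  }
}
```

On the question's input, matchAll yields the six pairs printed by the followedBy pattern, while matchFirst yields only 1=>1, 2=>2 and 3=>3, which is the output the question asks for.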