Flink CEP greedy matching

Time: 2017-12-29 20:25:25

Tags: apache-flink flink-cep

I am struggling with the greedy operator in Flink CEP.

Given the following Java code:

    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    List<String> strings = Arrays.asList("1,3,5,5,5,5,6,".split(","));

    DataStream<String> input = env.fromCollection(strings);

    Pattern<String, ?> pattern = Pattern.<String>
    begin("start").where(new SimpleCondition<String>() {
        @Override
        public boolean filter(String value) throws Exception {
            return value.equals("5");
        }
    }).oneOrMore().greedy()
    .followedBy("end").where(new SimpleCondition<String>() {

        @Override
        public boolean filter(String value) throws Exception {
            return value.equals("6");
        }
    });

    PatternStream<String> patternStream = CEP.pattern(input, pattern);

    DataStream<String> result = patternStream.select(new PatternSelectFunction<String, String>() {
        @Override
        public String select(Map<String, List<String>> pattern) throws Exception {
            System.err.println("=======");
            pattern.values().forEach(match -> match.forEach(event -> System.err.println(event)));
            System.err.println("=======");
            return "-";
        }
    });

    result.print();
    env.execute("Flink Streaming Java API Skeleton");

I expected to see only "5 5 5 5 6" emitted.

However, it matches "5 5 5 5 6", "5 5 5 6", "5 5 6", and "5 6".
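The four results can be reproduced outside Flink with a small plain-Java sketch. It assumes that every "5" event is allowed to open its own partial match and that each partial match then greedily takes all following "5" events before the closing "6". The class and method names are hypothetical, and this is only an illustration of the observed output, not Flink's NFA implementation (for instance, it ignores the relaxed contiguity of followedBy, which makes no difference for this input).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; this does not use or reproduce Flink's NFA.
class GreedyStartSketch {

    // Assumption: each "5" may begin a separate match, and each match
    // consumes the whole remaining run of "5" events before the "6".
    static List<String> matchesPerStart(List<String> events) {
        List<String> matches = new ArrayList<>();
        for (int start = 0; start < events.size(); start++) {
            if (!events.get(start).equals("5")) {
                continue; // only a "5" can begin the pattern
            }
            int i = start;
            List<String> current = new ArrayList<>();
            while (i < events.size() && events.get(i).equals("5")) {
                current.add(events.get(i)); // greedy: take the whole run
                i++;
            }
            if (i < events.size() && events.get(i).equals("6")) {
                current.add("6");
                matches.add(String.join(" ", current));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("1,3,5,5,5,5,6,".split(","));
        // Prints one line per start position that leads to a match.
        matchesPerStart(events).forEach(System.out::println);
    }
}
```

Under these assumptions the greedy quantifier is applied within each match, while the number of matches is driven by how many events qualify as a start.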

If I do this instead:

    begin("start").where(new SimpleCondition<String>() {
        @Override
        public boolean filter(String value) throws Exception {
            return value.equals("3");
        }
    }).followedBy("middle").where(new SimpleCondition<String>() {
        @Override
        public boolean filter(String value) throws Exception {
            return value.equals("5");
        }
    }).oneOrMore().greedy()
    .followedBy("end").where(new SimpleCondition<String>() {

        @Override
        public boolean filter(String value) throws Exception {
            return value.equals("6");
        }
    });

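then (because it provides a distinct starting match) the greedy operator works as expected and emits "3 5 5 5 5 6".

Is it possible to make the greedy matcher capture everything in a single match without a distinct start pattern?

Or am I missing something?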

Stefan

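2 answers:

Answer 0: (score: 1)

Thanks to Chesnay Schepler for the comment above:

There is a known bug in greedy matching that explains this behavior: issues.apache.org/jira/browse/FLINK-8914

I will keep this in mind for the time being.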

Answer 1: (score: 0)

To control how many matches an event is assigned to, you need to specify a skip strategy, called AfterMatchSkipStrategy.

Use Pattern.begin("start", AfterMatchSkipStrategy.skipPastLastEvent()):

    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    List<String> strings = Arrays.asList("1,3,5,5,5,5,6,".split(","));

    DataStream<String> input = env.fromCollection(strings);

    Pattern<String, ?> pattern = Pattern.<String>
    begin("start", AfterMatchSkipStrategy.skipPastLastEvent()).where(new SimpleCondition<String>() {
        @Override
        public boolean filter(String value) throws Exception {
            return value.equals("5");
        }
    }).oneOrMore().greedy()
    .followedBy("end").where(new SimpleCondition<String>() {
        @Override
        public boolean filter(String value) throws Exception {
            return value.equals("6");
        }
    });

    PatternStream<String> patternStream = CEP.pattern(input, pattern);

    DataStream<String> result = patternStream.select(new PatternSelectFunction<String, String>() {
        @Override
        public String select(Map<String, List<String>> pattern) throws Exception {
            System.err.println("=======");
            pattern.values().forEach(match -> match.forEach(event -> System.err.println(event)));
            System.err.println("=======");
            return "-";
        }
    });

    result.print();
    env.execute("Flink Streaming Java API Skeleton");
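
To make the effect of the skip strategy easier to see, here is a rough plain-Java sketch that contrasts "try every event as a start" with "resume after the last event of a match". The class SkipStrategySketch and the method findMatches are hypothetical names introduced for illustration. The sketch does not call Flink and only models the idea described in this answer, so the real library may behave differently in more complex cases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; it does not call Flink.
class SkipStrategySketch {

    // Hypothetical helper: looks for a run of "5" events closed by a "6".
    // skipPastLastEvent = true  -> after a match, resume after its last event
    // skipPastLastEvent = false -> try every event as a possible start
    static List<String> findMatches(List<String> events, boolean skipPastLastEvent) {
        List<String> matches = new ArrayList<>();
        int start = 0;
        while (start < events.size()) {
            int i = start;
            while (i < events.size() && events.get(i).equals("5")) {
                i++; // greedy: extend over the whole run of "5"
            }
            boolean matched = i > start && i < events.size() && events.get(i).equals("6");
            if (matched) {
                matches.add(String.join(" ", events.subList(start, i + 1)));
            }
            // Either jump past the matched events or move on by one event.
            start = (matched && skipPastLastEvent) ? i + 1 : start + 1;
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("1,3,5,5,5,5,6,".split(","));
        System.out.println("no skip: " + findMatches(events, false));
        System.out.println("skip past last event: " + findMatches(events, true));
    }
}
```

With the input from the question, the first call yields four overlapping matches and the second yields only "5 5 5 5 6", which is the result the question asks for.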