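How do I make the capture group (.+?) work as expected in Cucumber Java?

Time: 2018-06-21 03:55:45

Tags: regex cucumber cucumber-jvm gherkin

While writing Cucumber Java steps for a feature file, I observe the following:

Feature file: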

Then I get result one <result1> and result two <result2> from microservice

Java steps (step definitions):

@Then("^I get result one(.+?) and result two(.+?)$")  //step function 1
public void i_get_result_one_and_result_two(String result1, String result2)
        throws Throwable {}

@Then("^I get result one(.+?) and result two(.+?) from microservice$")  //step function 2
public void i_get_result_one_and_result_two_from_ms(String result1, String result2)
        throws Throwable {}

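Somehow, the step in the feature file always maps to step function 1 and never to step function 2.

As far as I understand, the capture group (.+?) is defined to match any one or more characters (which I assumed would match only the variable in the feature file). I don't understand why the step does not map to step function 2.

Can someone explain why this happens and how to fix it?

Thanks a lot.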

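2 Answers:

Answer 0: (score: 1)

Explanation of the problem

When you use a regular expression to match your steps, note that .+ matches any character as many times as possible (at least once). The reluctant form .+? prefers the shortest match, but it still expands until the rest of the pattern can match.

This means that your step: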

^I get result one (.+?) and result two (.+?)$

matches everything from the last capture group onward, including any trailing text such as "from microservice".
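To make this concrete, here is a minimal sketch using plain java.util.regex rather than Cucumber itself. The class name CaptureToEndDemo and the helper secondGroup are made up for illustration; the pattern is the one shown above, and the step text is the feature-file line without its Then keyword.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptureToEndDemo {
    // Hypothetical helper: returns the text captured by group 2,
    // or null when the pattern does not match the whole step text.
    static String secondGroup(String regex, String stepText) {
        Matcher matcher = Pattern.compile(regex).matcher(stepText);
        return matcher.matches() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        String regex = "^I get result one (.+?) and result two (.+?)$";
        String step = "I get result one <result1> and result two <result2> from microservice";
        // The last group has to reach the end of the line, so it also takes the suffix.
        System.out.println(secondGroup(regex, step)); // prints "<result2> from microservice"
    }
}
```

The reluctant quantifier does not help here: the only thing that follows the last group is $, so the group has to grow until it reaches the end of the line.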

The answer

If you want the capture groups to match only what is inside quotes, use:

^I get result one '([^']+?)' and result two '([^']+?)'$

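Here, [^']+ means: match any character that is not a single quote/apostrophe, as many times as possible (at least once).

(You can also use double quotes instead of single quotes.)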
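As a rough check of this idea outside Cucumber, the sketch below counts how many step patterns match a given step text. The class name QuotedStepDemo, the helper matchCount, and the quoted "from microservice" variant of the pattern are assumptions made for illustration. It only models the regex matching, not Cucumber's actual step lookup.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class QuotedStepDemo {
    // The two patterns from the question.
    static final List<String> UNQUOTED = Arrays.asList(
            "^I get result one(.+?) and result two(.+?)$",
            "^I get result one(.+?) and result two(.+?) from microservice$");

    // Assumed quoted variants, following the suggestion above.
    static final List<String> QUOTED = Arrays.asList(
            "^I get result one '([^']+?)' and result two '([^']+?)'$",
            "^I get result one '([^']+?)' and result two '([^']+?)' from microservice$");

    // Counts how many of the given patterns match the whole step text.
    static long matchCount(List<String> patterns, String stepText) {
        return patterns.stream()
                .filter(regex -> Pattern.compile(regex).matcher(stepText).matches())
                .count();
    }

    public static void main(String[] args) {
        // Both unquoted patterns match the same step text.
        System.out.println(matchCount(UNQUOTED,
                "I get result one <result1> and result two <result2> from microservice")); // prints 2
        // With quotes, each step text matches exactly one pattern.
        System.out.println(matchCount(QUOTED,
                "I get result one '<result1>' and result two '<result2>' from microservice")); // prints 1
        System.out.println(matchCount(QUOTED,
                "I get result one '<result1>' and result two '<result2>'")); // prints 1
    }
}
```

Because [^'] cannot cross the closing quote, the last group can no longer absorb the trailing "from microservice" text.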

Answer 1: (score: 0)

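You are right: the pattern (.+?) matches a group of any characters occurring one or more times (using a reluctant quantifier). The end of the group is the end of the string (the $ in (.+?)$).

The pattern ^I get result one(.+?) and result two(.+?)$ matches both strings. I have put parentheses around the matched parts:

^I get result one( <result1>) and result two( <result2> from microservice)$
^I get result one( <result1>) and result two( <result2>)$

Either rename your steps so that the pattern does not match both sentences, or surround the variable fields in your step with, for example, single quotes (it must be a character that does not occur in the matched value) and amend the patterns accordingly.

It could look like this:

// steps
Then I get result one '<result1>' and result two '<result2>' from microservice
Then I get result one '<result1>' and result two '<result2>'

// glue code
@Then("^I get result one '(.+?)' and result two '(.+?)' from microservice$")
@Then("^I get result one '(.+?)' and result two '(.+?)'$")

Edit: a more detailed explanation of how the matching works.

First, some explanation of the patterns (.+?) and ([^']+?). The ? makes the quantifier reluctant: the search consumes characters from left to right and stops as soon as the rest of the pattern matches.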
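A small sketch of the difference between a reluctant and a greedy quantifier, using plain java.util.regex. The class name ReluctantVsGreedyDemo, the helper firstGroup, and the sample input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReluctantVsGreedyDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "one 'a' and 'b' and 'c'";
        // Reluctant: stops at the first closing quote.
        System.out.println(firstGroup("'(.+?)'", input)); // prints "a"
        // Greedy: runs to the end and backtracks to the last closing quote.
        System.out.println(firstGroup("'(.+)'", input));  // prints "a' and 'b' and 'c"
    }
}
```

Neither form refuses to cross a quote. The reluctant one simply stops at the earliest position where the rest of the pattern can match.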

^I get result one '(.+?)' and result two '(.+?)'$

^ --- begin of the line
I get result one ' --- a fixed sequence
(.+?) --- any character, one or more times (group 1)
' and result two ' --- a fixed sequence
(.+?) --- any character, one or more times (group 2)
' --- a fixed sequence
$ --- end of the line

group 1 and group 2 can contain any character, including '.

^I get result one '([^']+?)' and result two '([^']+?)'$

^ --- begin of the line
I get result one ' --- a fixed sequence
([^']+?) --- any character, except the single quote, one or more times (group 1)
' and result two ' --- a fixed sequence
([^']+?) --- any character, except the single quote, one or more times (group 2)
' --- a fixed sequence
$ --- end of the line

As soon as group 1 or group 2 would contain a ', the line no longer matches.
For example, I get result one '<O'Reilly>' and result two '<result2>' does not match,
because group 1 would be <O, and the pattern then expects the fixed sequence ' and result two ', which does not match 'Reilly>' ...

Some snippets for demonstration

import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("^I get result one '(.+?)' and result two '(.+?)'$");
Matcher matcher = pattern.matcher("I get result one '<result1>' and result two '<result2>'");
// print every group of each match; group 0 is the whole match
while (matcher.find()) {
    for (int i = 0; i <= matcher.groupCount(); i++) {
        System.out.printf("group: %d  subsequence: %s%n", i, matcher.group(i));
    }
}

Output

group: 0  subsequence: I get result one '<result1>' and result two '<result2>'
group: 1  subsequence: <result1>
group: 2  subsequence: <result2>

group 0 is the text captured by the whole expression.

Pattern pattern = Pattern.compile("^I get result one '(.+?)' and result two '(.+?)'$");
Matcher matcher = pattern.matcher("I get result one '<O'Reilly>' and result two '<result2>'");

Output

group: 0  subsequence: I get result one '<O'Reilly>' and result two '<result2>'
group: 1  subsequence: <O'Reilly>
group: 2  subsequence: <result2>

group 1 also matches the ', because it is embedded between the fixed sequences before and after it.

Now the patterns that exclude the surrounding character.

Pattern pattern = Pattern.compile("^I get result one '([^']+?)' and result two '([^']+?)'$");
Matcher matcher = pattern.matcher("I get result one '<result1>' and result two '<result2>'");

Output

group: 0  subsequence: I get result one '<result1>' and result two '<result2>'
group: 1  subsequence: <result1>
group: 2  subsequence: <result2>

There is no difference from the pattern (.+?), because the values to be captured by group 1 and group 2 do not contain a '.

Pattern pattern = Pattern.compile("^I get result one '([^']+?)' and result two '([^']+?)'$");
Matcher matcher = pattern.matcher("I get result one '<O'Reilly>' and result two '<result2>'");

There is no output, because the pattern does not match the line (see the explanation above). This also means that Cucumber would not be able to find the related glue method.

Assuming the step is defined in the feature file as

Then I get result one '<O'Reilly>' and result two '<result2>'

and the glue method is annotated with

@Then("^I get result one '([^']+?)' and result two '([^']+?)'$")

running Cucumber would throw an exception.