Question

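I am trying to use regex to parse Gherkin files. I have managed to split the job across several regexes that do what I want, but I am having trouble capturing the last instance of one of the values.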

(?s)(?P<Then>Then.*?)Scenario:|$

returns all instances except the last one in the following file:

# This is a sample .feature file
Feature: Authorized Logins

Users will have individual logins, gated by passwords.


Scenario: Invalid login credentials are entered

    Given Default Database is loaded
    And App is available and running
    When User enters invalid login
    Then Application should deny login

Scenario: Valid login credentials are entered

    Given Default Database is loaded
    And App is available and running
    When User enters valid login
    And display test case selection screen
    Then Application should grant login
    And display test case selection screen

Scenario: No database connection
    Given Database is stopped
    And App is available and running
    When User enters valid login
    Then Application will deny login
    And inform of no connection

The last one,

Then Application will deny login
    And inform of no connection

is not selected. I have tried all sorts of things but can't seem to get it. Any suggestions?

https://regex101.com/r/iiM5X5/2

Answer 1

Brief

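Right now your regex reads: match either (?P<Then>Then.*?)Scenario: or $. Alternation has the lowest precedence in a regex, so the | splits the entire pattern in two, and the $ branch matches only an empty string at the end of the input without capturing anything. You need to group the alternatives together, as shown below.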

Code


(?P<Then>Then.*?)(?:Scenario:|$)

You can also pass the s modifier as a flag (re.DOTALL) instead of putting it in the regex as (?s).
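To see the difference concretely, here is a minimal Python sketch that runs both patterns against a shortened version of the feature file from the question. The names SAMPLE, ORIGINAL, FIXED and then_blocks are made up for this illustration, and the sample text is trimmed to two scenarios to keep it short.

```python
import re

# Shortened copy of the feature file from the question (assumption: two
# scenarios are enough to show the behaviour).
SAMPLE = """\
Feature: Authorized Logins

Scenario: Invalid login credentials are entered
    Given Default Database is loaded
    When User enters invalid login
    Then Application should deny login

Scenario: No database connection
    Given Database is stopped
    When User enters valid login
    Then Application will deny login
    And inform of no connection
"""

# Original pattern: the | splits the whole regex, so $ is a separate branch.
ORIGINAL = re.compile(r"(?P<Then>Then.*?)Scenario:|$", re.DOTALL)

# Fixed pattern: the alternatives are grouped, so the Then block may end
# at either the next "Scenario:" or the end of the input.
FIXED = re.compile(r"(?P<Then>Then.*?)(?:Scenario:|$)", re.DOTALL)


def then_blocks(pattern, text):
    """Return the stripped text of every match whose Then group is non-empty."""
    return [
        m.group("Then").strip()
        for m in pattern.finditer(text)
        if m.group("Then")
    ]


original_blocks = then_blocks(ORIGINAL, SAMPLE)
fixed_blocks = then_blocks(FIXED, SAMPLE)

print(len(original_blocks), "block(s) with the original pattern")
print(len(fixed_blocks), "block(s) with the fixed pattern")
for block in fixed_blocks:
    print("---")
    print(block)
```

With the original pattern, the final Then block is skipped because no Scenario: follows it. With the grouped alternatives, the last block is captured as well.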

Answer 2

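Regular expressions are a very tempting tool when parsing data. However, some data comes with a predefined structure and formatting rules and, most importantly, with existing parsers.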

In your case, these are BDD Gherkin feature files. The behave library already has a parser: install behave, import the parser, and use it. For example:

from behave.parser import Parser

filename = "test.feature"
with open(filename) as f:
    feature = Parser().parse(f.read(), filename)

    print(feature.name)
    for scenario in feature.scenarios:
        print(scenario.name)
        print("-----")
        for step in scenario.steps:
            print(step.keyword, step.name)
        print("--------")
