Regex: <word> <capture group> <specific words but in any order> <capture group> in Python

Time: 2016-09-22 03:39:09

Tags: regex python-2.7

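The snippet below works, but I don't like the (?:terminate.*instance|instance.*terminate) part of the regexp because it looks like code duplication. It gets especially ugly when the middle section has three or more specific required words that can appear in any order.

Is it possible to avoid this duplication? Any help is appreciated.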

# Raw string so the backslashes reach the regex engine unchanged.
regex = r"^ec2 (?P<environment>\S+) (?:terminate.*instance|instance.*terminate) (?P<instance_id>i\-\S+) (?P<reason>.*)"

commands_positive = [
    "ec2 production instance terminate i-ab87cd98bfg this is the reason",
    "ec2 development terminate instance i-abcd12bcdg reason",
    ]

commands_negative = [
    "ec2 production instance falsecommand i-ab767cdc reason",  # "terminate" is missing
    "ec2 testing instance terminate i-abcdfgg",                # no reason after the instance id
    "ec2 development terminate instance i-abcd8733",           # no reason after the instance id
    ]

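import unittest

class CommandRegexTest(unittest.TestCase):  # class name is illustrative
    def test_commands(self):
        # Python 2.7 names; Python 3 calls these assertRegex / assertNotRegex.
        for command in commands_positive:
            self.assertRegexpMatches(command, regex)

        for command in commands_negative:
            self.assertNotRegexpMatches(command, regex)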

1 answer:

Answer 0 (score: 0)

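If you want to match words in any order, use lookaheads:

(?=.*instance)(?=.*terminate).*

Each lookahead checks that its word appears somewhere ahead without consuming any text, so the order does not matter and each word is written only once. Of the sample commands, this matches all except:

ec2 production instance falsecommand i-ab767cdc reason

It only checks that both words are present, so the rest of your pattern is still needed to enforce the overall command format. This reduces your repetition.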
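One way to fold the lookaheads into the full pattern from the question is sketched below. This is only an illustration, not code from the original answer: `build_command_regex` is a hypothetical helper name, and the sketch assumes the instance ID is the first token starting with `i-` after the keywords. Each lookahead is stopped at ` i-` so that a keyword appearing later in the reason text does not satisfy it.

```python
import re


def build_command_regex(required_words):
    """Hypothetical helper: require every word, in any order, before the instance id."""
    # One lookahead per required word. All of them start at the same position
    # (right after the environment) and refuse to scan past " i-", which this
    # sketch assumes marks the start of the instance id.
    lookaheads = "".join(
        r"(?=(?:(?! i-).)*?\b%s\b)" % re.escape(word) for word in required_words
    )
    return (
        r"^ec2 (?P<environment>\S+) "
        + lookaheads
        # The lookaheads consume nothing, so the keywords still have to be
        # skipped here before the instance id and the reason are captured.
        + r".*? (?P<instance_id>i-\S+) (?P<reason>.*)"
    )


regex = build_command_regex(["terminate", "instance"])

match = re.match(regex, "ec2 development terminate instance i-abcd12bcdg reason")
print(match.groupdict())
```

With this shape, a third required word is one more list entry rather than an enumeration of every ordering. Note that the sketch is slightly more permissive than the original alternation: other words may appear before, between, or after the keywords, as long as they come before the instance ID.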