Regular expression matching pattern a or pattern b

Date: 2014-01-07 11:26:46

Tags: python regex

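I have run into a small problem using the regular expression library in Python, specifically when calling the match method with a pattern made of several alternatives: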

import re
files = ["noi100k_0p55m0p3_fow71f",\
     "fnoi100v5_71f60s",\
     "noi100k_0p55m0p3_151f_560s",\
     "noi110v25_560s"]

for i in files:
    keyws = i.split("_")
    for j in keyws:
        if re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j): 
            print "Results :", re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j).group(1)

The result is:

Results : 100
Results : None
Results : 100
Results : None

whereas I expected:

Results : 100
Results : 100
Results : 100
Results : 110

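The only alternative that yields a result is "noi(\w+)k"; it seems the other patterns are never tested. But shouldn't re.match(a|b, string) check both pattern a and pattern b?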

1 answer:

Answer 0 (score: 1)

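Your groups are numbered from left to right across the whole pattern; when one of the alternatives matches, you need to extract the group that belongs to that alternative. The other alternatives are tested, but their values do not end up in group 1.

You have 5 groups, and either group 1, or groups 2 and 3, or groups 4 and 5 will contain the match: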

for j in keyws:
    match = re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j)
    if match: 
        results = match.group(1) or match.group(2) or match.group(4)
        print "Results :", results

This prints the first matched \w+ group of whichever alternative matched.

Demo:
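
To see the numbering concretely, here is a minimal sketch that dumps all five groups for one keyword per alternative. It uses the Python 3 print() function, and the helper name groups_for is made up for illustration:

```python
import re

# Same pattern as in the question: five capture groups, numbered left to right.
PATTERN = re.compile(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)")

def groups_for(keyword):
    """Return the tuple of all five groups, or None if nothing matched."""
    match = PATTERN.match(keyword)
    return match.groups() if match else None

for keyword in ["noi100k", "fnoi100v5", "noi110v25"]:
    # Groups of alternatives that did not take part in the match are None.
    print(keyword, groups_for(keyword))

# noi100k ('100', None, None, None, None)
# fnoi100v5 (None, '100', '5', None, None)
# noi110v25 (None, None, None, '110', '25')
```

Only the first keyword fills group 1, which is why .group(1) returned None for the other two.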

>>> import re
>>> files = ["noi100k_0p55m0p3_fow71f",\
...      "fnoi100v5_71f60s",\
...      "noi100k_0p55m0p3_151f_560s",\
...      "noi110v25_560s"]
>>> for i in files:
...     keyws = i.split("_")
...     for j in keyws:
...         match = re.match(r"noi(\w+)k|fnoi(\w+)v(\w+)|noi(\w+)v(\w+)",j)
...         if match: 
...             results = match.group(1) or match.group(2) or match.group(4)
...             print "Results :", results
... 
Results : 100
Results : 100
Results : 100
Results : 110

If you don't plan to use the other two captured (\w+) groups, remove their parentheses to make it easier to pick the group that matched:

match = re.match(r"noi(\w+)k|fnoi(\w+)v\w+|noi(\w+)v\w+",j)
if match: 
    results = next(g for g in match.groups() if g)
    print "Results :", results

This picks the first non-empty matched group.

If you are willing to accept fnoi(\w+)k as well, you can simplify your pattern further:
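
As a possible variation, once each alternative has exactly one capture group, the match object's lastindex attribute gives the number of the group that took part in the match. The sketch below assumes Python 3, and the function name first_number is hypothetical:

```python
import re

# One capture group per alternative, as in the reduced pattern above.
PATTERN = re.compile(r"noi(\w+)k|fnoi(\w+)v\w+|noi(\w+)v\w+")

def first_number(keyword):
    """Return the captured value for the alternative that matched, else None."""
    match = PATTERN.match(keyword)
    if match is None:
        return None
    # Exactly one group participates, so lastindex is that group's number.
    return match.group(match.lastindex)

for keyword in ["noi100k", "fnoi100v5", "noi110v25", "0p55m0p3"]:
    print(keyword, first_number(keyword))
```

This relies on only one group being set per match; with several groups per alternative, lastindex would point at the last of them rather than the first.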

match = re.match(r"f?noi(\w+)[kv](\w*)", j)

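At that point the value you want is always in .group(1), with whatever follows the k or v (possibly an empty string) in .group(2).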
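
A minimal sketch of the simplified pattern applied to the file names from the question, assuming Python 3; the function name extract_numbers is made up for illustration:

```python
import re

# Simplified pattern: optional leading "f", then "noi", the value, and "k" or "v".
SIMPLE = re.compile(r"f?noi(\w+)[kv](\w*)")

def extract_numbers(files):
    """Collect group 1 for every underscore-separated keyword that matches."""
    results = []
    for name in files:
        for keyword in name.split("_"):
            match = SIMPLE.match(keyword)
            if match:
                results.append(match.group(1))
    return results

files = ["noi100k_0p55m0p3_fow71f",
         "fnoi100v5_71f60s",
         "noi100k_0p55m0p3_151f_560s",
         "noi110v25_560s"]

print(extract_numbers(files))  # ['100', '100', '100', '110']
```

Because \w+ is greedy, group 1 runs up to the last k or v in the keyword, which is the same behavior as the original alternatives.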