Question

I want to add an optional part to my Python regular expression:

myExp = re.compile("(.*)_(\d+)\.(\w+)")

so that if my string is abc_34.txt, result.group(2) is 34, and if my string is abc_2034.txt, results.group(2) is still 34.

I tried myExp = re.compile("(.*)_[20](\d+)\.(\w+)")

but my results.group(2) is 034 for the abc_2034.txt case.

Thanks, F.J.

But I would like to extend your solution and add a suffix.

So if I pass abc_203422.txt, results.group(2) is still 34.

I tried "(.*)_(?:20)?(\d+)(?:22)?.(\w+)" but I get 3422 instead of 34.

Answer 1

strings = [
    "abc_34.txt", 
    "abc_2034.txt",  
]


for string in strings:
    first_part, ext = string.split(".")
    prefix, number = first_part.split("_")

    print(prefix, number[-2:], ext)  # keep only the last two digits of the number


--output:--
abc 34 txt
abc 34 txt



import re

strings = [
    "abc_34.txt", 
    "abc_2034.txt",  
]

pattern = r"""
    ([^_]*)     #Match not an underscore, 0 or more times, captured in group 1
    _           #followed by an underscore
    \d*         #followed by a digit, 0 or more times, greedy
    (\d{2})     #followed by a digit, twice, captured in group 2
    [.]         #followed by a period
    (.*)        #followed by any character, 0 or more times, captured in group 3
"""


regex = re.compile(pattern, flags=re.X)  #ignore whitespace and comments in regex

for string in strings:
    md = regex.match(string)  # use the compiled pattern's own match method
    if md:
        print(md.group(1), md.group(2), md.group(3))

--output:--
abc 34 txt
abc 34 txt

Answer 2

myExp = re.compile("(.*)_(?:20)?(\d+)\.(\w+)")

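The ?: at the beginning of the group containing 20 makes it a non-capturing group, and the ? after that group makes it optional. So (?:20)? means "optionally match 20".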
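
A minimal sketch of that pattern run against the two inputs from the question. The helper name `second_group` is made up for illustration. The second pattern addresses the follow-up in the question about an optional `22` suffix; it assumes that making `\d+` lazy (`\d+?`) is acceptable, and it is not part of the original answer.

```python
import re

# Pattern from the answer: (?:20)? optionally consumes a literal "20"
# without creating a capture group, so the number stays in group 2.
my_exp = re.compile(r"(.*)_(?:20)?(\d+)\.(\w+)")

# Assumed variant for the follow-up question: \d+? is lazy, so it leaves
# a trailing "22" for the optional (?:22)? group when one is present.
suffix_exp = re.compile(r"(.*)_(?:20)?(\d+?)(?:22)?\.(\w+)")


def second_group(regex, text):
    """Return group 2 of a match at the start of text, or None (hypothetical helper)."""
    match = regex.match(text)
    return match.group(2) if match else None


for name in ["abc_34.txt", "abc_2034.txt"]:
    print(name, second_group(my_exp, name))

print("abc_203422.txt", second_group(suffix_exp, "abc_203422.txt"))
```

With the greedy `\d+`, the digits grab everything up to the dot, which is why the attempt in the question returns 3422. Note that a number which itself ends in 22 is inherently ambiguous under the lazy variant.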

Answer 3

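Not sure if this is what you're looking for, but ? is the re symbol for 0 or 1 occurrences. Or {0,2}, which is a bit hacky, for up to two optional [0-9]. I'll think about it some more.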
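
One possible reading of the `{0,2}` idea, sketched under the assumption that the wanted number is always exactly two digits long: allow up to two optional leading digits, then capture two. The pattern and variable names below are illustrative and do not come from the answer.

```python
import re

# \d{0,2} optionally matches up to two leading digits (such as "20");
# (\d{2}) then captures exactly two digits as group 2.
# Assumption: the number of interest is always two digits long.
pattern = re.compile(r"(.*)_\d{0,2}(\d{2})\.(\w+)")

for name in ["abc_34.txt", "abc_2034.txt"]:
    match = pattern.match(name)
    if match:
        print(match.group(1), match.group(2), match.group(3))
```

Unlike (?:20)?, this skips any two leading digits, not only a literal 20, which is likely why the answer calls it hacky.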

Adding an optional part to a Python regular expression

3 answers: