s = "LEV606 (P), LEV230 (P)"
#Expected result: ['LEV606', 'LEV230']
# First attempt
In [3]: re.findall(r"[A-Z]{3}[0-9]{3}[ \(P\)]?", s)
Out[3]: ['LEV606 ', 'LEV230 ']
# Second attempt. The 'P' is not fixed; it can be any other letter.
# Why doesn't this work?
In [4]: re.findall(r"[A-Z]{3}[0-9]{3}[ \([A-Z]{1}\)]?", s)
Out[4]: []
# Third attempt
# The whitespace is still there. Why? I want to remove it from the result.
In [5]: re.findall(r"[A-Z]{3}[0-9]{3}[\s\(\w\)]?", s)
Out[5]: ['LEV606 ', 'LEV230 ']
Answer 0 (score: 0)
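You are using the [...] syntax incorrectly; that is a character class, a set of characters of which any single one can match. Any one character listed in the class matches, so it can be a space, a ( character, P, or ). The optional class therefore consumes at most one character, and the space right after the code satisfies it, which is why the space ends up in your results.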
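To make the point concrete, here is a small sketch (not part of the original answer) that reruns the first two attempts. The reordered class and the test string "LEV606 )]" are made up purely for illustration; they show that the brackets define a set of single characters, and how Python reads the second pattern.

```python
import re

s = "LEV606 (P), LEV230 (P)"

# A character class matches ONE character from its set; order and
# escaping inside the brackets do not turn it into a sequence.
first_attempt = re.findall(r"[A-Z]{3}[0-9]{3}[ \(P\)]?", s)
reordered = re.findall(r"[A-Z]{3}[0-9]{3}[)P( ]?", s)
print(first_attempt)  # ['LEV606 ', 'LEV230 ']
print(reordered)      # same result as first_attempt

# Second attempt: the class ends at the first unescaped "]", so the
# pattern reads as: one of " ", "(", "[", "A-Z", then a literal ")",
# then an optional "]". In s, no ")" follows the code plus one character.
second = r"[A-Z]{3}[0-9]{3}[ \([A-Z]{1}\)]?"
no_match = re.findall(second, s)
odd_match = re.findall(second, "LEV606 )]")  # made-up input
print(no_match)   # []
print(odd_match)  # ['LEV606 )]']
```

This is why the second attempt returns an empty list: it is not looking for "(P)" at all, but for a single character followed immediately by a closing parenthesis.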
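Use a non-capturing group instead of a character class to make the extra text optional, and use a capturing group for the part you want: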
re.findall(r"([A-Z]{3}[0-9]{3})(?: \(P\))?", s)
Demo:
>>> import re
>>> s = "LEV606 (P), LEV230 (P)"
>>> re.findall(r"([A-Z]{3}[0-9]{3})(?: \(P\))?", s)
['LEV606', 'LEV230']
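
The question mentions that the letter in parentheses is not always P. A possible generalization, assuming the suffix is always a single uppercase letter, is to replace the literal P with [A-Z] inside the non-capturing group. The input string below is hypothetical and only serves to exercise the different cases.

```python
import re

# Hypothetical input: different letters, plus one code with no suffix.
s = "LEV606 (P), LEV230 (Q), ABC123"

# The capturing group holds the code to return; the non-capturing group
# makes " (X)" optional, where X is any single uppercase letter.
pattern = re.compile(r"([A-Z]{3}[0-9]{3})(?: \([A-Z]\))?")
codes = pattern.findall(s)
print(codes)  # ['LEV606', 'LEV230', 'ABC123']
```

Because the pattern contains exactly one capturing group, re.findall returns only that group's text, so the optional suffix never appears in the result.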