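I have several regular expressions, and I would like to combine them into a single expression without changing the output. The code below collects words from a text file and stores them in a list.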
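import re

a = []
with open('qwert.txt', 'r') as f:
    for line in f:
        # Word that directly follows "Prof."
        res = re.findall(r'(?:Prof[.](\w+))', line)
        if res:
            a.extend(res)
        # Word that follows "As "
        res = re.findall(r'(?:As (\w+))', line)
        if res:
            a.extend(res)
        # Word that is immediately followed by "=" and a word character
        res = re.findall(r'\w+(?==\w)', line)
        if res:
            a.extend(res)
print(a)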
qwert.txt
As every
prof.John and Prof.Keel and goodthing=him
Prof.Tensa
Keel a good person As kim
kim is fine
Prof.Jees
As John winning Nobel prize
As Mary wins all prize
sa for ask
car
he=is good
Output:
['every', 'Keel', 'goodthing', 'Tensa', 'kim', 'Jees', 'John', 'Mary', 'he']
How can I put the three regex statements into a single line?
Answer 0: (score: 0)
You can use the "|" operator, which lets you match one expression or another.
res = re.findall(r'(?:Prof[.](\w+))|(?:As (\w+))|(?:\w+(?==\w))', line)
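As written, this pattern has two capturing groups, so `re.findall` returns one 2-tuple per match rather than a plain string, and a match from the third alternative captures nothing. A minimal sketch of that behaviour, using one line from qwert.txt as an in-memory string (the variable names are only for illustration):

```python
import re

# Pattern exactly as given in the answer above.
pattern = r'(?:Prof[.](\w+))|(?:As (\w+))|(?:\w+(?==\w))'

# One line taken from qwert.txt, held in memory for the demonstration.
line = 'prof.John and Prof.Keel and goodthing=him'

res = re.findall(pattern, line)
print(res)  # [('Keel', ''), ('', '')]
```

The second tuple is the `goodthing` match: the third alternative matched, but since it has no capturing group, both slots are empty.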
Answer 1: (score: 0)
You need to enclose the last \w+ in a capturing group, and you also need to enable the multiline modifier.
>>> import re
>>> a=[]
>>> with open('qwert.txt', 'r') as f:
...     for line in f:
...         res = re.findall(r'(?:Prof[.](\w+))|(?:As (\w+))|(\w+)(?==\w)', line, re.M)
...         if res:
...             a.extend(res)
...
>>> a
[('', 'every', ''), ('Keel', '', ''), ('', '', 'goodthing'), ('Tensa', '', ''), ('', 'kim', ''), ('Jees', '', ''), ('', 'John', ''), ('', 'Mary', ''), ('', '', 'he')]
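If the flat list from the question is wanted, the tuples above can be collapsed by keeping only the non-empty group of each match. A minimal sketch, assuming the file contents are held in an in-memory list (`sample_lines` and `words` are names made up for this example); `re.M` is left out here because the pattern has no `^` or `$` anchors for it to affect:

```python
import re

# Same three-group pattern as above, compiled once.
pattern = re.compile(r'(?:Prof[.](\w+))|(?:As (\w+))|(\w+)(?==\w)')

# Contents of qwert.txt from the question, as an in-memory list.
sample_lines = [
    'As every',
    'prof.John and Prof.Keel and goodthing=him',
    'Prof.Tensa',
    'Keel a good person As kim',
    'kim is fine',
    'Prof.Jees',
    'As John winning Nobel prize',
    'As Mary wins all prize',
    'sa for ask',
    'car',
    'he=is good',
]

words = []
for line in sample_lines:
    for groups in pattern.findall(line):
        # Exactly one group is non-empty per match; keep that one.
        words.extend(g for g in groups if g)

print(words)
# ['every', 'Keel', 'goodthing', 'Tensa', 'kim', 'Jees', 'John', 'Mary', 'he']
```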
Or, without any capturing groups:
>>> import re
>>> a=[]
>>> with open('qwert.txt', 'r') as f:
...     for line in f:
...         res = re.findall(r'(?<=Prof[.])\w+|(?<=As )\w+|\w+(?==\w)', line, re.M)
...         if res:
...             a.extend(res)
...
>>> a
['every', 'Keel', 'goodthing', 'Tensa', 'kim', 'Jees', 'John', 'Mary', 'he']
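One caveat on "without changing the output": the loop in the question appends results pattern by pattern for each line, whereas a single alternation returns matches in left-to-right order. For qwert.txt the two agree, but a line where a `word=` match comes before an `As` match would be ordered differently. A rough sketch comparing both approaches; `line_diff` is a made-up line that is not in qwert.txt, and the helper names are hypothetical:

```python
import re

# The three original patterns, applied one after another per line.
separate = [r'(?:Prof[.](\w+))', r'(?:As (\w+))', r'\w+(?==\w)']
# The combined lookbehind pattern from the answer above.
combined = r'(?<=Prof[.])\w+|(?<=As )\w+|\w+(?==\w)'

def with_separate(line):
    """Collect matches pattern by pattern, as the question's loop does."""
    out = []
    for p in separate:
        out.extend(re.findall(p, line))
    return out

def with_combined(line):
    """Collect matches in the order they appear in the line."""
    return re.findall(combined, line)

line_same = 'prof.John and Prof.Keel and goodthing=him'  # from qwert.txt
line_diff = 'he=is good As kim'                          # hypothetical line

print(with_separate(line_same), with_combined(line_same))
# ['Keel', 'goodthing'] ['Keel', 'goodthing']
print(with_separate(line_diff), with_combined(line_diff))
# ['kim', 'he'] ['he', 'kim']
```

If the pattern-by-pattern order matters, the three separate calls may still be the safer choice.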