I want a regular expression that matches a word that may or may not be present. I read here that I should try something like this:
import re
line = "a little boy went to the small garden and ate an apple"
res = re.findall("a (little|big) (boy|girl) went to the (?=.*\bsmall\b) garden and ate a(n?)",line)
print(res)
But the output is
[]
The output is also empty if I set line to
a little boy went to the garden and ate an apple
How can I allow for a word that may or may not be present in my text, and capture it when it is present?
Answer 0 (score: 2)
First, you need to match not only the word "small" but also the space after (or before) it. So you can use a regular expression like this: (small )?
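On the other hand, you only want to capture the word itself. To keep the space out of the captured group, wrap the capturing group in a non-capturing one: (?:(small) )?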
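To make the difference between the two forms concrete, here is a minimal sketch that runs both on a shortened version of the pattern. The names with_space and word_only are only illustrative and do not come from the question.

```python
import re

lines = [
    "a little boy went to the small garden and ate an apple",
    "a little boy went to the garden and ate an apple",
]

# Optional capturing group: the trailing space is part of what is captured.
with_space = re.compile(r"went to the (small )?garden")

# Non-capturing wrapper (?:...) around an inner capturing group:
# the space must still be matched, but only the word is captured.
word_only = re.compile(r"went to the (?:(small) )?garden")

for line in lines:
    print(with_space.findall(line), word_only.findall(line))
```

For the first line the first pattern captures the word with its trailing space, while the second captures the word alone. For the second line both report an empty string, because re.findall returns '' for a group that did not take part in the match.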
Full example:
import re

lines = [
    'a little boy went to the small garden and ate an apple',
    'a little boy went to the garden and ate an apple',
]
for line in lines:
    # (?:(small) )? optionally matches "small " but captures only "small"
    res = re.findall(r'a (little|big) (boy|girl) went to the (?:(small) )?garden and ate a(n?)', line)
    print(res)
Output:
[('little', 'boy', 'small', 'n')]
[('little', 'boy', '', 'n')]
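If an empty string is awkward to test for, one possible variation is to use re.search with named groups, since a match object reports None for a group that did not participate. This is only a sketch: the group names (size, who, adjective, n) and the helper parse are hypothetical and not part of the original answer.

```python
import re

# Same pattern as above, with hypothetical group names added for readability.
PATTERN = re.compile(
    r"a (?P<size>little|big) (?P<who>boy|girl) went to the "
    r"(?:(?P<adjective>small) )?garden and ate a(?P<n>n?)"
)

def parse(line):
    """Return a dict of the named groups, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None

with_word = parse("a little boy went to the small garden and ate an apple")
without_word = parse("a little boy went to the garden and ate an apple")

print(with_word["adjective"])     # the optional word when present
print(without_word["adjective"])  # None when the optional word is absent
```

Checking for None makes it possible to tell a missing optional word apart from a group that matched an empty string.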