I have a string like this:
text = "Why do Humans need to eat food? Humans eat food to survive."
I want to capture everything between Human and food, but only for the first occurrence.
Expected output:
Humans need to eat food
My regex:
p =r'(\bHumans?\b.*?\bFoods?\b)'
Python code:
re.findall(p, text, re.I|re.M|re.DOTALL)
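This code correctly captures the string between Human and food, but it does not stop after the first match.
Research:
I have read that to make the match non-greedy I need to add ?, but I cannot figure out where to put it. Every other permutation I have tried fails to stop at the first match.
Update:
I am writing many regexes to capture various other entities like this and parsing them all in one pass, so I cannot change the re.findall logic.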
Answer 0 (score: 5)
Use search instead of findall:
import re
text = "Why do Humans need to eat food? Humans eat food to survive."
p =r'(\bHumans?\b.*?\bFoods?\b)'
res = re.search(p, text, re.I|re.M|re.DOTALL)
print(res.groups())
Output:
('Humans need to eat food',)
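Since the question says the surrounding code expects findall-style results, one possible way to keep that shape while using search is a small wrapper. This is only a sketch: find_first is a hypothetical helper name, not part of the re module, and it assumes the pattern has exactly one capturing group, as in the question.

```python
import re

def find_first(pattern, text, flags=0):
    """Hypothetical helper: return a list holding at most one captured group."""
    match = re.search(pattern, text, flags)
    # re.search returns None when nothing matches, so guard before calling group()
    return [match.group(1)] if match else []

text = "Why do Humans need to eat food? Humans eat food to survive."
p = r'(\bHumans?\b.*?\bFoods?\b)'

print(find_first(p, text, re.I | re.M | re.DOTALL))  # ['Humans need to eat food']
print(find_first(p, "No match here", re.I))          # []
```

Returning an empty list when there is no match mirrors what findall does, so calling code that iterates over the result should not need a separate None check.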
Or add .* at the end of the regex:
import re
text = "Why do Humans need to eat food? Humans eat food to survive."
p =r'(\bHumans?\b.*?\bFoods?\b).*'
# here ________________________^^
res = re.findall(p, text, re.I|re.M|re.DOTALL)
print(res)
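Answer 1 (score: 3)
For finding only the first match, Toto's answer is the best, but since you said you have to use findall, you can simply append .* to the end of the regex so that it consumes the rest of the text and there won't be any further matches.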
(\bHumans?\b.*?\bFoods?\b).*
                          ^^ This consumes the remaining part of your text, so there won't be any further matches.
Example Python code:
import re
text = "Why do Humans need to eat food? Humans eat food to survive."
p =r'(\bHumans?\b.*?\bFoods?\b).*'
print(re.findall(p, text, re.I|re.M|re.DOTALL))
Prints:
['Humans need to eat food']
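One caveat with the trailing .* is that it only swallows the whole remainder when re.DOTALL is set, because otherwise . stops at a newline. The sketch below illustrates this on a two-line variant of the text (the added newline is an assumption for illustration, not part of the question) and shows [\s\S]* as one possible tail that does not depend on the flag.

```python
import re

# Same sentence as in the question, but split across two lines (assumed for illustration)
text = "Why do Humans need to eat food?\nHumans eat food to survive."

dot_tail = r'(\bHumans?\b.*?\bFoods?\b).*'
any_tail = r'(\bHumans?\b.*?\bFoods?\b)[\s\S]*'

# Without DOTALL, .* stops at the newline, so the second line matches again
without_dotall = re.findall(dot_tail, text, re.I)
# With DOTALL, .* runs to the end of the text and leaves nothing to match
with_dotall = re.findall(dot_tail, text, re.I | re.DOTALL)
# [\s\S]* matches any character, newline included, regardless of flags
class_tail = re.findall(any_tail, text, re.I)

print(without_dotall)  # ['Humans need to eat food', 'Humans eat food']
print(with_dotall)     # ['Humans need to eat food']
print(class_tail)      # ['Humans need to eat food']
```

Note that without re.DOTALL the lazy .*? inside the group also cannot cross a newline, so the [\s\S]* variant only helps when Human and food appear on the same line.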
Answer 2 (score: 1)
Try this:
>>> import re
>>> text = "Why do Humans need to eat food? Humans eat food to survive."
>>> re.search(r'Humans.*?food', text).group() # you want the all powerful non-greedy '?' :)
'Humans need to eat food'