Using a regular expression in Python to pull sentences by keyword combination

Time: 2013-12-09 19:34:43

Tags: python regex

Suppose I have the string

'apples are red. this apple is green. pears are sometimes red, but not usually. pears are green. apples are yummy. lizards are green.' 

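I want to use a regular expression to pull out the sentences in that string that first mention apples or pears and then mention a color, red or green. So basically I want a list returned: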

["apples are red.", "this apple is green.", "pears are sometimes red, but not usually.", "pears are green."]

I can write a regex for apple or pear, and another for green or red, for example

re.findall(r'([^.]*?apple[^.]*|[^.]*?pear[^.]*)', string) 

re.findall(r'([^.]*?red[^.]*|[^.]*?green[^.]*)', string) 

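But how do I put the two together when I want the fruit (apple/pear) to appear first in the sentence, followed by the color at some later point in the same sentence?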

3 answers:

Answer 0: (score: 0)

You can use parentheses to group the subexpressions:

re.findall(r"[^.]*\b(?:apple|pear)[^.]*\b(?:red|green)\b[^.]*\.", string)

For example:

>>> import re
>>> a = 'apples are red. this apple is green. pears are sometimes red, but not usually. pears are green. apples are yummy. lizards are green.'
>>> re.findall(r"[^.]*\b(?:apple|pear)[^.]*\b(?:red|green)\b[^.]*\.", a)
['apples are red.', ' this apple is green.', 
 ' pears are sometimes red, but not usually.', ' pears are green.']
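Note that every match after the first starts with a space, because `[^.]*` also consumes the whitespace that follows the previous period. A minimal sketch of one way to get the exact list the question asks for, assuming that stripping each match is acceptable (the names `text`, `pattern`, and `sentences` are illustrative only):

```python
import re

text = ('apples are red. this apple is green. '
        'pears are sometimes red, but not usually. pears are green. '
        'apples are yummy. lizards are green.')

# Same pattern as in the answer: a fruit word, then a color word,
# both inside one period-delimited sentence.
pattern = re.compile(r"[^.]*\b(?:apple|pear)[^.]*\b(?:red|green)\b[^.]*\.")

# Strip the leading space left over from the previous sentence boundary.
sentences = [match.strip() for match in pattern.findall(text)]
print(sentences)
```

Because the pattern has no `\b` after `(?:apple|pear)`, it also accepts plurals such as "apples", but it would equally accept a longer word such as "pearl".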

Answer 1: (score: 0)

Use this pattern: (?:^|\b)(?=[^.]*(?:apple|pear)[^.]*(?:red|green))([^.]+\.)
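The answer does not show the pattern in use, so here is a small sketch of how it might be run with Python's `re` module on the string from the question (the variable names are assumptions). The lookahead checks that a fruit is followed by a color before the next period, and the capturing group then returns the sentence itself:

```python
import re

text = ('apples are red. this apple is green. '
        'pears are sometimes red, but not usually. pears are green. '
        'apples are yummy. lizards are green.')

# Pattern from the answer. With exactly one capturing group,
# re.findall returns the contents of that group for each match.
lookahead_pattern = re.compile(
    r"(?:^|\b)(?=[^.]*(?:apple|pear)[^.]*(?:red|green))([^.]+\.)"
)

found = lookahead_pattern.findall(text)
print(found)
```

Since each match starts at a word boundary, the results carry no leading space. The keywords are not wrapped in `\b` here, so a word such as "hundred" would count as containing "red"; adding word boundaries around the alternations is one way to tighten that.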

Answer 2: (score: 0)

I suggest you read up on NLTK (the Natural Language Toolkit). It is a Python package for text processing.
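One way to read this suggestion is to split the text into sentences first and then test each sentence on its own. The sketch below is an assumption about what that could look like, not something the answer spells out: it uses a naive `re.split` as a stand-in for a real sentence tokenizer (NLTK provides `nltk.sent_tokenize` for that role, which requires its tokenizer data to be downloaded first), and the names `split_sentences` and `fruit_then_color` are hypothetical.

```python
import re

text = ('apples are red. this apple is green. '
        'pears are sometimes red, but not usually. pears are green. '
        'apples are yummy. lizards are green.')

# Naive splitter (assumption): a sentence ends at '.', '!' or '?'
# followed by whitespace. A real tokenizer handles abbreviations etc.
split_sentences = re.split(r"(?<=[.!?])\s+", text.strip())

# A fruit word first, then a color word later in the same sentence.
fruit_then_color = re.compile(r"\b(?:apple|pear)s?\b.*\b(?:red|green)\b")

selected = [s for s in split_sentences if fruit_then_color.search(s)]
print(selected)
```

Separating the splitting step from the filtering step keeps the filter pattern short, at the cost of depending on how well the splitter finds sentence boundaries.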