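Regex substitution replaces the wrong content

Time: 2014-12-21 09:46:50

Tags: python regex

I am trying to replace the word "brunch" only in sentences that contain at least one of the following words: Saturday, Sunday, and/or weekend. However, my regex replaces the entire sentence rather than just the word "brunch".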

>>> reg = re.compile(r'(?:(?:^|\.)[^.]*(?=saturday|sunday|weekend)[^.]*(brunch)[^.]*(?:\$|\.)|(?:^|\.)[^.]*(brunch)[^.]*(?=saturday|sunday|weekend)[^.]*(?:\$|\.))',re.I)
>>> str = 'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday brunch. Not valid on federal holidays. Reservation required.'
>>> reg.findall(str)
[('brunch', '')]
>>> reg.sub(r'BRUNCH',str)
'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash backBRUNCH Not valid on federal holidays. Reservation required.'

I would like it to produce the following:

Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other
offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday BRUNCH. 
Not valid on federal holidays. Reservation required.

Answer:

To solve this, I was able to use the following:

>>> reg = re.compile(r'(?:((?:^|\.)[^.]*(?=saturday|sunday|weekend)[^.]*)(brunch)([^.]*(?:\$|\.))|((?:^|\.)[^.]*)(brunch)([^.]*(?=saturday|sunday|weekend)[^.]*(?:\$|\.)))',re.I)
>>> reg.sub(r'\g<1>BRUNCH\g<3>',str)
'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday BRUNCH. Not valid on federal holidays. Reservation required.'
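One caveat with the replacement above: it only references groups 1 and 3, which belong to the first alternative (day word before "brunch"). When the second alternative matches (groups 4 to 6, "brunch" before the day word), Python 3.5 and later substitute unmatched groups as empty strings, so the rest of the sentence is dropped. The sketch below reuses the same pattern and swaps the replacement string for a function; `upper_brunch` and the sample `text` are hypothetical names introduced only for illustration.

```python
import re

# Same pattern as above. Groups 1-3: day word before "brunch";
# groups 4-6: "brunch" before the day word.
reg = re.compile(r'(?:((?:^|\.)[^.]*(?=saturday|sunday|weekend)[^.]*)(brunch)([^.]*(?:\$|\.))|((?:^|\.)[^.]*)(brunch)([^.]*(?=saturday|sunday|weekend)[^.]*(?:\$|\.)))', re.I)


def upper_brunch(match):
    # Hypothetical helper: rebuild the sentence from whichever alternative matched.
    if match.group(2) is not None:
        return match.group(1) + 'BRUNCH' + match.group(3)
    return match.group(4) + 'BRUNCH' + match.group(6)


# Illustrative input where "brunch" comes before the day word.
text = 'We serve brunch on Sunday. Reservation required.'

string_result = reg.sub(r'\g<1>BRUNCH\g<3>', text)  # uses groups 1 and 3 only
function_result = reg.sub(upper_brunch, text)       # handles both alternatives

print(string_result)
print(function_result)
```

Note also that `\$` in the pattern matches a literal dollar sign, not the end of the string, so a final sentence without a trailing period would not be matched.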

4 Answers:

Answer 0 (score: 3)

Rather than using a regular expression, it is simpler to break the task down into steps:

s = "Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday brunch. Not valid on federal holidays. Reservation required."
results = []
for line in s.split("."):
    if any(text in line.lower() for text in ("saturday", "sunday", "weekend")):
        results.append(line.replace("brunch", "BRUNCH"))
    else:
        results.append(line)
result = ".".join(results)
print(result)
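Note that `line.replace("brunch", "BRUNCH")` is case-sensitive, so a sentence containing "Brunch" would be left alone. A minimal sketch of the same split-and-join idea wrapped in a function, with a case-insensitive replacement; the names `capitalize_brunch` and `DAY_WORDS` are assumptions made for this example:

```python
import re

# Hypothetical constant: words that mark a sentence as eligible.
DAY_WORDS = ("saturday", "sunday", "weekend")


def capitalize_brunch(text):
    """Upper-case every 'brunch' in sentences that mention a day word."""
    out = []
    for sentence in text.split("."):
        if any(word in sentence.lower() for word in DAY_WORDS):
            # re.IGNORECASE also catches "Brunch" and similar spellings.
            sentence = re.sub(r"brunch", "BRUNCH", sentence, flags=re.IGNORECASE)
        out.append(sentence)
    return ".".join(out)


print(capitalize_brunch("Sunday Brunch and brunch. Monday brunch."))
```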

Answer 1 (score: 1)

Keep your regex simple, like this, and use a backreference in the replacement:

reg = re.compile(r'((?:saturday|sunday|weekend)\s+)brunch', re.I)
reg.sub(r'\1BRUNCH',str)
'Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other
 offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday BRUNCH.
 Not valid on federal holidays. Reservation required.'
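This pattern only matches when the day word sits directly before "brunch", separated by whitespace. That is enough for the sample text, but it would not cover a sentence where other words come in between. A small sketch of both cases; the two input strings are made up for illustration:

```python
import re

reg = re.compile(r'((?:saturday|sunday|weekend)\s+)brunch', re.I)

# Day word immediately before "brunch": replaced.
adjacent = reg.sub(r'\1BRUNCH', 'Saturday-Sunday brunch. Weekday brunch.')

# Day word elsewhere in the sentence: left unchanged.
separated = reg.sub(r'\1BRUNCH', 'On Sunday we serve brunch.')

print(adjacent)
print(separated)
```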

Answer 2 (score: 1)

Since you are required to use a regex:

Search for

((?:^|\.)(?=[^.]*(?:saturday|sunday|weekend))[^.]*)brunch

Replace with

\1BRUNCH

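Make sure to compile it as case-insensitive.

Note that this replaces only one occurrence of "brunch" per sentence.

Answer 3 (score: 0)

You are not forced to use a regex for the whole task. You can split the text into sentences, process each sentence separately, and use a list comprehension instead: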
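To illustrate the single-occurrence behavior, here is a sketch in Python. The greedy `[^.]*` backtracks from the end of the sentence, so the last "brunch" is the one replaced. If every occurrence should change, one option (an alternative technique, not part of this answer's pattern) is to match each sentence and pass a replacement function to `sub`. The names `sentence_re`, `day_re`, and `upper_all` are hypothetical.

```python
import re

pattern = re.compile(r'((?:^|\.)(?=[^.]*(?:saturday|sunday|weekend))[^.]*)brunch', re.I)
text = 'Sunday brunch and more brunch. Monday brunch.'

# Only the last "brunch" of the qualifying sentence is replaced.
once = pattern.sub(r'\1BRUNCH', text)

# Alternative: match each sentence and rewrite it with a function.
sentence_re = re.compile(r'[^.]+')
day_re = re.compile(r'saturday|sunday|weekend', re.I)


def upper_all(match):
    sentence = match.group(0)
    if day_re.search(sentence):
        return re.sub(r'brunch', 'BRUNCH', sentence, flags=re.I)
    return sentence


every = sentence_re.sub(upper_all, text)

print(once)
print(every)
```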

>>> import re
>>> # s is the input string defined in answer 0
>>> days = ('saturday', 'sunday', 'weekend')
>>> sentences = s.split('.')
>>> print('.'.join([re.sub('brunch', 'BRUNCH', sent) if any(day in sent.lower() for day in days) else sent for sent in sentences]))
Limit 1 per person. Limit 1 per table. Not valid for carryout. Not valid with any other offers, no cash back. Valid only for Wednesday-Friday dinner and Saturday-Sunday BRUNCH. Not valid on federal holidays. Reservation required.