I get different results depending on whether or not I precompile the regular expression:
>>> re.compile('mr', re.IGNORECASE).sub('', 'Mr Bean')
' Bean'
>>> re.sub('mr', '', 'Mr Bean', re.IGNORECASE)
'Mr Bean'
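The Python documentation says that some module-level functions are simplified versions of the full-featured methods of compiled regular expressions. However, it also claims that RegexObject.sub() is identical to the sub() function.
So what is going on here?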
Answer 0 (score: 12)
It appears that re.sub() cannot accept re.IGNORECASE.
The documentation states:
sub(pattern, repl, string, count=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.
However, putting the flag inline in the pattern does the trick:
re.sub("(?i)mr", "", "Mr Bean")
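Answer 1 (score: 5)
The module-level sub() call does not accept modifiers as its last argument. That position is the "count" argument: the maximum number of pattern occurrences to replace.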
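To make the mix-up concrete, here is a minimal sketch of what happens when the flag lands in the count slot. The variable names are illustrative only, and count is passed by keyword so the snippet also runs cleanly on recent Python versions:

```python
import re

# re.IGNORECASE is an integer flag; its numeric value is 2.
flag_as_int = int(re.IGNORECASE)

# In the count slot, that value just means "replace at most 2 matches".
limited = re.sub('a', '', 'aaaa', count=flag_as_int)

# The pattern itself is still case-sensitive, so 'mr' never matches 'Mr'.
unchanged = re.sub('mr', '', 'Mr Bean', count=flag_as_int)

print(flag_as_int, repr(limited), repr(unchanged))
```

So the call in the question does not fail; it silently runs a case-sensitive substitution limited to two replacements.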
Answer 2 (score: 4)
>>> help(re.sub)
Help on function sub in module re:

sub(pattern, repl, string, count=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    if a callable, it's passed the match object and must return
    a replacement string to be used.
Unlike re.compile, re.sub has no function parameter for regex flags (IGNORECASE, MULTILINE, DOTALL).
Alternatives:
>>> re.sub("[Mm]r", "", "Mr Bean")
' Bean'
>>> re.sub("(?i)mr", "", "Mr Bean")
' Bean'
Edit: Python 3.1 added support for regex flags in these functions; see http://docs.python.org/3.1/whatsnew/3.1.html. Since 3.1, the signature of re.sub, for example, looks like:
re.sub(pattern, repl, string[, count, flags])
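With that signature, the flag can be passed to the module-level function directly. A short sketch, assuming an interpreter where re.sub has the flags parameter; passing it by keyword keeps it out of the count slot:

```python
import re

# flags= is given by keyword, so count keeps its default of 0 (replace all).
result = re.sub('mr', '', 'Mr Bean', flags=re.IGNORECASE)

print(repr(result))
```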
Answer 3 (score: 2)
From the Python 2.6.4 documentation:
re.sub(pattern, repl, string[, count])
In that version, re.sub() takes no flags argument for setting the regex mode. If you need re.IGNORECASE, you must use re.compile().sub().
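As a rough illustration of the re.compile().sub() route, the flag goes to re.compile() and the compiled pattern's sub() still takes its own count. The sample string and variable names here are made up for the example:

```python
import re

# The flag is attached once, when the pattern is compiled.
mr_pattern = re.compile('mr', re.IGNORECASE)

# Replace every case-insensitive match.
all_removed = mr_pattern.sub('', 'Mr Bean and mr Smith')

# count still works on the compiled pattern: replace only the first match.
first_removed = mr_pattern.sub('', 'Mr Bean and mr Smith', count=1)

print(repr(all_removed))
print(repr(first_removed))
```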