Python regular expression inconsistency

Asked: 2009-12-21 23:54:20

Tags: python regex

I get different results depending on whether or not I precompile the regular expression:

>>> re.compile('mr', re.IGNORECASE).sub('', 'Mr Bean')
' Bean'
>>> re.sub('mr', '', 'Mr Bean', re.IGNORECASE)
'Mr Bean'

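The Python documentation says that some of the module-level functions are simplified versions of the full-featured methods of compiled regular expressions. However, it also claims that RegexObject.sub() is identical to the sub() function.

So what is going on here?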

4 answers:

Answer 0 (score: 12)

It appears that re.sub() cannot accept re.IGNORECASE.

The documentation states:

sub(pattern, repl, string, count=0)

Return the string obtained by replacing the leftmost
non-overlapping occurrences of the pattern in string by the
replacement repl.  repl can be either a string or a callable;
if a string, backslash escapes in it are processed.  If it is
a callable, it's passed the match object and must return
a replacement string to be used.

Using an inline flag in the pattern does the trick, however:

re.sub("(?i)mr", "", "Mr Bean")
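
As a rough check of why this works, the sketch below compares a pattern that carries the inline (?i) flag with one compiled with re.IGNORECASE. The variable names are only illustrative; the point is that the inline flag ends up in the compiled pattern's flags, so both forms replace the same text.

```python
import re

# Pattern with the flag written inside the expression itself.
inline = re.compile("(?i)mr")

# Pattern with the flag passed to re.compile() explicitly.
explicit = re.compile("mr", re.IGNORECASE)

# The inline flag is recorded on the compiled pattern object.
has_flag = bool(inline.flags & re.IGNORECASE)

# Both patterns produce the same substitution result.
inline_result = inline.sub("", "Mr Bean")
explicit_result = explicit.sub("", "Mr Bean")

print(has_flag)
print(repr(inline_result), repr(explicit_result))
```

Keeping (?i) at the very start of the pattern is the safest placement, since recent Python versions reject global inline flags that appear later in the expression.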

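Answer 1 (score: 5)

The module-level sub() call does not accept modifiers as its last argument. That position is the "count" argument: the maximum number of pattern occurrences to replace. Since re.IGNORECASE has the integer value 2, the call in the question is treated as count=2 and the match stays case-sensitive.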
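
A small sketch of this effect, assuming a Python version where re.sub() also has a flags keyword (2.7 / 3.1 or later). The sample string 'aAaAa' and the variable names are made up for illustration. Keyword arguments are used throughout so that it is explicit which parameter receives the value.

```python
import re

# re.IGNORECASE is an integer-valued flag.
ignorecase_as_int = int(re.IGNORECASE)

# Putting the flag in the fourth position is the same as passing it as
# count: at most that many replacements, and matching stays case-sensitive.
as_count = re.sub("a", "-", "aAaAa", count=ignorecase_as_int)

# Passing it through the flags keyword applies it as an actual flag.
as_flag = re.sub("a", "-", "aAaAa", flags=re.IGNORECASE)

print(ignorecase_as_int)  # 2
print(as_count)           # -A-Aa
print(as_flag)            # -----
```

This is why the call in the question raises no error: an integer is a perfectly valid count, it just means something different from what was intended.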

Answer 2 (score: 4)

>>> help(re.sub)
Help on function sub in module re:

sub(pattern, repl, string, count=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    if a callable, it's passed the match object and must return
    a replacement string to be used.

Unlike re.compile, re.sub has no function parameter for regex flags (IGNORECASE, MULTILINE, DOTALL).

Alternatives:

>>> re.sub("[Mm]r", "", "Mr Bean")
' Bean'

>>> re.sub("(?i)mr", "", "Mr Bean")
' Bean'

Edit: Python 3.1 added support for regex flags in these functions (http://docs.python.org/3.1/whatsnew/3.1.html). As of 3.1, the signature of re.sub, for example, looks like:

re.sub(pattern, repl, string[, count, flags])
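
With that signature, the example from the question can be written without precompiling, as in this minimal sketch. Passing flags by keyword avoids it being taken for count, which comes before it in the parameter list.

```python
import re

# flags is given by keyword, so it cannot be mistaken for count.
result = re.sub("mr", "", "Mr Bean", flags=re.IGNORECASE)

print(repr(result))  # ' Bean'
```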

Answer 3 (score: 2)

From the Python 2.6.4 documentation:

re.sub(pattern, repl, string[, count])

In Python 2.6, re.sub() takes no flags argument for setting the regex mode. If you need re.IGNORECASE, go through re.compile().sub().
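
A minimal sketch of the compile-then-substitute approach. MR_PATTERN and strip_mr are hypothetical names chosen for this example, not part of the re module. Compiling once at module level also means the same pattern object is reused on every call.

```python
import re

# Hypothetical module-level pattern: the flag is attached at compile time.
MR_PATTERN = re.compile("mr", re.IGNORECASE)


def strip_mr(text):
    """Remove every case-insensitive occurrence of 'mr' from text."""
    return MR_PATTERN.sub("", text)


print(repr(strip_mr("Mr Bean")))  # ' Bean'
print(repr(strip_mr("MR Bean")))  # ' Bean'
```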