Python 3 regular expression problem

Time: 2014-07-19 17:07:57

Tags: python regex escaping character

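So I'm having trouble matching a regex string in Python. I tested it on http://regex101.com/ and it works fine there. However, when I try to do the same thing in my code, it gives me a malformed regex error.

The regex is: "[^\\]\]PW\[". What I want is for it to find the string ]PW[ as long as it is not preceded by a backslash. Here is the code: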

import sys,re
fileList = []
if len(sys.argv) == (0 or 1):
    fileList = ['tester.sgf']
else:
    fileList = str(sys.argv)
for sgfName in fileList:
    print(sgfName)
    currentSGF = open(sgfName,'r').read()
    currentSGF = currentSGF.replace("\r","") #clean the string
    currentSGF = currentSGF.replace("\n","")
for iterations in re.finditer("[^\\]\]PW\[",currentSGF): #here's the issue
    print(iterations.start(0), iterations.end(0), iterations.group())

The error I get is:

Traceback (most recent call last):
  File "C:\Users\Josh\Desktop\New folder\sgflib1.0\test2.py", line 15, in <module>
    for iterations in re.finditer("[^\\]\]PW\[",currentSGF):
  File "C:\Python33\lib\re.py", line 210, in finditer
    return _compile(pattern, flags).finditer(string)
  File "C:\Python33\lib\re.py", line 281, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Python33\lib\sre_compile.py", line 491, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python33\lib\sre_parse.py", line 747, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Python33\lib\sre_parse.py", line 359, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python33\lib\sre_parse.py", line 485, in _parse
    raise error("unexpected end of regular expression")
sre_constants.error: unexpected end of regular expression

Thanks for your help!

1 answer:

Answer 0 (score: 1)

You need to either use a raw string literal or double every backslash:

re.finditer(r"[^\\]\]PW\[", currentSGF)

re.finditer("[^\\\\]\\]PW\\[", currentSGF)

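Otherwise Python first interprets each escape sequence while evaluating the string literal. \\ collapses to a single backslash, while \] and \[ have no special meaning in a string literal and are left as written, so re.finditer sees the value '[^\]\]PW\['. In that pattern both ] characters are escaped, so the character class opened by the first [ is never closed, and the parser reaches the end of the pattern unexpectedly.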
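To make this concrete, here is a minimal sketch of the difference. The `sample` string is a made-up SGF fragment used only for illustration, and the exact wording of the compile error varies between Python versions, so it is printed rather than assumed.

```python
import re

# The value re.finditer actually receives from the question's non-raw literal
# "[^\\]\]PW\[": "\\" collapses to one backslash, "\]" and "\[" are kept as is.
# It is spelled as a raw string here so that newer Python versions do not warn
# about invalid escape sequences in the source.
seen_by_re = r"[^\]\]PW\["

try:
    re.compile(seen_by_re)
    compiled_ok = True
except re.error as exc:
    # Both "]" are escaped, so the character class is never closed.
    compiled_ok = False
    print("compile failed:", exc)

# The two corrected spellings produce the same pattern string.
raw_pattern = r"[^\\]\]PW\["
doubled_pattern = "[^\\\\]\\]PW\\["
print(raw_pattern == doubled_pattern)

# Hypothetical SGF fragment: one plain "]PW[" and one preceded by a backslash.
sample = r";PB[Black]PW[White]C[text \]PW[ more]"
matches = [m.group() for m in re.finditer(raw_pattern, sample)]
print(matches)
```

Note that each match includes the character before the ], because [^\\] consumes one character.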

See "The Backslash Plague" in the Python Regular Expression HOWTO.