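So I'm having trouble matching a regular expression string in Python. I tested it on http://regex101.com/ and it works fine there. However, when I try it in my code, I get a malformed regular expression error.
The regex is: "[^\\]\]PW\[". What I want is for it to find the string ]PW[ in my text, as long as it isn't preceded by a backslash. Here is the code: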
import sys,re
fileList = []
if len(sys.argv) == (0 or 1):
    fileList = ['tester.sgf']
else:
    fileList = str(sys.argv)
for sgfName in fileList:
    print(sgfName)
    currentSGF = open(sgfName,'r').read()
    currentSGF = currentSGF.replace("\r","") #clean the string
    currentSGF = currentSGF.replace("\n","")
    for iterations in re.finditer("[^\\]\]PW\[",currentSGF): #here's the issue
        print(iterations.start(0), iterations.end(0), iterations.group())
The error I get is:
Traceback (most recent call last):
  File "C:\Users\Josh\Desktop\New folder\sgflib1.0\test2.py", line 15, in <module>
    for iterations in re.finditer("[^\\]\]PW\[",currentSGF):
  File "C:\Python33\lib\re.py", line 210, in finditer
    return _compile(pattern, flags).finditer(string)
  File "C:\Python33\lib\re.py", line 281, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Python33\lib\sre_compile.py", line 491, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python33\lib\sre_parse.py", line 747, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Python33\lib\sre_parse.py", line 359, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python33\lib\sre_parse.py", line 485, in _parse
    raise error("unexpected end of regular expression")
sre_constants.error: unexpected end of regular expression
Thanks for your help!
Answer 0 (score: 1)
You need to use a raw string literal, or double all the backslashes:
re.finditer(r"[^\\]\]PW\[", currentSGF)
or
re.finditer("[^\\\\]\\]PW\\[", currentSGF)
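Otherwise, Python first interprets each escape sequence as part of evaluating the string literal. Without the raw prefix, re.finditer sees the value [^\]\]PW\[: the \\ collapses to a single backslash, while \] and \[ have no special meaning in a Python string literal and are passed through unchanged. The regex engine then reads \] inside the character class as a literal ], so the class is never closed and parsing runs off the end of the pattern.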
See The Backslash Plague in the Python Regular Expression HOWTO.