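I wrote a script that catches and corrects commands before the parser reads them. The parser requires operators such as equal, not equal, and so on to be separated by commas, for example:
'test(a>=b)' is wrong; 'test(a,>=,b)' is correct.
The script I wrote works, but I'd like to know whether there is a more efficient way to do this.
Here is my script: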
    # Correction routine
    def corrector(exp):

        def rep(exp, a, b):
            # Keep replacing a with b until the string stops changing
            foo = ''
            while True:
                foo = exp.replace(a, b)
                if foo == exp:
                    return exp
                exp = foo

        # Replace all instances with a unique identifier. Do it in a specific order
        # so for example we catch an instance of '>=' before we get to '='
        items = ['>=', '<=', '!=', '==', '>', '<', '=']
        for i in range(len(items)):
            exp = rep(exp, items[i], '###%s###' % i)

        # Re-add items with commas
        for i in range(len(items)):
            exp = exp.replace('###%s###' % i, ',%s,' % items[i])

        # Remove accidental double commas we may have added
        return exp.replace(',,', ',')

    print(corrector('wrong_syntax(b>=c) correct_syntax(b,>=,c)'))
    # RESULT: wrong_syntax(b,>=,c) correct_syntax(b,>=,c)
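Thanks!

Answer 0 (score: 3)
As mentioned in the comments, one approach is to use a regular expression. The regex below matches any of your operators when it is not surrounded by commas, and replaces it with the same text with commas inserted: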
    import re

    inputstring = 'wrong_syntax(b>=c) correct_syntax(b,>=,c)'
    # Group 1 and 3: the non-comma characters on either side of the operator
    regex = r"([^,])(>=|<=|!=|==|>|<|=)([^,])"
    replace = r"\1,\2,\3"
    result = re.sub(regex, replace, inputstring)
    print(result)
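Simple regular expressions are relatively easy, but they get complicated quickly. See the Python `re` module documentation for more information.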
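As a rough sketch, the same substitution can be wrapped in a function to check how it behaves on a few inputs. The name `add_commas` is hypothetical and not part of the original answer. One caveat worth noting: because `[^,]` consumes the characters on both sides of the operator, a second operator that shares an operand with the first (as in `a>b>c`) is skipped in a single pass.

```python
import re

# Same pattern as in the answer: an operator with a non-comma character on each side.
OPERATOR = re.compile(r"([^,])(>=|<=|!=|==|>|<|=)([^,])")

def add_commas(expression):
    """Hypothetical helper: wrap bare operators in commas."""
    return OPERATOR.sub(r"\1,\2,\3", expression)

print(add_commas('test(a>=b)'))    # test(a,>=,b)
print(add_commas('test(a,>=,b)'))  # already correct, unchanged: test(a,>=,b)
print(add_commas('test(a>b>c)'))   # test(a,>,b>c)  (the second '>' is missed)
```

If chained operators can occur in the input, running the substitution repeatedly until the string stops changing, or using zero-width lookarounds instead of consuming groups, are two possible ways around this.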

Answer 1 (score: 1)
Here is a regex that does what you asked:
    import re

    regex = re.compile(r'''
        (?<!,)          # Negative lookbehind: not preceded by a comma
        (!=|[><=]=?)    # One of !=, >=, <=, ==, >, <, =
        (?!,)           # Negative lookahead: not followed by a comma
    ''', re.VERBOSE)

    print(regex.sub(r',\1,', 'wrong_expression(b>=c) or right_expression(b,>=,c)'))
Output:
wrong_expression(b,>=,c) or right_expression(b,>=,c)
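
Since lookbehind and lookahead assertions are zero-width, they do not consume the characters around the operator, so operators that share an operand can each be matched in one pass. A minimal sketch of that behavior, using the same pattern written on one line (the function name `insert_commas` is hypothetical):

```python
import re

# Same pattern as above, without re.VERBOSE.
OPERATOR = re.compile(r"(?<!,)(!=|[><=]=?)(?!,)")

def insert_commas(expression):
    """Hypothetical helper: surround bare operators with commas."""
    return OPERATOR.sub(r",\1,", expression)

once = insert_commas('test(a>b>=c)')
print(once)                 # test(a,>,b,>=,c)
print(insert_commas(once))  # a second pass changes nothing: test(a,>,b,>=,c)
```

This sketch only covers operators with no comma on either side; input with a comma on just one side (for example `b>=,c`) is an edge case the pattern was not written for.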