Pythonic string syntax corrector

Time: 2014-01-14 21:18:42

Tags: python

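I wrote a script that catches and corrects commands before they are read by a parser. The parser requires operators such as equals, not-equals, and so on to be separated from their operands by commas, for example:

'test(a>=b)' is wrong, 'test(a,>=,b)' is correct

The script I wrote works fine, but I would like to know whether there is a more efficient way to do this.

Here is my script: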

# Correction routine
def corrector(exp):
    def rep(exp,a,b):
        foo = ''
        while(True):
            foo = exp.replace(a,b)
            if foo == exp:
                return exp
            exp = foo

    # Replace all instances with a unique identifier. Do it in a specific order
    # so for example we catch an instance of '>=' before we get to '='
    items = ['>=','<=','!=','==','>','<','=']
    for i in range(len(items)):
        exp = rep(exp,items[i],'###%s###'%i)

    # Re-add items with commas
    for i in range(len(items)):
        exp = exp.replace('###%s###'%i,',%s,'%items[i])

    # Remove accidental double commas we may have added
    return exp.replace(',,',',')


print(corrector('wrong_syntax(b>=c) correct_syntax(b,>=,c)'))
# RESULT: wrong_syntax(b,>=,c) correct_syntax(b,>=,c)

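Thanks!

2 Answers:

Answer 0 (score: 3)

As mentioned in the comments, one way is to use a regular expression. The regex below matches any of your operators that is not already surrounded by commas and replaces it with the same operator with commas inserted around it: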

import re

inputstring = 'wrong_syntax(b>=c) correct_syntax(b,>=,c)'
regex = r"([^,])(>=|<=|!=|==|>|<|=)([^,])"
replace = r"\1,\2,\3"

result = re.sub(regex, replace, inputstring)

print(result)

Simple regular expressions are relatively easy, but they can get complicated quickly. See the documentation for more information:

http://docs.python.org/2/library/re.html
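
As a rough sketch of how this pattern behaves, it can be wrapped in a small function and tried on a few inputs. The function name `add_commas` is only an illustrative choice and not part of the answer. Note that the pattern consumes one character on each side of the operator, so in a chained expression like `a>b>c` the second operator is skipped.

```python
import re

# Same pattern as above; add_commas is a hypothetical wrapper name.
OPERATOR = re.compile(r"([^,])(>=|<=|!=|==|>|<|=)([^,])")

def add_commas(expression):
    """Insert commas around operators that are not already comma-delimited."""
    return OPERATOR.sub(r"\1,\2,\3", expression)

print(add_commas("test(a>=b)"))    # test(a,>=,b)
print(add_commas("test(a,>=,b)"))  # test(a,>=,b)  (already correct, unchanged)
print(add_commas("a>b>c"))         # a,>,b>c  ('b' was consumed by the first match)
```

If chained operators or operators at the very start or end of a string matter for your parser, a pattern built on lookaround assertions, which do not consume the neighbouring characters, may be a better fit.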

Answer 1 (score: 1)

Here is a regular expression that does what you asked:

import re
regex = re.compile(r'''

    (?<!,)                  # Negative lookbehind
    (!=|[><=]=?)
    (?!,)                   # Negative lookahead

''', re.VERBOSE)
print(regex.sub(r',\1,', 'wrong_expression(b>=c) or right_expression(b,>=,c)'))

Output

wrong_expression(b,>=,c) or right_expression(b,>=,c)
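
Because the lookbehind and lookahead do not consume any characters, running the substitution a second time should leave an already corrected string unchanged, and chained operators are each handled. The following is a minimal sketch to check that; `normalize` is a hypothetical helper name, and the pattern is the one above written on a single line.

```python
import re

# Pattern from this answer, without re.VERBOSE; normalize is a hypothetical name.
OPERATOR = re.compile(r"(?<!,)(!=|[><=]=?)(?!,)")

def normalize(expression):
    """Wrap every operator that has no comma on either side in commas."""
    return OPERATOR.sub(r",\1,", expression)

once = normalize("test(a>=b) test(a,>=,b)")
twice = normalize(once)

print(once)                # test(a,>=,b) test(a,>=,b)
print(twice == once)       # True
print(normalize("a>b>c"))  # a,>,b,>,c
```

One caveat: input that is only half-delimited, such as `a,>=b`, is not handled by this sketch, because the `=` is then matched on its own and the operator gets split.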