Detecting invalid syntax with pyparsing

Time: 2017-01-02 09:42:09

Tags: python pyparsing

I have a simple grammar for evaluating logical expressions.

The "single" conditions are of the form

keyword = value
keyword != value

Then I would like to allow logical combinations, grouped with parentheses, e.g.

cond1 & cond2 & cond3
cond1 & ( cond2 | cond3 )
(cond1 & cond2) | cond3
cond1 | cond2 | cond3

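But I do not want to allow logical combinations where left/right associativity matters and would confuse users. So the following is not allowed: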

cond1 & cond2 | cond3

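If I understand correctly, the latter requirement prevents me from using operatorPrecedence(), and in any case I would like to understand this at a lower level.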

The following grammar gets me almost where I want to be:

from pyparsing import Word, alphas, nums, oneOf, Literal, Group, Suppress, \
                        Forward, ZeroOrMore, ParseException, StringEnd
comparison_op_list = ['=', '!=']
logical_op_list = ['&', '|']
# Pyparsing expression for single conditions
keyword = Word(alphas + nums + '_.:-')
comparison_op = oneOf(comparison_op_list)
value = Word(alphas + nums + '_.:-;')
single_condition = keyword + comparison_op + value
# Pyparsing expression for combined condition
lpar = Literal( '(' )
rpar = Literal( ')' )
logical_op = oneOf(logical_op_list)
combined_expr = Forward()
atom = single_condition | ( Suppress(lpar) + combined_expr + Suppress(rpar) )
combined_expr << Group(atom) + ZeroOrMore( logical_op + combined_expr )

# Examples
test_strings = [
    'keyword = value',
    'keyword1 = value1 & keyword2 = value2',
    'a=a & (b=b|c=c) & (d=d & (e=e|f=f))',
    'a=a & ((b=b|(c=c))) & (((d=d) & (e=e|f=f)))',
    'test1=A & test2=B | test3 = C'  # Parses fine, but later rejected in python code
]
for s in test_strings:
    print()
    print(s)
    print(combined_expr.parseString(s))

Output:

keyword = value
[['keyword', '=', 'value']]

keyword1 = value1 & keyword2 = value2
[['keyword1', '=', 'value1'], '&', ['keyword2', '=', 'value2']]

a=a & (b=b|c=c) & (d=d & (e=e|f=f))
[['a', '=', 'a'], '&', [['b', '=', 'b'], '|', ['c', '=', 'c']], '&', [['d', '=', 'd'], '&', [['e', '=', 'e'], '|', ['f', '=', 'f']]]]

a=a & ((b=b|(c=c))) & (((d=d) & (e=e|f=f)))
[['a', '=', 'a'], '&', [[['b', '=', 'b'], '|', [['c', '=', 'c']]]], '&', [[[['d', '=', 'd']], '&', [['e', '=', 'e'], '|', ['f', '=', 'f']]]]]

test1=A & test2=B | test3 = C
[['test1', '=', 'A'], '&', ['test2', '=', 'B'], '|', ['test3', '=', 'C']]
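
The last result above is the one that parses but still has to be rejected afterwards. A minimal sketch of such a check in plain Python follows, assuming the parse result has first been converted to nested lists (for example with the result's asList() method). The names LOGICAL_OPS and has_mixed_operators are hypothetical and not part of the code above.

```python
# Hypothetical helper: detect '&' and '|' mixed at the same nesting level.
LOGICAL_OPS = {"&", "|"}


def has_mixed_operators(parsed):
    """Return True if any nesting level of `parsed` uses both '&' and '|'."""
    # Logical operators appear as bare strings between the nested lists.
    ops_here = {tok for tok in parsed
                if isinstance(tok, str) and tok in LOGICAL_OPS}
    if len(ops_here) > 1:
        return True
    # Recurse into single conditions and parenthesised groups.
    return any(has_mixed_operators(tok)
               for tok in parsed if isinstance(tok, list))


mixed = [['test1', '=', 'A'], '&', ['test2', '=', 'B'], '|', ['test3', '=', 'C']]
grouped = [['a', '=', 'a'], '&', [['b', '=', 'b'], '|', ['c', '=', 'c']]]
print(has_mixed_operators(mixed))
print(has_mixed_operators(grouped))
```

Because parenthesised groups come back as their own nested list, each level can be checked independently of the levels around it.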

From here I can handle valid input in regular Python. The problem is that some invalid syntax is also read without any error:

invalid_strings = [
    'a b = c',  # Rightfully rejected 
    'word1 = word2 != word3', # Read as 'word1 = word2'
    'keyword = value_with_*illegal*_characters' # Read as 'keyword = value_with_'
]
for s in invalid_strings:
    try:
        result = combined_expr.parseString(s)
    except ParseException:
        result = None
    print()
    print(s)
    print(result)

Output:

a b = c
None

word1 = word2 != word3
[['word1', '=', 'word2']]

keyword = value_with_*illegal*_characters
[['keyword', '=', 'value_with_']]

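For the examples above, which have no logical combination, I could probably use StringEnd to require that the string ends immediately after the expression. But this does not work (?) when conditions are combined with logical operators. Is there a way to require that every part of the input expression is recognized as part of the grammar, or is this beyond the scope of pyparsing and rather a task for, e.g., PLY?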
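
To make the StringEnd idea concrete, here is a minimal sketch that applies the end-of-string requirement only at the top level. It relies on the parseAll=True argument of pyparsing's parseString, which checks for the end of the input once the top-level expression has matched, so the recursive uses of combined_expr are left alone. The grammar is copied from above; the helper name parse_strict is hypothetical, and the sketch assumes a pyparsing version that still accepts the camelCase names used in this question.

```python
from pyparsing import (Word, alphas, nums, oneOf, Group, Suppress,
                       Forward, ZeroOrMore, ParseException)

# Same grammar as in the question.
keyword = Word(alphas + nums + '_.:-')
comparison_op = oneOf(['=', '!='])
value = Word(alphas + nums + '_.:-;')
single_condition = keyword + comparison_op + value
logical_op = oneOf(['&', '|'])
combined_expr = Forward()
atom = single_condition | (Suppress('(') + combined_expr + Suppress(')'))
combined_expr << Group(atom) + ZeroOrMore(logical_op + combined_expr)


def parse_strict(s):
    """Hypothetical helper: nested lists on success, None if input is left over."""
    try:
        # parseAll=True raises ParseException unless the whole string is consumed.
        return combined_expr.parseString(s, parseAll=True).asList()
    except ParseException:
        return None


for s in ['keyword = value',
          'a=a & (b=b|c=c)',
          'word1 = word2 != word3',
          'keyword = value_with_*illegal*_characters']:
    print(s, '->', parse_strict(s))
```

With this helper the two strings that were previously truncated silently come back as None, while the valid combinations still parse. It does not by itself reject mixed operators such as cond1 & cond2 | cond3.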

0 answers:

No answers