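How to make pyparsing work in a particular form

Time: 2012-10-25 09:01:15

Tags: python parsing

Sorry for the vague title. I couldn't think of anything better.

I am trying to implement a DSL with pyparsing that has the following requirements: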

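  1. Variables: all of them start with v_
  2. Unary operators: +, -
  3. Binary operators: +, -, *, /, %
  4. Constant numbers
  5. Functions, like normal functions, taking just one argument
  6. Functions need to work like this: foo(v_1+v_2) = foo(v_1) + foo(v_2), foo(bar(10*v_6)) = foo(bar(10))*foo(bar(v_6)). This should be the case for any binary operation.

I was able to get 1-5 working.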

This is the code I have so far:

    from pyparsing import *
    
    exprstack = []
    
    #~ @traceParseAction
    def pushFirst(tokens):
        exprstack.insert(0,tokens[0])
    
    # define grammar
    point = Literal( '.' )
    plusorminus = Literal( '+' ) | Literal( '-' )
    number = Word( nums )
    integer = Combine( Optional( plusorminus ) + number )
    floatnumber = Combine( integer +
                           Optional( point + Optional( number ) ) +
                           Optional( integer )
                         )
    
    ident = Combine("v_" + Word(nums))
    
    plus  = Literal( "+" )
    minus = Literal( "-" )
    mult  = Literal( "*" )
    div   = Literal( "/" )
    cent   = Literal( "%" )
    lpar  = Literal( "(" ).suppress()
    rpar  = Literal( ")" ).suppress()
    addop  = plus | minus
    multop = mult | div | cent
    expop = Literal( "^" )
    band = Literal( "@" )
    
    # define expr as Forward, since we will reference it in atom
    expr = Forward()
    fn = Word( alphas )
    atom = ( ( floatnumber | integer | ident | ( fn + lpar + expr + rpar ) ).setParseAction(pushFirst) |
             ( lpar + expr.suppress() + rpar ))
    
    factor = Forward()
    factor << atom + ( ( band + factor ).setParseAction( pushFirst ) | ZeroOrMore( ( expop + factor ).setParseAction( pushFirst ) ) )
    
    term = factor + ZeroOrMore( ( multop + factor ).setParseAction( pushFirst ) )
    expr << term + ZeroOrMore( ( addop + term ).setParseAction( pushFirst ) )
    print(expr)
    bnf = expr
    
    pattern =  bnf + StringEnd()
    
    
    def test(s):
        del exprstack[:]
        bnf.parseString(s,parseAll=True)
        print(exprstack)
    
    test("avg(+10)")
    test("v_1+8")
    test("avg(v_1+10)+10")
    

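This is what I want.

My functions are of this type:

    foo(v_1+v_2) = foo(v_1) + foo(v_2)

The same behaviour should hold for any other binary operation. I don't know how to make the parser do this automatically.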

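1 answer:

Answer 0: (score: 2)

Break the function call out into a separate subexpression:

    function_call = fn + lpar + expr + rpar

Then add a parse action to function_call that pops the operators and operands off exprstack and pushes them back onto the stack:

  • if it is an operand, push the operand and then the function
  • if it is an operator, push the operator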
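
As a rough illustration of that push-back rule, here is a minimal sketch on a plain postfix (RPN) token list. `distribute_postfix` and `BINARY_OPERATORS` are hypothetical names, not part of pyparsing, and the sketch assumes a flat argument with no unary operators and no nested function calls. Note also that the question's exprstack is filled with insert(0, ...), so it holds the tokens in the opposite order.

```python
# Hypothetical helper (not a pyparsing API): distribute a one-argument
# function over a flat postfix (RPN) argument list.
BINARY_OPERATORS = {"+", "-", "*", "/", "%"}


def distribute_postfix(fname, postfix_args):
    """Return postfix tokens with fname applied to every operand."""
    result = []
    for token in postfix_args:
        if token in BINARY_OPERATORS:
            # operator: push it back unchanged
            result.append(token)
        else:
            # operand: push the operand, then the function that wraps it
            result.append(token)
            result.append(fname)
    return result


# avg(v_1 + 10): the argument "v_1 + 10" is "v_1 10 +" in postfix
print(distribute_postfix("avg", ["v_1", "10", "+"]))
# ['v_1', 'avg', '10', 'avg', '+']
```

Read as postfix, the result is avg(v_1) + avg(10). A lone operand just gets the function appended, so avg(10) stays an ordinary call.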

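Since you are only dealing with binary operations, you may be better off starting with a simpler approach:

    from pyparsing import *

    # assumed definitions: the names LPAR, RPAR and number are used below
    # but were not defined in the original snippet
    LPAR, RPAR = map(Suppress, "()")
    number = Word(nums)

    identifier = Word(alphas+'_', alphanums+'_')
    expr = Forward()
    function_call = Group(identifier + LPAR + Group(expr) + RPAR)

    unop = oneOf("+ -")
    binop = oneOf("+ - * / %")
    operand = Group(Optional(unop) + (function_call | number | identifier))
    binexpr = operand + binop + operand

    expr << (binexpr | operand)

    bnf = expr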

This lets you work with a simpler structure, without needing exprstack.

    def test(s):
        exprtokens = bnf.parseString(s, parseAll=True)
        print(exprtokens)

    test("10")
    test("10+20")
    test("avg(10)")
    test("avg(+10)")
    test("column_1+8")
    test("avg(column_1+10)+10")

which gives:

    [['10']]
    [['10'], '+', ['20']]
    [[['avg', [['10']]]]]
    [[['avg', [['+', '10']]]]]
    [['column_1'], '+', ['8']]
    [[['avg', [['column_1'], '+', ['10']]]], '+', ['10']]

You want to expand fn(a op b) into fn(a) op fn(b), but fn(a) should be left alone, so you need to test the length of the parsed expression argument:

    def distribute_function(tokens):
        # unpack function name and arguments
        fname, args = tokens[0]

        # if args contains an expression, expand it;
        # a single argument falls through and the call is left unchanged
        if len(args) > 1:
            ret = ParseResults([])
            for i,a in enumerate(args):
                if i % 2 == 0:
                    # even args are operands to be wrapped in the function
                    ret += ParseResults([ParseResults([fname,ParseResults([a])])])
                else:
                    # odd args are operators, just add them to the results
                    ret += ParseResults([a])
            return ParseResults([ret])

    function_call.setParseAction(distribute_function)

Now the output of your calls to test will look like this:

    [['10']]
    [['10'], '+', ['20']]
    [[['avg', [['10']]]]]
    [[['avg', [['+', '10']]]]]
    [['column_1'], '+', ['8']]
    [[[['avg', [['column_1']]], '+', ['avg', [['10']]]]], '+', ['10']]
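
To look at the even/odd rewrite in isolation, here is a small sketch of the same idea on plain Python lists instead of ParseResults. `distribute_plain` is a hypothetical helper, not a pyparsing function; it assumes the argument is a flat list of the form [operand, operator, operand, ...].

```python
def distribute_plain(fname, args):
    """Wrap every operand of a flat [operand, op, operand, ...] list in fname."""
    if len(args) <= 1:
        # a single argument: keep it as an ordinary call
        return [fname, args]
    result = []
    for i, item in enumerate(args):
        if i % 2 == 0:
            # even positions are operands: wrap each one in the function
            result.append([fname, [item]])
        else:
            # odd positions are operators: copy them through
            result.append(item)
    return result


print(distribute_plain("avg", [["column_1"], "+", ["10"]]))
# [['avg', [['column_1']]], '+', ['avg', [['10']]]]
print(distribute_plain("avg", [["10"]]))
# ['avg', [['10']]]
```

The first result has the same shape as the expanded avg(...) part of the last output line above.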

This should even work recursively for calls like fna(fnb(3+2)+fnc(4+9)).
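
For reference, the fully distributed form that the question asks for in the nested case, foo(bar(10*v_6)) = foo(bar(10))*foo(bar(v_6)), can be sketched on a tiny tuple-based tree. Everything here (`call`, `binop`, `push_down`, `distribute`, `to_str` and the tuple layout) is a hypothetical illustration and not pyparsing output. Whether the parse action above yields exactly this form for nested calls is not verified here.

```python
# Hypothetical tuple AST: a leaf is a str, a call is ("call", name, arg),
# a binary operation is ("bin", op, left, right).
def call(name, arg):
    return ("call", name, arg)


def binop(op, left, right):
    return ("bin", op, left, right)


def push_down(name, node):
    """Apply function `name` to node, distributing over binary operators."""
    if isinstance(node, tuple) and node[0] == "bin":
        _, op, left, right = node
        return binop(op, push_down(name, left), push_down(name, right))
    return call(name, node)


def distribute(node):
    """Rewrite bottom-up so that no call takes a binary expression as argument."""
    if isinstance(node, str):
        return node
    if node[0] == "bin":
        _, op, left, right = node
        return binop(op, distribute(left), distribute(right))
    _, name, arg = node
    return push_down(name, distribute(arg))


def to_str(node):
    """Render the tree, parenthesising nested binary operations."""
    if isinstance(node, str):
        return node
    if node[0] == "call":
        return "%s(%s)" % (node[1], to_str(node[2]))
    _, op, left, right = node
    parts = []
    for child in (left, right):
        text = to_str(child)
        if not isinstance(child, str) and child[0] == "bin":
            text = "(" + text + ")"
        parts.append(text)
    return op.join(parts)


tree = call("foo", call("bar", binop("*", "10", "v_6")))
print(to_str(distribute(tree)))
# foo(bar(10))*foo(bar(v_6))
```

The inner call is distributed first, and the outer function is then pushed down onto each resulting operand, which is why the recursion goes bottom-up.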