Question

I'm having trouble parsing arithmetic expressions with pyparsing. I have the following grammar:

numeric_value = (integer_format | float_format | bool_format)("value*")
identifier = Regex('[a-zA-Z_][a-zA-Z_0-9]*')("identifier*")

operand = numeric_value | identifier

expop = Literal('^')("op")
signop = oneOf('+ -')("op")
multop = oneOf('* /')("op")
plusop = oneOf('+ -')("op")
factop = Literal('!')("op")

arithmetic_expr = infixNotation(operand,
    [("!", 1, opAssoc.LEFT),
     ("^", 2, opAssoc.RIGHT),
     (signop, 1, opAssoc.RIGHT),
     (multop, 2, opAssoc.LEFT),
     (plusop, 2, opAssoc.LEFT),]
    )("expr")

I would like to use it to parse arithmetic expressions, for example:

expr = "9 + 2 * 3"
parse_result = arithmetic_expr.parseString(expr)

I have two problems here.

First, when I dump the result, I get the following:

[['9', '+', ['2', '*', '3']]]
- expr: ['9', '+', ['2', '*', '3']]
  - op: '+'
  - value: ['9']

The corresponding XML output is:

<result>
  <expr>
    <value>9</value>
    <op>+</op>
    <value>
      <value>2</value>
      <op>*</op>
      <value>3</value>
    </value>
  </expr>
</result>

What I would like is for ['2', '*', '3'] to show up as expr, i.e.

<result>
  <expr>
    <value>9</value>
    <op>+</op>
    <expr>
      <value>2</value>
      <op>*</op>
      <value>3</value>
    </expr>
  </expr>
</result>

However, I'm not sure whether I can use setResultName() to achieve this.

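Second, when I iterate over the result, I unfortunately get plain strings for the simple parts. So I use the XML "hack" as a workaround (I got the idea from here: `pyparsing`: iterating over `ParsedResults`). Is there a better way now?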

Best regards, APO

I have one more small question about how to process the parsed results. My first attempt was to use a loop, e.g.

def recurse_arithmetic_expression(tokens):
    for t in tokens:
        if t.getResultName() == "value":
            pass # do something...
        elif t.getResultName() == "identifier":
            pass # do something else..
        elif t.getResultName() == "op":
            pass # do something completely different...
        elif isinstance(t, ParseResults):
            recurse_arithmetic_expression(t)

Unfortunately, t can also be a string or an int/float, so I get an exception when I try to call getResultName. And when I use asDict, the order of the tokens is lost.

Is it possible to get an ordered dict and iterate over its keys with something like

for tag, token in tokens.iteritems():

where tag denotes the type of the token (e.g. op, value, identifier, expr...) and token is the corresponding token?

Answer 1

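If you want pyparsing to convert the numeric strings to ints, you can add a parse action to do this at parse time. Or, use the predefined integer and float values defined in pyparsing_common (a namespace class imported with pyparsing):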

numeric_value = (pyparsing_common.number | bool_format)("value*")
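The first option, converting in a parse action, could look roughly like the sketch below. The question does not show how integer_format is defined, so the Word(nums) definition here is only an assumed stand-in:

```python
from pyparsing import Word, nums

# Assumed stand-in for the question's integer_format, which is not shown.
integer_format = Word(nums)

# The parse action runs at parse time and replaces the matched string with an int.
integer_format.setParseAction(lambda tokens: int(tokens[0]))

result = integer_format.parseString("42")
print(result[0], type(result[0]))
```

With this in place, the tokens in the nested results are already numbers, so no string conversion is needed when walking them later.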

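For your naming problem, you can add parse actions that run at each level of infixNotation. In the code below, I added a parse action that just adds the 'expr' name to the currently parsed group. You will also want to add '*' to all of your ops, so that repeated operators get the same "keep all, not just the last" behavior for results names: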

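from pyparsing import (Literal, ParseResults, Regex, infixNotation, oneOf,
                       opAssoc, pyparsing_common)

bool_format = oneOf("true false")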
numeric_value = (pyparsing_common.number | bool_format)("value*")
identifier = Regex('[a-zA-Z_][a-zA-Z_0-9]*')("identifier*")

operand = numeric_value | identifier

expop = Literal('^')("op*")
signop = oneOf('+ -')("op*")
multop = oneOf('* /')("op*")
plusop = oneOf('+ -')("op*")
factop = Literal('!')("op*")


def add_name(s,l,t):
    t['expr'] = t[0]

arithmetic_expr = infixNotation(operand,
    [("!", 1, opAssoc.LEFT, add_name),
     ("^", 2, opAssoc.RIGHT, add_name),
     (signop, 1, opAssoc.RIGHT, add_name),
     (multop, 2, opAssoc.LEFT, add_name),
     (plusop, 2, opAssoc.LEFT, add_name),]
    )("expr")

See what these results look like now:

arithmetic_expr.runTests("""
    9 + 2 * 3 * 7
""")

print(arithmetic_expr.parseString('9+2*3*7').asXML())

Gives:

9 + 2 * 3 * 7
[[9, '+', [2, '*', 3, '*', 7]]]
- expr: [9, '+', [2, '*', 3, '*', 7]]
  - expr: [2, '*', 3, '*', 7]
    - op: ['*', '*']
    - value: [2, 3, 7]
  - op: ['+']
  - value: [9]


<expr>
  <expr>
    <value>9</value>
    <op>+</op>
    <expr>
      <value>2</value>
      <op>*</op>
      <value>3</value>
      <op>*</op>
      <value>7</value>
    </expr>
  </expr>
</expr>
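The effect of the trailing '*' on the results names above is easier to see in isolation. This is a minimal sketch with made-up expressions (last_only and keep_all are illustrative names, not part of the grammar in the question):

```python
from pyparsing import OneOrMore, Word, nums

# Plain name: each new match overwrites the previous one under "n".
last_only = OneOrMore(Word(nums)("n"))
# Trailing '*': every match is accumulated under "n".
keep_all = OneOrMore(Word(nums)("n*"))

last_value = last_only.parseString("1 2 3")["n"]
all_values = list(keep_all.parseString("1 2 3")["n"])

print(last_value)   # only the last match survives
print(all_values)   # all three matches are kept
```

This is why op and value show up as lists in the dump above once the names are written as "op*" and "value*".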

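Note: I generally discourage people from using asXML, since it has to do some guessing to create its output. You are probably better off navigating the parsed results manually. Also, look at some of the examples on the pyparsing wiki Examples page, especially SimpleBool.py, which uses classes for the per-level parse actions used in infixNotation.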

EDIT ::

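At this point, I would really like to discourage you from continuing to use results names to guide the evaluation of your parsed results. Please look at these two methods for traversing the parsed tokens (note that the method you were looking for is getName, not getResultName):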

result = arithmetic_expr.parseString('9 + 2 * 4 * 6')

def iterate_over_parsed_expr(tokens):
    for t in tokens:
        if isinstance(t, ParseResults):
            tag = t.getName()
            print(t, 'is', tag)
            iterate_over_parsed_expr(t)
        else:
            print(t, 'is', type(t))

iterate_over_parsed_expr(result)

import operator
op_map = {
    '+' : operator.add,
    '-' : operator.sub,
    '*' : operator.mul,
    '/' : operator.truediv
    }
def eval_parsed_expr(tokens):
    t = tokens
    if isinstance(t, ParseResults):
        # evaluate initial value as left-operand
        cur_value = eval_parsed_expr(t[0])
        # iterate through remaining tokens, as operator-operand pairs
        for op, operand in zip(t[1::2], t[2::2]):
            # look up the correct binary function for operator
            op_func = op_map[op]
            # evaluate function, and update cur_value with result
            cur_value = op_func(cur_value, eval_parsed_expr(operand))

        # no more tokens, return the value
        return cur_value
    else:
        # token is just a scalar int or float, just return it
        return t

print(eval_parsed_expr(result))  # gives 57, which I think is the right answer

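eval_parsed_expr relies on the structure of the parsed tokens, rather than on results names. For this limited case, the operators are all binary, so for each nested structure the resulting tokens are "value [op value]...", where the values themselves can be ints, floats, or nested ParseResults, but never strs, at least not for the 4 binary operators I've hard-coded in this method. Rather than trying to special-case this to death to handle unary operations and right-associative operations, look at how this is done in eval_arith.py (http://pyparsing.wikispaces.com/file/view/eval_arith.py/68273277/eval_arith.py), by associating evaluator classes with each operand type and each level of the infixNotation.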
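To give a rough idea of the evaluator-class approach, here is a minimal sketch. It is not a copy of eval_arith.py: the class names (EvalSign, EvalPower, EvalBinary) and the helpers evaluate and calc are made up for illustration, and the factorial level and identifiers from the question are left out to keep it short.

```python
import operator
from pyparsing import infixNotation, oneOf, opAssoc, pyparsing_common


def evaluate(item):
    # Operands are either plain numbers or evaluator objects built by parse actions.
    return item.eval() if hasattr(item, "eval") else item


class EvalSign:
    """Unary sign level: the group looks like [sign, operand]."""
    def __init__(self, tokens):
        self.sign, self.operand = tokens[0]

    def eval(self):
        value = evaluate(self.operand)
        return -value if self.sign == "-" else value


class EvalPower:
    """Right-associative level: fold the operands from right to left."""
    def __init__(self, tokens):
        self.operands = tokens[0][0::2]   # skip the '^' tokens

    def eval(self):
        value = evaluate(self.operands[-1])
        for base in reversed(self.operands[:-1]):
            value = evaluate(base) ** value
        return value


class EvalBinary:
    """Left-associative level: operand (op operand)*."""
    ops = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}

    def __init__(self, tokens):
        self.items = tokens[0]

    def eval(self):
        value = evaluate(self.items[0])
        for op, rhs in zip(self.items[1::2], self.items[2::2]):
            value = self.ops[op](value, evaluate(rhs))
        return value


# Each precedence level gets its own class as the parse action.
arith = infixNotation(pyparsing_common.number, [
    ("^", 2, opAssoc.RIGHT, EvalPower),
    (oneOf("+ -"), 1, opAssoc.RIGHT, EvalSign),
    (oneOf("* /"), 2, opAssoc.LEFT, EvalBinary),
    (oneOf("+ -"), 2, opAssoc.LEFT, EvalBinary),
])


def calc(text):
    return evaluate(arith.parseString(text, parseAll=True)[0])


print(calc("9 + 2 * 4 * 6"))
print(calc("2 ^ 3 ^ 2"))
print(calc("-(2 + 3)"))
```

Each class only needs to know the token shape of its own level, so adding another operator means adding another class rather than another special case in one big function.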

Issue with setResultName in pyparsing
