Python simple parse tree interpreter

Time: 2013-11-20 02:48:37

Tags: python parsing tree interpreter

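Hi, I have a function parse() that takes a list of operator and operand tokens (e.g. ['+', '20', '10']) and an index i, and builds a tree from them. When it finds an operator (a node with .left and .right), the function recurses downward until it reaches a literal (a number) or a letter variable. The problem is that I want the index i, which changes inside the recursion, to stay modified once the recursion on the left side finishes and the function moves on to the right side. For example, to get this tree from ['-', '//', 'y', '2', 'x']:

(Image: the expected tree, with a subtract node at the root, the division y // 2 as its left child, and x as its right child.)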

I wrote this:

def parse(tokens, i):
    """parse: tuple(String) * int -> (Node, int)
    From an infix stream of tokens, and the current index into the
    token stream, construct and return the tree, as a collection of Nodes, 
    that represent the expression."""

    if tokens[i].isdigit():
        return mkLiteralNode(tokens[i])

    elif tokens[i] == "+":
        return mkAddNode(parse( tokens, i + 1 ), parse( tokens, i + 2 )) # first argument is the node.left, second is the node.right
    elif tokens[i] == "-":
        return mkSubtractNode(parse( tokens, i + 1 ), parse( tokens, i + 2 ))
    elif tokens[i] == "*":
        return mkMultiplyNode(parse( tokens, i + 1 ), parse( tokens, i + 2 ))
    elif tokens[i] == "//":
        return mkDivideNode(parse( tokens, i + 1 ), parse( tokens, i + 2 ))

    else:
        return mkVariableNode(tokens[i])

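When the SubtractNode is being built, i is 0 and then increments normally, but when the left side is finished, i is 0 again, so parse(tokens, i + 2) points to y instead of x, which produces this:

(Image: the resulting tree, in which the right child of the subtract node is y instead of x.)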

How can I build the tree above without using pop() on tokens?

2 Answers:

Answer 0 (score: 4)

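This kind of thing is easier to write when you treat the tokens list as an object that is itself responsible for storing its position. That is why, in general, tokens should come from a Lexer, which usually has just one method: next_token. We will simulate it with an iterator: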

def parse(tokens):
    """parse: iterator or generator of tokens -> Node
    From a prefix stream of tokens, construct and return the tree,
    as nested tuples, that represents the expression. The iterator
    itself keeps track of the current position in the stream."""

    next_tok = next(tokens)

    if next_tok.isdigit():
        return ('literal', next_tok)

    elif next_tok == "+":
        return ('add', parse( tokens ), parse( tokens )) # first argument is the node.left, second is the node.right
    elif next_tok == "-":
        return ('sub', parse( tokens ), parse( tokens ))
    elif next_tok == "*":
        return ('mul', parse( tokens ), parse( tokens ))
    elif next_tok == "//":
        return ('div', parse( tokens ), parse( tokens ))

    else:
        return ('variable', next_tok )

# Example usage:
print(parse(iter(['-', '//', 'y', '2', 'x'])))

That will print:

('sub', ('div', ('variable', 'y'), ('literal', '2')), ('variable', 'x'))

You will also need to handle StopIteration exceptions and turn them into a meaningful ParseError.
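One way the StopIteration handling might look is sketched below. ParseError is only named in this answer, not defined, so the class here is a hypothetical stand-in; the BINARY_OPS table and the parse_all helper (which also rejects leftover tokens) are likewise assumptions added for illustration.

```python
class ParseError(Exception):
    """Hypothetical error type; the answer names it but does not define it."""


# Assumed lookup table mapping operator tokens to the node tags used above.
BINARY_OPS = {"+": "add", "-": "sub", "*": "mul", "//": "div"}


def parse(tokens):
    """Parse one prefix expression from an iterator of string tokens."""
    try:
        tok = next(tokens)
    except StopIteration:
        # The stream ended while an operand was still expected.
        raise ParseError("unexpected end of input") from None

    if tok.isdigit():
        return ("literal", tok)
    if tok in BINARY_OPS:
        left = parse(tokens)   # consumes the tokens of the left subtree
        right = parse(tokens)  # continues from wherever the left side stopped
        return (BINARY_OPS[tok], left, right)
    return ("variable", tok)


def parse_all(token_list):
    """Hypothetical wrapper: parse a whole list and reject leftover tokens."""
    tokens = iter(token_list)
    tree = parse(tokens)
    leftover = next(tokens, None)
    if leftover is not None:
        raise ParseError(f"unexpected token {leftover!r} after expression")
    return tree
```

With this sketch, parse_all(['+', '1']) raises ParseError instead of leaking a bare StopIteration to the caller.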

Answer 1 (score: 0)

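Have parse return the index of the next unused token along with the parse tree, and use that index to determine where to continue parsing. For example: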

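left, nextIndex = parse(tokens, i + 1)           # nextIndex: first token after the left subtree
right, returnIndex = parse(tokens, nextIndex)    # returnIndex: first token after the right subtree
return mkWhateverNode(left, right), returnIndex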
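A fuller version of this idea might look like the sketch below. The question's mk...Node helpers are not shown, so plain tuples stand in for nodes here; the BINARY_OPS table and the tuple tags are assumptions made for illustration.

```python
# Assumed lookup table; tuples stand in for the question's mk...Node helpers.
BINARY_OPS = {"+": "add", "-": "sub", "*": "mul", "//": "div"}


def parse(tokens, i):
    """Return (tree, index of the next unused token) for the prefix
    expression that starts at tokens[i]."""
    tok = tokens[i]
    if tok.isdigit():
        return ("literal", tok), i + 1
    if tok in BINARY_OPS:
        # The left subtree reports where it stopped, so the right
        # subtree can start exactly there.
        left, next_index = parse(tokens, i + 1)
        right, return_index = parse(tokens, next_index)
        return (BINARY_OPS[tok], left, right), return_index
    return ("variable", tok), i + 1


tree, end = parse(["-", "//", "y", "2", "x"], 0)
print(tree)  # ('sub', ('div', ('variable', 'y'), ('literal', '2')), ('variable', 'x'))
print(end)   # 5
```

Because every call hands back the position it reached, the token list is never mutated and no pop() is needed.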