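I just read an article about implementing a parser in Python: http://effbot.org/zone/simple-top-down-parsing.htm
The general idea behind the code is described in this paper: http://mauke.hopto.org/stuff/papers/p41-pratt.pdf
I am fairly new to writing parsers in Python, so I tried to write something similar as a learning exercise. However, when I write code along the lines of the article, I get a TypeError: unbound method TypeError. This is the first time I have run into this kind of error, and I have spent a whole day trying to figure it out without success. Here is a minimal (complete) code example that has the problem: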
import re
class Symbol_base(object):
    """ A base class for all symbols"""
    id = None # node/token type name
    value = None #used by literals
    first = second = third = None #used by tree nodes
    def nud(self):
        """ A default implementation for nud """
        raise SyntaxError("Syntax error (%r)." % self.id)
    def led(self,left):
        """ A default implementation for led """
        raise SyntaxError("Unknown operator (%r)." % self.id)
    def __repr__(self):
        if self.id == "(name)" or self.id == "(literal)":
            return "(%s %s)" % (self.id[1:-1], self.value)
        out = [self.id, self.first, self.second, self.third]
        out = map(str, filter(None,out))
        return "(" + " ".join(out) + ")"
symbol_table = {}
def symbol(id, bindingpower=0):
    """ If a given symbol is found in the symbol_table return it.
    If the symbol cannot be found then create the appropriate class
    and add that to the symbol_table."""
    try:
        s = symbol_table[id]
    except KeyError:
        class s(Symbol_base):
            pass
        s.__name__ = "symbol:" + id #for debugging purposes
        s.id = id
        s.lbp = bindingpower
        symbol_table[id] = s
    else:
        s.lbp = max(bindingpower,s.lbp)
    return s
def infix(id, bp):
    """ Helper function for defining the symbols for infix operations """
    def infix_led(self, left):
        self.first = left
        self.second = expression(bp)
        return self
    symbol(id, bp).led = infix_led
#define all the symbols
infix("+", 10)
symbol("(literal)").nud = lambda self: self #literal values must return the symbol itself
symbol("(end)")
token_pat = re.compile(r"\s*(?:(\d+)|(.))")
def tokenize(program):
    for number, operator in token_pat.findall(program):
        if number:
            symbol = symbol_table["(literal)"]
            s = symbol()
            s.value = number
            yield s
        else:
            symbol = symbol_table.get(operator)
            if not symbol:
                raise SyntaxError("Unknown operator")
            yield symbol
    symbol = symbol_table["(end)"]
    yield symbol()
def expression(rbp = 0):
    global token
    t = token
    token = next()
    left = t.nud()
    while rbp < token.lbp:
        t = token
        token = next()
        left = t.led(left)
    return left
def parse(program):
    global token, next
    next = tokenize(program).next
    token = next()
    return expression()
def __main__():
    print parse("1 + 2")
if __name__ == "__main__":
    __main__()
When I try to run it with pypy:
Traceback (most recent call last):
  File "app_main.py", line 72, in run_toplevel
  File "parser_code_issue.py", line 93, in <module>
    __main__()
  File "parser_code_issue.py", line 90, in __main__
    print parse("1 + 2")
  File "parser_code_issue.py", line 87, in parse
    return expression()
  File "parser_code_issue.py", line 81, in expression
    left = t.led(left)
TypeError: unbound method infix_led() must be called with symbol:+ instance as first argument (got symbol:(literal) instance instead)
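I am guessing this happens because I do not create an instance for the infix operations, but I do not really want to create an instance at that point. Is there a way to change those methods without creating an instance?
Any help explaining why this happens and what I can do to fix the code is much appreciated!
Also, does this behavior change in Python 3?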
Answer 0 (score: 3)
You forgot to create an instance of the symbol in the tokenize() function; when the token is not a number, yield symbol(), not symbol:
else:
    symbol = symbol_table.get(operator)
    if not symbol:
        raise SyntaxError("Unknown operator")
    yield symbol()
With this one change your code prints:
(+ (literal 1) (literal 2))
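To see why the class-versus-instance distinction matters, here is a minimal sketch that is separate from the parser. Plus is a hypothetical stand-in for the class that symbol("+") creates, and infix_led is trimmed so it no longer calls expression(). Python 2 reports the bad call as an "unbound method" TypeError. Python 3 has no unbound methods, so in this sketch the same mistake shows up as a TypeError about a missing argument instead; the fix (yielding an instance) is the same either way.

```python
class Plus(object):
    """Hypothetical stand-in for the class created by symbol("+")."""
    first = second = None


def infix_led(self, left):
    # Same shape as the question's infix_led, without the expression() call.
    self.first = left
    return self


# Attach the function to the class, as infix() does in the question.
Plus.led = infix_led

literal = object()  # placeholder for the left-hand operand

# Called on an instance: Python binds `self` to the instance automatically.
node = Plus()
result = node.led(literal)

# Called on the class itself, which is what happens when tokenize()
# yields the class instead of an instance: nothing is bound to `self`.
error = None
try:
    Plus.led(literal)
except TypeError as exc:
    error = exc
```

After this runs, result is the node with its first attribute set, while error holds the TypeError raised by the class-level call.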
Answer 1 (score: 1)
You have not bound the new function to an instance of the object.
import types
obj = symbol(id, bp)
obj.led = types.MethodType(infix_led, obj)
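As a rough illustration of what types.MethodType does here, the sketch below binds a function first to one instance and then to a class. Plus and the simplified infix_led are hypothetical stand-ins, not the question's exact code. Note that in the question symbol(id, bp) returns a class, so binding to it makes self the class itself, and anything stored on self is then shared by all uses of that symbol.

```python
import types


class Plus(object):
    """Hypothetical stand-in for the class created by symbol("+")."""
    first = None


def infix_led(self, left):
    # Simplified version of the question's infix_led (no expression() call).
    self.first = left
    return self


# Bind to a single instance: `self` is that instance.
node = Plus()
node.led = types.MethodType(infix_led, node)
bound_result = node.led("x")

# Bind to the class object: `self` is the class, so state lands on the class.
Plus.led = types.MethodType(infix_led, Plus)
class_result = Plus.led("y")
```

With the class-level binding, Plus.first is overwritten on every call, which may not be what a parser that builds a separate node per operator wants.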