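The first parser generator I used was Parse::RecDescent. The guides and tutorials available for it were great, but its most useful feature was its debugging tools, specifically the tracing capability (activated by setting $RD_TRACE to 1). I am looking for a parser generator that can help you debug its rules.
The catch is that it has to be written in Python or Ruby, and it has to have a verbose/trace mode or some very helpful debugging techniques.
Does anyone know of such a parser generator?
EDIT: When I said debugging, I wasn't referring to debugging Python or Ruby. I was referring to debugging the parser generator: seeing what it is doing at every step, every character it is reading, and the rules it is trying to match. I hope you get the point.
BOUNTY EDIT: To win the bounty, please show a parser generator framework and illustrate some of its debugging features. I repeat, I'm not interested in pdb, but in the parser's own debugging framework. Also, please don't mention Treetop. I'm not interested in it.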
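Answer 0 (score: 6)
Python is a pretty easy language to debug. You can just do import pdb; pdb.set_trace().
However, these parser generators are supposed to come with good debugging features.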
http://pyparsing.wikispaces.com/
In response to the bounty
Here is PLY debugging in action.
Source code
tokens = (
    'NAME','NUMBER',
    )

literals = ['=','+','-','*','/', '(',')']

# Tokens
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lexer.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Build the lexer
import ply.lex as lex
lex.lex(debug=1)

# Parsing rules
precedence = (
    ('left','+','-'),
    ('left','*','/'),
    ('right','UMINUS'),
    )

# dictionary of names
names = { }

def p_statement_assign(p):
    'statement : NAME "=" expression'
    names[p[1]] = p[3]

def p_statement_expr(p):
    'statement : expression'
    print(p[1])

def p_expression_binop(p):
    '''expression : expression '+' expression
                  | expression '-' expression
                  | expression '*' expression
                  | expression '/' expression'''
    if p[2] == '+' : p[0] = p[1] + p[3]
    elif p[2] == '-': p[0] = p[1] - p[3]
    elif p[2] == '*': p[0] = p[1] * p[3]
    elif p[2] == '/': p[0] = p[1] / p[3]

def p_expression_uminus(p):
    "expression : '-' expression %prec UMINUS"
    p[0] = -p[2]

def p_expression_group(p):
    "expression : '(' expression ')'"
    p[0] = p[2]

def p_expression_number(p):
    "expression : NUMBER"
    p[0] = p[1]

def p_expression_name(p):
    "expression : NAME"
    try:
        p[0] = names[p[1]]
    except LookupError:
        print("Undefined name '%s'" % p[1])
        p[0] = 0

def p_error(p):
    if p:
        print("Syntax error at '%s'" % p.value)
    else:
        print("Syntax error at EOF")

import ply.yacc as yacc
yacc.yacc()

import logging
logging.basicConfig(
    level=logging.INFO,
    filename="parselog.txt"
)

while 1:
    try:
        s = raw_input('calc > ')
    except EOFError:
        break
    if not s: continue
    yacc.parse(s, debug=1)
Output
lex: tokens = ('NAME', 'NUMBER')
lex: literals = ['=', '+', '-', '*', '/', '(', ')']
lex: states = {'INITIAL': 'inclusive'}
lex: Adding rule t_NUMBER -> '\d+' (state 'INITIAL')
lex: Adding rule t_newline -> '\n+' (state 'INITIAL')
lex: Adding rule t_NAME -> '[a-zA-Z_][a-zA-Z0-9_]*' (state 'INITIAL')
lex: ==== MASTER REGEXS FOLLOW ====
lex: state 'INITIAL' : regex[0] = '(?P<t_NUMBER>\d+)|(?P<t_newline>\n+)|(?P<t_NAME>[a-zA-Z_][a-zA-Z0-9_]*)'
calc > 2+3
PLY: PARSE DEBUG START
State : 0
Stack : . LexToken(NUMBER,2,1,0)
Action : Shift and goto state 3
State : 3
Stack : NUMBER . LexToken(+,'+',1,1)
Action : Reduce rule [expression -> NUMBER] with [2] and goto state 9
Result : <int @ 0x1a1896c> (2)
State : 6
Stack : expression . LexToken(+,'+',1,1)
Action : Shift and goto state 12
State : 12
Stack : expression + . LexToken(NUMBER,3,1,2)
Action : Shift and goto state 3
State : 3
Stack : expression + NUMBER . $end
Action : Reduce rule [expression -> NUMBER] with [3] and goto state 9
Result : <int @ 0x1a18960> (3)
State : 18
Stack : expression + expression . $end
Action : Reduce rule [expression -> expression + expression] with [2,'+',3] and goto state 3
Result : <int @ 0x1a18948> (5)
State : 6
Stack : expression . $end
Action : Reduce rule [statement -> expression] with [5] and goto state 2
5
Result : <NoneType @ 0x1e1ccef4> (None)
State : 4
Stack : statement . $end
Done : Returning <NoneType @ 0x1e1ccef4> (None)
PLY: PARSE DEBUG END
calc >
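For longer inputs the trace gets hard to scan by eye. Below is a minimal sketch (my own helper, not part of PLY) that groups the State/Stack/Action lines of a trace like the one above into one record per parser step. The name summarize and the sample text are assumptions for illustration; the line format is simply what the output above shows.

```python
import re

# A few lines copied from the PLY trace above.
TRACE = """\
State : 0
Stack : . LexToken(NUMBER,2,1,0)
Action : Shift and goto state 3
State : 3
Stack : NUMBER . LexToken(+,'+',1,1)
Action : Reduce rule [expression -> NUMBER] with [2] and goto state 9
Result : <int @ 0x1a1896c> (2)
"""

# Only these three line kinds are collected; everything else is skipped.
LINE = re.compile(r"\s*(State|Stack|Action)\s*:\s*(.*)$")

def summarize(trace):
    """Group State/Stack/Action lines into one dict per parser step (hypothetical helper)."""
    steps = []
    for line in trace.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip Result lines, prompts, and program output
        key, value = m.group(1).lower(), m.group(2).strip()
        if key == "state":
            steps.append({"state": int(value)})  # a State line starts a new step
        elif steps:
            steps[-1][key] = value
    return steps

steps = summarize(TRACE)
for step in steps:
    kind = step.get("action", "?").split()[0]  # "Shift" or "Reduce"
    print("state %2d: %-6s | stack: %s" % (step["state"], kind, step.get("stack", "")))
```

Once the steps are plain dicts, it is easy to filter for, say, only the reduce actions of one rule.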
The parsing table generated in parser.out
Created by PLY version 3.2 (http://www.dabeaz.com/ply)
Grammar
Rule 0 S' -> statement
Rule 1 statement -> NAME = expression
Rule 2 statement -> expression
Rule 3 expression -> expression + expression
Rule 4 expression -> expression - expression
Rule 5 expression -> expression * expression
Rule 6 expression -> expression / expression
Rule 7 expression -> - expression
Rule 8 expression -> ( expression )
Rule 9 expression -> NUMBER
Rule 10 expression -> NAME
Terminals, with rules where they appear
( : 8
) : 8
* : 5
+ : 3
- : 4 7
/ : 6
= : 1
NAME : 1 10
NUMBER : 9
error :
Nonterminals, with rules where they appear
expression : 1 2 3 3 4 4 5 5 6 6 7 8
statement : 0
Parsing method: LALR
state 0
(0) S' -> . statement
(1) statement -> . NAME = expression
(2) statement -> . expression
(3) expression -> . expression + expression
(4) expression -> . expression - expression
(5) expression -> . expression * expression
(6) expression -> . expression / expression
(7) expression -> . - expression
(8) expression -> . ( expression )
(9) expression -> . NUMBER
(10) expression -> . NAME
NAME shift and go to state 1
- shift and go to state 2
( shift and go to state 5
NUMBER shift and go to state 3
expression shift and go to state 6
statement shift and go to state 4
state 1
(1) statement -> NAME . = expression
(10) expression -> NAME .
= shift and go to state 7
+ reduce using rule 10 (expression -> NAME .)
- reduce using rule 10 (expression -> NAME .)
* reduce using rule 10 (expression -> NAME .)
/ reduce using rule 10 (expression -> NAME .)
$end reduce using rule 10 (expression -> NAME .)
state 2
(7) expression -> - . expression
(3) expression -> . expression + expression
(4) expression -> . expression - expression
(5) expression -> . expression * expression
(6) expression -> . expression / expression
(7) expression -> . - expression
(8) expression -> . ( expression )
(9) expression -> . NUMBER
(10) expression -> . NAME
- shift and go to state 2
( shift and go to state 5
NUMBER shift and go to state 3
NAME shift and go to state 8
expression shift and go to state 9
state 3
(9) expression -> NUMBER .
+ reduce using rule 9 (expression -> NUMBER .)
- reduce using rule 9 (expression -> NUMBER .)
* reduce using rule 9 (expression -> NUMBER .)
/ reduce using rule 9 (expression -> NUMBER .)
$end reduce using rule 9 (expression -> NUMBER .)
) reduce using rule 9 (expression -> NUMBER .)
state 4
(0) S' -> statement .
state 5
(8) expression -> ( . expression )
(3) expression -> . expression + expression
(4) expression -> . expression - expression
(5) expression -> . expression * expression
(6) expression -> . expression / expression
(7) expression -> . - expression
(8) expression -> . ( expression )
(9) expression -> . NUMBER
(10) expression -> . NAME
- shift and go to state 2
( shift and go to state 5
NUMBER shift and go to state 3
NAME shift and go to state 8
expression shift and go to state 10
state 6
(2) statement -> expression .
(3) expression -> expression . + expression
(4) expression -> expression . - expression
(5) expression -> expression . * expression
(6) expression -> expression . / expression
$end reduce using rule 2 (statement -> expression .)
+ shift and go to state 12
- shift and go to state 11
* shift and go to state 13
/ shift and go to state 14
state 7
(1) statement -> NAME = . expression
(3) expression -> . expression + expression
(4) expression -> . expression - expression
(5) expression -> . expression * expression
(6) expression -> . expression / expression
(7) expression -> . - expression
(8) expression -> . ( expression )
(9) expression -> . NUMBER
(10) expression -> . NAME
- shift and go to state 2
( shift and go to state 5
NUMBER shift and go to state 3
NAME shift and go to state 8
expression shift and go to state 15
state 8
(10) expression -> NAME .
+ reduce using rule 10 (expression -> NAME .)
- reduce using rule 10 (expression -> NAME .)
* reduce using rule 10 (expression -> NAME .)
/ reduce using rule 10 (expression -> NAME .)
$end reduce using rule 10 (expression -> NAME .)
) reduce using rule 10 (expression -> NAME .)
state 9
(7) expression -> - expression .
(3) expression -> expression . + expression
(4) expression -> expression . - expression
(5) expression -> expression . * expression
(6) expression -> expression . / expression
+ reduce using rule 7 (expression -> - expression .)
- reduce using rule 7 (expression -> - expression .)
* reduce using rule 7 (expression -> - expression .)
/ reduce using rule 7 (expression -> - expression .)
$end reduce using rule 7 (expression -> - expression .)
) reduce using rule 7 (expression -> - expression .)
! + [ shift and go to state 12 ]
! - [ shift and go to state 11 ]
! * [ shift and go to state 13 ]
! / [ shift and go to state 14 ]
state 10
(8) expression -> ( expression . )
(3) expression -> expression . + expression
(4) expression -> expression . - expression
(5) expression -> expression . * expression
(6) expression -> expression . / expression
) shift and go to state 16
+ shift and go to state 12
- shift and go to state 11
* shift and go to state 13
/ shift and go to state 14
state 11
(4) expression -> expression - . expression
(3) expression -> . expression + expression
(4) expression -> . expression - expression
(5) expression -> . expression * expression
(6) expression -> . expression / expression
(7) expression -> . - expression
(8) expression -> . ( expression )
(9) expression -> . NUMBER
(10) expression -> . NAME
- shift and go to state 2
( shift and go to state 5
NUMBER shift and go to state 3
NAME shift and go to state 8
expression shift and go to state 17
state 12
(3) expression -> expression + . expression
(3) expression -> . expression + expression
(4) expression -> . expression - expression
(5) expression -> . expression * expression
(6) expression -> . expression / expression
(7) expression -> . - expression
(8) expression -> . ( expression )
(9) expression -> . NUMBER
(10) expression -> . NAME
- shift and go to state 2
( shift and go to state 5
NUMBER shift and go to state 3
NAME shift and go to state 8
expression shift and go to state 18
state 13
(5) expression -> expression * . expression
(3) expression -> . expression + expression
(4) expression -> . expression - expression
(5) expression -> . expression * expression
(6) expression -> . expression / expression
(7) expression -> . - expression
(8) expression -> . ( expression )
(9) expression -> . NUMBER
(10) expression -> . NAME
- shift and go to state 2
( shift and go to state 5
NUMBER shift and go to state 3
NAME shift and go to state 8
expression shift and go to state 19
state 14
(6) expression -> expression / . expression
(3) expression -> . expression + expression
(4) expression -> . expression - expression
(5) expression -> . expression * expression
(6) expression -> . expression / expression
(7) expression -> . - expression
(8) expression -> . ( expression )
(9) expression -> . NUMBER
(10) expression -> . NAME
- shift and go to state 2
( shift and go to state 5
NUMBER shift and go to state 3
NAME shift and go to state 8
expression shift and go to state 20
state 15
(1) statement -> NAME = expression .
(3) expression -> expression . + expression
(4) expression -> expression . - expression
(5) expression -> expression . * expression
(6) expression -> expression . / expression
$end reduce using rule 1 (statement -> NAME = expression .)
+ shift and go to state 12
- shift and go to state 11
* shift and go to state 13
/ shift and go to state 14
state 16
(8) expression -> ( expression ) .
+ reduce using rule 8 (expression -> ( expression ) .)
- reduce using rule 8 (expression -> ( expression ) .)
* reduce using rule 8 (expression -> ( expression ) .)
/ reduce using rule 8 (expression -> ( expression ) .)
$end reduce using rule 8 (expression -> ( expression ) .)
) reduce using rule 8 (expression -> ( expression ) .)
state 17
(4) expression -> expression - expression .
(3) expression -> expression . + expression
(4) expression -> expression . - expression
(5) expression -> expression . * expression
(6) expression -> expression . / expression
+ reduce using rule 4 (expression -> expression - expression .)
- reduce using rule 4 (expression -> expression - expression .)
$end reduce using rule 4 (expression -> expression - expression .)
) reduce using rule 4 (expression -> expression - expression .)
* shift and go to state 13
/ shift and go to state 14
! * [ reduce using rule 4 (expression -> expression - expression .) ]
! / [ reduce using rule 4 (expression -> expression - expression .) ]
! + [ shift and go to state 12 ]
! - [ shift and go to state 11 ]
state 18
(3) expression -> expression + expression .
(3) expression -> expression . + expression
(4) expression -> expression . - expression
(5) expression -> expression . * expression
(6) expression -> expression . / expression
+ reduce using rule 3 (expression -> expression + expression .)
- reduce using rule 3 (expression -> expression + expression .)
$end reduce using rule 3 (expression -> expression + expression .)
) reduce using rule 3 (expression -> expression + expression .)
* shift and go to state 13
/ shift and go to state 14
! * [ reduce using rule 3 (expression -> expression + expression .) ]
! / [ reduce using rule 3 (expression -> expression + expression .) ]
! + [ shift and go to state 12 ]
! - [ shift and go to state 11 ]
state 19
(5) expression -> expression * expression .
(3) expression -> expression . + expression
(4) expression -> expression . - expression
(5) expression -> expression . * expression
(6) expression -> expression . / expression
+ reduce using rule 5 (expression -> expression * expression .)
- reduce using rule 5 (expression -> expression * expression .)
* reduce using rule 5 (expression -> expression * expression .)
/ reduce using rule 5 (expression -> expression * expression .)
$end reduce using rule 5 (expression -> expression * expression .)
) reduce using rule 5 (expression -> expression * expression .)
! + [ shift and go to state 12 ]
! - [ shift and go to state 11 ]
! * [ shift and go to state 13 ]
! / [ shift and go to state 14 ]
state 20
(6) expression -> expression / expression .
(3) expression -> expression . + expression
(4) expression -> expression . - expression
(5) expression -> expression . * expression
(6) expression -> expression . / expression
+ reduce using rule 6 (expression -> expression / expression .)
- reduce using rule 6 (expression -> expression / expression .)
* reduce using rule 6 (expression -> expression / expression .)
/ reduce using rule 6 (expression -> expression / expression .)
$end reduce using rule 6 (expression -> expression / expression .)
) reduce using rule 6 (expression -> expression / expression .)
! + [ shift and go to state 12 ]
! - [ shift and go to state 11 ]
! * [ shift and go to state 13 ]
! / [ shift and go to state 14 ]
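The lines starting with ! in states 9 and 17 to 20 appear to be the actions that lost out when a shift/reduce conflict was settled by the precedence table. Here is a rough sketch of the usual yacc-style rule, as I understand it: compare the precedence of the rule being reduced with that of the lookahead token, and let associativity break ties. The PRECEDENCE dict and the resolve function are my own illustration, not PLY code, and nonassoc is left out because the grammar above doesn't use it.

```python
# Levels taken from the `precedence` tuple in the PLY source above:
# later entries bind tighter. (Hypothetical table for illustration.)
PRECEDENCE = {
    "+": ("left", 1), "-": ("left", 1),
    "*": ("left", 2), "/": ("left", 2),
    "UMINUS": ("right", 3),
}

def resolve(rule_prec, lookahead):
    """Return 'reduce' or 'shift' for a shift/reduce conflict.

    rule_prec is the token (or %prec name) that gives the rule its precedence;
    lookahead is the next input token.
    """
    _, rule_level = PRECEDENCE[rule_prec]
    assoc, token_level = PRECEDENCE[lookahead]
    if rule_level > token_level:
        return "reduce"
    if rule_level < token_level:
        return "shift"
    # Same level: left-associative operators reduce, right-associative ones shift.
    return "reduce" if assoc == "left" else "shift"

# State 18 holds "expression + expression ." with various lookaheads.
for tok in "+-*/":
    print("rule '+', lookahead %r -> %s" % (tok, resolve("+", tok)))
```

This agrees with state 18 in the table above: + and - reduce using rule 3, while * and / shift, and the rejected alternatives are the ones listed with !.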
Answer 1 (score: 2)
I don't know anything about its debugging features, but I've heard good things about PyParsing.
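Answer 2 (score: 2)
I know the bounty has already been claimed, but here is an equivalent parser written in pyparsing (plus support for function calls with zero or more comma-delimited arguments):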
from pyparsing import *
LPAR, RPAR = map(Suppress,"()")
EQ = Literal("=")
name = Word(alphas, alphanums+"_").setName("name")
number = Word(nums).setName("number")
expr = Forward()
operand = Optional('-') + (Group(name + LPAR +
                                 Group(Optional(delimitedList(expr))) +
                                 RPAR) |
                           name |
                           number |
                           Group(LPAR + expr + RPAR))
binop = oneOf("+ - * / **")
expr << (Group(operand + OneOrMore(binop + operand)) | operand)
assignment = name + EQ + expr
statement = assignment | expr
This test code runs the parser through its basic paces:
tests = """\
    sin(pi/2)
    y = mx+b
    E = mc ** 2
    F = m*a
    x = x0 + v*t +a*t*t/2
    1 - sqrt(sin(t)**2 + cos(t)**2)""".splitlines()

for t in tests:
    print t.strip()
    print statement.parseString(t).asList()
    print
Giving this output:
sin(pi/2)
[['sin', [['pi', '/', '2']]]]
y = mx+b
['y', '=', ['mx', '+', 'b']]
E = mc ** 2
['E', '=', ['mc', '**', '2']]
F = m*a
['F', '=', ['m', '*', 'a']]
x = x0 + v*t +a*t*t/2
['x', '=', ['x0', '+', 'v', '*', 't', '+', 'a', '*', 't', '*', 't', '/', '2']]
1 - sqrt(sin(t)**2 + cos(t)**2)
[['1', '-', ['sqrt', [[['sin', ['t']], '**', '2', '+', ['cos', ['t']], '**', '2']]]]]
For debugging, we add this code:
# enable debugging for name and number expressions
name.setDebug()
number.setDebug()
Now we reparse the first test (displaying the input string and a simple column ruler):
t = tests[0]
print ("1234567890"*10)[:len(t)]
print t
statement.parseString(t)
print
Giving this output:
1234567890123
    sin(pi/2)
Match name at loc 4(1,5)
Matched name -> ['sin']
Match name at loc 4(1,5)
Matched name -> ['sin']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match number at loc 11(1,12)
Matched number -> ['2']
Match name at loc 4(1,5)
Matched name -> ['sin']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match number at loc 11(1,12)
Matched number -> ['2']
Pyparsing also supports packrat parsing, a sort of parse-time memoization (read more about packratting here). Here is the same parsing sequence, but with packrat enabled:
same parse, but with packrat parsing enabled
1234567890123
    sin(pi/2)
Match name at loc 4(1,5)
Matched name -> ['sin']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match number at loc 11(1,12)
Matched number -> ['2']
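To make the memoization idea concrete without needing pyparsing installed, here is a small standard-library sketch. It replays the locations at which the first (non-packrat) trace tries name, once with a plain matcher and once with a cached one. This is an idealized memo table built on functools.lru_cache, not pyparsing's actual cache, and scan_name, match_plain, and match_memo are names I made up for the illustration. In pyparsing itself, packrat mode is switched on with ParserElement.enablePackrat().

```python
import functools

TEXT = "    sin(pi/2)"
# Locations at which the non-packrat trace above attempts `name` at first.
ATTEMPTS = [4, 4, 8, 8, 11, 11]

def scan_name(text, loc):
    """Return the end index of a name starting at loc, or None if there is no match."""
    if loc >= len(text) or not text[loc].isalpha():
        return None
    end = loc + 1
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    return end

plain_calls = 0
def match_plain(loc):
    """Rescan the input on every attempt."""
    global plain_calls
    plain_calls += 1
    return scan_name(TEXT, loc)

memo_calls = 0
@functools.lru_cache(maxsize=None)
def match_memo(loc):
    """Scan each location at most once; repeats are answered from the cache."""
    global memo_calls
    memo_calls += 1
    return scan_name(TEXT, loc)

plain_results = [match_plain(loc) for loc in ATTEMPTS]
memo_results = [match_memo(loc) for loc in ATTEMPTS]
print("plain scans:", plain_calls, "memoized scans:", memo_calls)
```

Both versions return the same results; the cached one just does the real work once per location, which is the effect packrat parsing aims for when a backtracking grammar retries the same expression at the same position.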
This was an interesting exercise, and it helped me to see the debugging features of other parser libraries.
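Answer 3 (score: 1)
ANTLR, mentioned above, has the advantage of generating human-readable and understandable code, since it is a (very sophisticated and powerful) top-down parser, so you can step through it with a regular debugger and see what it is really doing.
That's why it is my parser generator of choice.
Bottom-up parser generators like PLY have the disadvantage that for larger grammars it is almost impossible to understand what the debugging output really means and why the parsing table is the way it is.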
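To illustrate why top-down code is easy to follow, here is a tiny hand-written recursive-descent evaluator with a trace switch. It is only a sketch of the general idea: one method per rule, each logging when it is tried and what it returned. It is not ANTLR output, and the class name TracingParser and its trace flag are my own inventions.

```python
class TracingParser:
    """Recursive-descent evaluator for + - * / and parentheses that logs each rule."""

    def __init__(self, text, trace=False):
        self.text = text
        self.pos = 0
        self.depth = 0
        self.trace = trace   # print log lines as they are produced
        self.log = []        # log lines are always collected here

    def _note(self, message):
        line = "  " * self.depth + message
        self.log.append(line)
        if self.trace:
            print(line)

    def _peek(self):
        """Skip blanks and return the next character ('' at end of input)."""
        while self.pos < len(self.text) and self.text[self.pos] == " ":
            self.pos += 1
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expression(self):
        self._note("try expression at %d" % self.pos)
        self.depth += 1
        value = self.term()
        while self._peek() in ("+", "-"):
            op = self.text[self.pos]
            self.pos += 1
            right = self.term()
            value = value + right if op == "+" else value - right
        self.depth -= 1
        self._note("expression -> %r" % value)
        return value

    def term(self):
        self._note("try term at %d" % self.pos)
        self.depth += 1
        value = self.factor()
        while self._peek() in ("*", "/"):
            op = self.text[self.pos]
            self.pos += 1
            right = self.factor()
            value = value * right if op == "*" else value / right
        self.depth -= 1
        self._note("term -> %r" % value)
        return value

    def factor(self):
        ch = self._peek()
        self._note("try factor at %d (next char %r)" % (self.pos, ch))
        if ch == "(":
            self.pos += 1
            value = self.expression()
            if self._peek() != ")":
                raise SyntaxError("expected ')' at %d" % self.pos)
            self.pos += 1
            return value
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos].isdigit():
            self.pos += 1
        if start == self.pos:
            raise SyntaxError("expected number at %d" % start)
        return int(self.text[start:self.pos])


parser = TracingParser("2+3*4", trace=True)
result = parser.expression()
```

Because each rule is an ordinary method, a breakpoint inside term or factor shows the call stack as the chain of rules currently being matched, which is much closer to the grammar than a table of numbered states.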
Answer 4 (score: 0)
The Python wiki has a list of language parsers written in Python.