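Getting parsimonious to print useful error messages for a sequence of blocks

Time: 2016-11-28 20:39:15

Tags: python parsing peg parsimonious

I'm using parsimonious (a Python PEG parser library) to parse text that looks like this: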

text = """
block block_name_0
{
    foo
}

block block_name_1
{
    bar
}

"""

The whole text is a sequence of blocks, each with a simple requirement on its body (it must be alphanumeric). Here is the grammar:

from parsimonious.grammar import Grammar

grammar = Grammar(r"""
file = block+
block = _ "block" _ alphanum _ start_brace _ block_body _ end_brace _
block_body = alphanum+
alphanum = ~"[_A-z0-9]+"
_ = ~"[\\n\\s]*"
start_brace = "{"
end_brace = "}"
""")

print(grammar.parse(text))

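The problem I'm running into is that if there is a parse error in any block after the first one, I get an unhelpful error message. As an example, consider the following text: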

text = """
block block_name_0
{
    !foo
}

block block_name_1
{
    bar
}

"""

This gives a useful error message:

[omitted stack trace]
  File "/lib/parsimonious/expressions.py", line 127, in match
    raise error
parsimonious.exceptions.ParseError: Rule 'block_body' didn't match at '!foo
}

However, if I have the following text:

text = """
block block_name_0
{
    foo
}

block block_name_1
{
    !bar
}

"""

I get this error:

  File "/lib/parsimonious/expressions.py", line 112, in parse
    raise IncompleteParseError(text, node.end, self)
parsimonious.exceptions.IncompleteParseError: Rule 'file' matched in its entirety, but it didn't consume all the text. The non-matching portion of the text begins with 'block block_name_1
{' (line 7, column 1).

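It looks like it matches the first instance of the sequence (the first block), but when it fails on the second block it does not treat the whole parse as a failure, which is what I want it to do. I would like an error similar to the one I get for block 0, so that I know exactly what is wrong inside the block rather than only that the block as a whole could not be parsed.

Any help is greatly appreciated!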

1 Answer:

Answer 0: (score: 0)

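Not a parsimonious answer, but for good error reporting support I suggest you try textX, or use its underlying PEG parser Arpeggio directly (disclaimer: I am the author of these libraries).

Using textX: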

from textx.metamodel import metamodel_from_str

grammar = """
Program: blocks+=Block ;

Block:
 'block' name=ID '{'
     body=Body
 '}'
;

Body: ID+ ;
"""

text = """
block block_name_0
{
    foo
}

block block_name_1
{
    !bar
}

"""

mm = metamodel_from_str(grammar)
program = mm.model_from_str(text)

textX / Arpeggio will parse as far as it can and report the exact location of the error:

textx.exceptions.TextXSyntaxError:
   Expected ID at position (9, 5) => 'e_1 {     *!bar }  '.

With textX you also get the AST for free, so you can do this:

for block in program.blocks:
    print(block.name, ':', block.body)

For debugging and investigation, textX also offers a nice visualization of grammars and models.