How do I ignore a pyparsing ParseException and continue?

Time: 2019-04-22 15:47:56

Tags: python text-processing pyparsing

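I want to ignore the lines in a file that match none of my predefined parsers, and then carry on parsing. The lines I want to ignore vary widely, so I cannot go through them and define a parser for each one.

I catch the ParseException with try..except and pass. However, parsing stops immediately.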

try:
    return parser.parseFile(filename, parse_all)

except ParseException as err:
    # Build a readable report that points at the offending column
    msg = 'Error during parsing of {}, line {}'.format(filename, err.lineno)
    msg += '\n' + '-'*70 + '\n'
    msg += err.line + '\n'
    msg += ' '*(err.col-1) + '^\n'
    msg += '-'*70 + '\n' + err.msg
    err.msg = msg

    print(err.msg)
    pass

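I want parsing to continue even when a ParseException occurs.

1 answer:

Answer 0: (score: 2)

Pyparsing doesn't really have a "continue on error" option, so you need to adjust your parser so that it doesn't raise the ParseException in the first place. What you might try is adding something like | SkipTo(LineEnd())('errors*') to your parser as a last-ditch catch-all. Then you can look at the errors results name to see which lines went astray (or add a parse action to that expression to capture more than just the current line).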
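If the input is strictly line-oriented, a complementary option is to parse one line at a time, so that a ParseException only ever affects the line that raised it. This is a minimal sketch and not part of the original answer: `parse_lines` and the sample text are hypothetical, and the `era` grammar is the same small example used below.

```python
import pyparsing as pp

# Same small grammar as the example further down
era = "The" + pp.oneOf("Age Years") + "of" + pp.Word(pp.alphas)

def parse_lines(text):
    """Parse each line on its own; collect failures instead of stopping."""
    parsed, skipped = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            parsed.append(era.parseString(line, parseAll=True).asList())
        except pp.ParseException:
            # Remember the line and move on to the next one
            skipped.append((lineno, line.strip()))
    return parsed, skipped

sample = """\
The Age of Enlightenment
The Spanish Inquisition
The Years of Darkness
"""

parsed, skipped = parse_lines(sample)
print(parsed)
print(skipped)
```

This only works when no construct spans several lines. The catch-all approach described above keeps everything inside the grammar instead, starting from a small parser like this one: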

import pyparsing as pp

era = "The" + pp.oneOf("Age Years") + "of" + pp.Word(pp.alphas)

era.runTests("""
    The Age of Enlightenment
    The Years of Darkness
    The Spanish Inquisition
    """)

Prints:

The Age of Enlightenment
['The', 'Age', 'of', 'Enlightenment']

The Years of Darkness
['The', 'Years', 'of', 'Darkness']

The Spanish Inquisition
    ^
FAIL: Expected Age | Years (at char 4), (line:1, col:5)

Add the following lines and call runTests again:

# added to handle lines that don't match
unexpected = pp.SkipTo(pp.LineEnd(), include=True)("no_one_expects")
era = era | unexpected

Prints:

The Age of Enlightenment
['The', 'Age', 'of', 'Enlightenment']

The Years of Darkness
['The', 'Years', 'of', 'Darkness']

The Spanish Inquisition
['The Spanish Inquisition']
 - no_one_expects: 'The Spanish Inquisition'
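
Outside of runTests, the results name can be used to separate the lines that matched from the ones that fell through to the catch-all. The following is a rough sketch under the assumption that each line is parsed separately; `split_matches` and `line_parser` are hypothetical names, not pyparsing API.

```python
import pyparsing as pp

era = "The" + pp.oneOf("Age Years") + "of" + pp.Word(pp.alphas)
# Catch-all alternative: swallow the rest of the line and tag it
unexpected = pp.SkipTo(pp.LineEnd(), include=True)("no_one_expects")
line_parser = era | unexpected

def split_matches(lines):
    """Return (matched token lists, unmatched raw lines)."""
    matched, unmatched = [], []
    for line in lines:
        result = line_parser.parseString(line)
        if "no_one_expects" in result:
            # Only the catch-all sets this results name
            unmatched.append(result["no_one_expects"])
        else:
            matched.append(result.asList())
    return matched, unmatched

matched, unmatched = split_matches([
    "The Age of Enlightenment",
    "The Years of Darkness",
    "The Spanish Inquisition",
])
print(matched)
print(unmatched)
```

Because the catch-all is the last alternative, it is only tried after the real expression has failed, so no ParseException is raised for lines that do not match.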