Apparent pyparsing bug with 'ZeroOrMore'

Time: 2018-06-04 17:55:45

Tags: python-3.x pyparsing

I am using pyparsing with Python 3.6.5 on a Mac. The following code crashes on the second parse:

from pyparsing import *

a = Word(alphas) + Literal(';')
b = Word(alphas) + Optional(Literal(';'))
bad_parser = ZeroOrMore(a) + b

b.parseString('hello;')
print("no problems yet...")
bad_parser.parseString('hello;')
print("this will not print because we're dead")

Is this logical behavior, or is it a bug?

Edit: here is the full console output:

no problems yet...
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    bad_parser.parseString('hello;')
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyparsing.py", line 1632, in parseString
    raise exc
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyparsing.py", line 1622, in parseString
    loc, tokens = self._parse( instring, 0 )
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyparsing.py", line 3395, in parseImpl
    loc, exprtokens = e._parse( instring, loc, doActions )
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyparsing.py", line 2689, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected W:(ABCD...) (at char 6), (line:1, col:7)

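1 answer:

Answer 0 (score: 2)

This is expected behavior. Pyparsing does not do any lookahead; it parses purely left to right. You can add lookahead to your parser, but that is something you have to do yourself.

If you turn on debugging for a and b, you can get more insight into what is happening: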

a.setName('a').setDebug()
b.setName('b').setDebug()

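This shows every place where pyparsing is about to match an expression, then whether the match failed or succeeded and, if it succeeded, the matched tokens: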

Match a at loc 0(1,1)
Matched a -> ['hello', ';']
Match a at loc 6(1,7)
Exception raised:Expected W:(ABCD...) (at char 6), (line:1, col:7)
Match b at loc 6(1,7)
Exception raised:Expected W:(ABCD...) (at char 6), (line:1, col:7)

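Since a matches the complete input string, it satisfies the "zero or more" criterion. Pyparsing then goes on to match b, but because the word and the semicolon have already been read, there is nothing left to parse. Since b is not optional, pyparsing raises an exception saying that it could not be found. Even if you were to parse "hello; hello; hello;", all of the words and semicolons would be consumed by the ZeroOrMore, leaving no trailing b to match.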
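The greedy consumption described above can be reproduced with a short sketch. The helper name `try_parse` and the sample inputs are made up for illustration; the camelCase method names follow the question's code (newer pyparsing releases also offer snake_case equivalents).

```python
from pyparsing import Word, Literal, Optional, ZeroOrMore, ParseException, alphas

a = Word(alphas) + Literal(';')
b = Word(alphas) + Optional(Literal(';'))
bad_parser = ZeroOrMore(a) + b

def try_parse(parser, text):
    """Hypothetical helper: return the token list, or None on ParseException."""
    try:
        return parser.parseString(text).asList()
    except ParseException:
        return None

# ZeroOrMore(a) consumes every "word;" pair, so the mandatory b finds nothing left.
all_pairs = try_parse(bad_parser, 'hello; hello; hello;')

# A final word without a semicolon cannot match a, so it is still there for b.
with_tail = try_parse(bad_parser, 'hello; hello; world')

print(all_pairs)
print(with_tail)
```

The first input fails for the same reason as the single 'hello;' in the question; the second succeeds only because its last word happens not to look like an a.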

Try this:

not_so_bad_parser = ZeroOrMore(a + ~StringEnd()) + b

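By stating that you only want to read a expressions that are not at the end of the string, parsing "hello;" no longer matches a inside the ZeroOrMore, so pyparsing moves on to b, which then matches.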
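As a minimal sketch of how the negative lookahead changes the result, the parser can be run on a few inputs. The sample strings and result variable names are illustrative assumptions, not part of the original question.

```python
from pyparsing import Word, Literal, Optional, ZeroOrMore, StringEnd, alphas

a = Word(alphas) + Literal(';')
b = Word(alphas) + Optional(Literal(';'))

# Each repetition of a must NOT be followed by the end of the string,
# so the final "word;" pair is always left for b.
not_so_bad_parser = ZeroOrMore(a + ~StringEnd()) + b

single = not_so_bad_parser.parseString('hello;').asList()
two_pairs = not_so_bad_parser.parseString('foo; bar;').asList()
no_final_semicolon = not_so_bad_parser.parseString('foo; bar; baz').asList()

print(single)
print(two_pairs)
print(no_final_semicolon)
```

In each case the ZeroOrMore gives up the last element, either because the lookahead rejects it or because it has no semicolon, and b picks it up.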

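This is such a common problem that I added a stopOn keyword to the ZeroOrMore and OneOrMore class constructors, to avoid having to add the explicit ~ (which means NotAny). At first I thought this would work: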

even_less_bad_parser = ZeroOrMore(a, stopOn=b) + b

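However, since b also matches everything that a matches, this will never match any a and may leave unmatched text. We need to stop on b only when it is at the end of the string: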

even_less_bad_parser = ZeroOrMore(a, stopOn=b + StringEnd()) + b
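A small comparison of the two stopOn variants, as a sketch: the name `naive_parser`, the sample text, and the use of parseAll=True to expose leftover input are assumptions added for illustration.

```python
from pyparsing import (Word, Literal, Optional, ZeroOrMore, StringEnd,
                       ParseException, alphas)

a = Word(alphas) + Literal(';')
b = Word(alphas) + Optional(Literal(';'))

# Naive sentinel: b matches anything a matches, so the loop stops immediately.
naive_parser = ZeroOrMore(a, stopOn=b) + b
# Refined sentinel: stop only when b would also finish the input.
even_less_bad_parser = ZeroOrMore(a, stopOn=b + StringEnd()) + b

text = 'foo; bar;'

# Without parseAll, the naive parser returns after the first pair and
# silently leaves the rest of the text unparsed.
naive_tokens = naive_parser.parseString(text).asList()

# With parseAll=True, the leftover text is reported as an error instead.
try:
    naive_parser.parseString(text, parseAll=True)
    naive_rejects_full_text = False
except ParseException:
    naive_rejects_full_text = True

fixed_tokens = even_less_bad_parser.parseString(text, parseAll=True).asList()
single_tokens = even_less_bad_parser.parseString('hello;', parseAll=True).asList()

print(naive_tokens, naive_rejects_full_text)
print(fixed_tokens, single_tokens)
```

Passing parseAll=True is a convenient way to notice when a parser stops early rather than consuming the whole input.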

I am not sure whether this really meets your notion of "less bad", but this is why pyparsing behaves the way it does for you.