lxml.etree.XMLSyntaxError: internal error: Huge input lookup

Time: 2018-02-26 08:36:05

Tags: python xml xml-parsing lxml

I am trying to parse a large XML file (~500 MB) with iterparse from the Python lxml library, using:

from lxml import etree

context = etree.iterparse('large-file.xml')
for event, element in context:
    # do some stuff
    element.clear()

But it raises the following error:

Traceback (most recent call last):
  File "test.py", line 176, in <module>
    test_parser()
  File "test.py", line 121, in test_parser
    for event, element in context:
  File "src/lxml/iterparse.pxi", line 208, in lxml.etree.iterparse.__next__ (src/lxml/etree.c:155963)
  File "src/lxml/iterparse.pxi", line 193, in lxml.etree.iterparse.__next__ (src/lxml/etree.c:155671)
  File "src/lxml/iterparse.pxi", line 228, in lxml.etree.iterparse._read_more_events (src/lxml/etree.c:156298)
  File "src/lxml/parser.pxi", line 1362, in lxml.etree._FeedParser.feed (src/lxml/etree.c:116552)
  File "src/lxml/parser.pxi", line 589, in lxml.etree._ParserContext._handleParseResult (src/lxml/etree.c:107619)
  File "src/lxml/parser.pxi", line 598, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/etree.c:107738)
  File "src/lxml/parser.pxi", line 709, in lxml.etree._handleParseResult (src/lxml/etree.c:109447)
  File "src/lxml/parser.pxi", line 638, in lxml.etree._raiseParseError (src/lxml/etree.c:108301)
  File "large-file.xml", line 20593
lxml.etree.XMLSyntaxError: internal error: Huge input lookup, line 20593, column 199
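
A possible workaround, offered as a sketch rather than a confirmed fix: "Huge input lookup" appears to come from libxml2's built-in safety limits, which lxml can switch off through the huge_tree keyword of iterparse. The helper name count_elements and its tag argument are hypothetical and exist only for illustration. Disabling these limits removes a protection against malicious input, so this assumes the file is trusted.

```python
from lxml import etree


def count_elements(path, tag=None):
    """Stream through an XML file and return how many elements were seen.

    Hypothetical helper for illustration; replace the counting with the
    real per-element processing.
    """
    count = 0
    # huge_tree=True disables libxml2's size and depth safety limits.
    # Assumption: the input file comes from a trusted source.
    context = etree.iterparse(path, events=("end",), tag=tag, huge_tree=True)
    for event, element in context:
        count += 1          # stand-in for "do some stuff"
        element.clear()     # release the element's children and text
    return count
```

It would be called as count_elements('large-file.xml'), or with tag='item' (a placeholder tag name) to receive only elements of one type. If the error persists with huge_tree=True, the line and column reported in the traceback (line 20593, column 199) are worth inspecting for an unusually long text node or attribute value.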

0 answers:

No answers yet