Python: Parsing XML with lxml

Time: 2014-11-06 09:20:19

Tags: python xml parsing xml-parsing lxml

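I am trying to use a library called dblp-python, which parses DBLP data (in XML format). When I try to print all of an author's publications, the script behaves strangely: sometimes it prints them without any error, and sometimes it raises an error. If I run the same code again several times after the error, it prints the publications without any problem. The code I am using is: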

import dblp

a = dblp.search('Michael L. Littman')
for i in range(len(a[0].publications)):
    print i
    print a[0].publications[i].title

The error raised when running the code above is:

> Traceback (most recent call last):
>   File "<pyshell#217>", line 3, in <module>
>     print a[0].publications[i].title
>   File "build\bdist.win32\egg\dblp\__init__.py", line 19, in __getattr__
>     self.load_data()
>   File "build\bdist.win32\egg\dblp\__init__.py", line 110, in load_data
>     root = etree.fromstring(xml)
>   File "lxml.etree.pyx", line 3092, in lxml.etree.fromstring (src\lxml\lxml.etree.c:70691)
>   File "parser.pxi", line 1828, in lxml.etree._parseMemoryDocument (src\lxml\lxml.etree.c:106689)
>   File "parser.pxi", line 1716, in lxml.etree._parseDoc (src\lxml\lxml.etree.c:105478)
>   File "parser.pxi", line 1086, in lxml.etree._BaseParser._parseDoc (src\lxml\lxml.etree.c:100105)
>   File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:94543)
>   File "parser.pxi", line 690, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:96003)
>   File "parser.pxi", line 620, in lxml.etree._raiseParseError (src\lxml\lxml.etree.c:95050)
> XMLSyntaxError: Space required after the Public Identifier, line 2, column 47
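
For anyone hitting the same message: XML requires a DOCTYPE with a PUBLIC identifier to be followed by a system identifier, and HTML doctypes such as `<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">` do not have one. One possible explanation, not confirmed anywhere in this question, is that the server occasionally returns an HTML page instead of XML. The sketch below is a minimal diagnostic under that assumption. `parse_or_explain` is a hypothetical helper and not part of dblp-python; it wraps the parse and reports how the payload starts when parsing fails. It falls back to the standard library parser so that it still runs without lxml installed.

```python
# Hypothetical diagnostic helper, not part of dblp-python.
try:
    from lxml import etree
    ParseError = etree.XMLSyntaxError
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree
    ParseError = etree.ParseError


def parse_or_explain(payload):
    """Parse payload (bytes) as XML; on failure, report how the payload starts."""
    try:
        return etree.fromstring(payload)
    except ParseError as exc:
        preview = payload[:120]
        raise ValueError(
            "payload is not well-formed XML (%s); it starts with %r" % (exc, preview)
        )


# An HTML doctype like this one is not valid XML, so parsing it fails.
html_page = (
    b'<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n'
    b"<html><body>error</body></html>"
)

try:
    parse_or_explain(html_page)
except ValueError as err:
    print(err)
```

If the preview shows an HTML page rather than DBLP XML, the problem would lie with the response that was fetched and not with lxml itself.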

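The library's code can be seen HERE. I reported this issue to the author but received no response. I hope someone can help me at least understand what the error means. Thanks.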

0 answers:

No answers