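XPath returns nothing for a child element that has no text value. In this case rating has no data, so I would like the result to say so, with None or an empty value for that child, instead of skipping it. Any input is much appreciated.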
XML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book>
<title lang="eng">Harry Potter</title>
<price>29.99</price>
<rating></rating>
</book>
<book>
<title lang="hindi">Learning XML</title>
<price>39.95</price>
<rating></rating>
</book>
</bookstore>
Python:
>>> import lxml.html as lh
>>> bk=open('book.xml','r')
>>> bkout=lh.parse(bk)
>>> bk.close()
>>> bkout.xpath('//book/*/text()')
['Harry Potter', '29.99', 'Learning XML', '39.95']
>>> bkout.xpath('//book/* and not(text())/text()')
True
Desired output:
['Harry Potter', '29.99', '', 'Learning XML', '39.95', '']
or
['Harry Potter', '29.99', None, 'Learning XML', '39.95', None]
Answer 0 (score: 4)
Drop "text()" from the expression and read each element's .text attribute instead:
In [16]: [x.text for x in bkout.xpath("//book/*")]
Out[16]: ['Harry Potter', '29.99', None, 'Learning XML', '39.95', None]
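The reason this works is that text() selects text nodes, and an empty element such as <rating></rating> has no text node to select, whereas selecting the elements themselves keeps one entry per child. Below is a minimal, self-contained sketch of the same idea. It assumes the standard-library xml.etree.ElementTree in place of lxml so that it runs without extra installs (lxml elements expose the same .text attribute), and it embeds the XML as a string rather than reading book.xml. The names XML, children, with_none and with_empty are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Same data as book.xml in the question, inlined so the example is self-contained.
XML = """<bookstore>
<book>
<title lang="eng">Harry Potter</title>
<price>29.99</price>
<rating></rating>
</book>
<book>
<title lang="hindi">Learning XML</title>
<price>39.95</price>
<rating></rating>
</book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the <bookstore> element

# Select the child elements of every <book>, not their text nodes.
children = root.findall("./book/*")

# .text is None for an element that has no text content.
with_none = [el.text for el in children]

# Substitute an empty string if '' is preferred over None.
with_empty = [el.text or "" for el in children]

print(with_none)
print(with_empty)
```

The first list matches the None variant of the desired output and the second matches the empty-string variant.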