Question

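I'm writing some lxml code and don't understand the difference between the following two expressions. I want to select the child directly below the parent: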

 xml = '<parent><child></child><parent>'
 root = lxml.etree.fromstring(xml)

 root.xpath('child')

and './child':

 root.xpath('./child')

Answer 1

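In this case, the expressions child and ./child give the same result. This is because child implicitly starts from the context node, which XPath writes as a single dot (.). To see what the context node is for your document in Python / lxml, simply evaluate that expression. After correcting the typo in your XML document (the unclosed parent tag makes it malformed):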

>>> import lxml.etree
>>> xml = '<parent><child></child></parent>'
>>> root = lxml.etree.fromstring(xml)
>>> root.xpath('.')
[<Element parent at 0x1038446c8>]

As you can see, the parent element is the implicit context for any XPath expression evaluated on root for this document.
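The equivalence of the two expressions can be checked directly. This is a minimal sketch, assuming lxml is installed; the variable names implicit and explicit are only illustrative:

```python
import lxml.etree

xml = '<parent><child></child></parent>'
root = lxml.etree.fromstring(xml)

# Both expressions are evaluated relative to the context node (root).
implicit = root.xpath('child')
explicit = root.xpath('./child')

# Collect the tag names so the two result lists are easy to compare.
implicit_tags = [el.tag for el in implicit]
explicit_tags = [el.tag for el in explicit]

print(implicit_tags)  # ['child']
print(explicit_tags)  # ['child']
```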

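But the leading ./ cannot always be omitted from an expression; there are cases where it is necessary. For example, if the context node is an element other than the root and you want to search all of its descendants, then .//descendant and //descendant can return different results, because an expression that starts with // always searches from the document root.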

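For example, suppose you want to find the element other only if it is a descendant of a child element, and not otherwise. Your document might look like this: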

>>> xml = '<parent><other find="no"/><child><other find="yes"/></child></parent>'
>>> root = lxml.etree.fromstring(xml)

You would first look up the child element:

>>> child = root.xpath('child')[0]

Then evaluate an XPath expression with this element as the context:

>>> child.xpath('//other')
[<Element other at 0x1038446c8>, <Element other at 0x105380348>]
>>> child.xpath('.//other')
[<Element other at 0x105380348>]

In this case, the . at the start of the XPath expression really does change the result list, and only .//other returns the correct result.
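Because the element addresses in the session above are hard to tell apart, the same comparison can be written as a script that prints the find attribute of each match instead. This is a sketch under the same assumptions as the session; the names from_root and from_context are illustrative:

```python
import lxml.etree

xml = '<parent><other find="no"/><child><other find="yes"/></child></parent>'
root = lxml.etree.fromstring(xml)
child = root.xpath('child')[0]

# '//other' starts at the document root, regardless of the context node.
from_root = [el.get('find') for el in child.xpath('//other')]

# './/other' starts at the context node (child) and searches its descendants.
from_context = [el.get('find') for el in child.xpath('.//other')]

print(from_root)     # ['no', 'yes']
print(from_context)  # ['yes']
```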

Answer 2

When you select an element without specifying an axis, you are using the abbreviated syntax:

child::the_element

is equivalent to

the_element
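To see the abbreviated and unabbreviated forms side by side in lxml, here is a small sketch. The sample document is made up for illustration, and the_element is replaced by the concrete name child:

```python
import lxml.etree

# Hypothetical sample document: two <child> elements and one <other>.
root = lxml.etree.fromstring('<parent><child/><child/><other/></parent>')

# Abbreviated form: the child axis is implied.
abbreviated = root.xpath('child')

# Unabbreviated form: the child axis is spelled out before the name test.
unabbreviated = root.xpath('child::child')

abbreviated_tags = [el.tag for el in abbreviated]
unabbreviated_tags = [el.tag for el in unabbreviated]

print(abbreviated_tags)    # ['child', 'child']
print(unabbreviated_tags)  # ['child', 'child']
```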

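Current context

An expression prefixed with a period and a forward slash (./) explicitly uses the current context as its starting point. For example, the following expression refers to all author elements directly within the current context: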

./author

Note that this is equivalent to the following:

author

How to select direct children using XPath with lxml in Python?