Question

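I recently needed to evaluate an XQuery on a Node of an HTML document. Basically, I need to select all elements that have an href attribute from the first child of the body element. Here is a small example to illustrate: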

<html>
    <body>
        <a href="http://www.google.be"/>
    </body>
</html>

In this case, the desired result is obviously:

<a href="http://www.google.be"/>

My first idea was to use //body/*[1]//*[@href].

I expected this to work, but for the example above the query returns no results.

However, after reading around a bit I found the following (source: http://www.keller.com/xslt/8/):

Alternate notation for "//": descendant-or-self::node()

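So I changed my XQuery to //body/*[1]/descendant-or-self::node()[@href], and this time the result was correct.

My question: what is the difference between // and descendant-or-self::node()? What I found here (What's the difference between //node and /descendant::node in xpath?) and here (http://www.w3.org/TR/xpath/#axes) says:

// is short for /descendant-or-self::node()/. For example, //para is short for /descendant-or-self::node()/child::para.

This leads me to conclude that // and /descendant-or-self::node() are not interchangeable (perhaps because of the trailing /?), but can someone tell me whether there is a shorthand for /descendant-or-self::node()?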

Answer 1

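Your first XPath expression (//body/*[1]//*[@href]) does in fact express what you described in natural language: //body/*[1] is the first child element of the body element, and //*[@href] selects every element below that first child that has an @href attribute.

In your example, there is no element with such an attribute below the anchor tag. The query would match, for example, this document: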

<html>
    <body>
        <p>
            <a href="http://www.google.be"/>
        </p>
    </body>
</html>

With the second // written out in full, this query is:

//body/*[1]/descendant-or-self::node()/*[@href]

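Put your second query next to it, and the difference should be easy to see:

//body/*[1]/descendant-or-self::node()[@href]

Here the [@href] predicate is applied to the nodes selected by descendant-or-self::node() themselves, which include the first child (the a element). In the first query it is applied only to the children of those nodes.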
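The contrast can be sketched in Python. This is only an illustration under stated assumptions: the standard-library xml.etree.ElementTree module supports just a subset of XPath and has no descendant-or-self axis, so that axis is emulated with Element.iter(). The names FLAT, NESTED and the three helper functions are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents mirroring the two inputs discussed above.
FLAT = '<html><body><a href="http://www.google.be"/></body></html>'
NESTED = '<html><body><p><a href="http://www.google.be"/></p></body></html>'


def first_child_of_body(xml_text):
    """Return the element that //body/*[1] would select."""
    body = ET.fromstring(xml_text).find(".//body")
    return body[0]


def strict_descendants_with_href(xml_text):
    """Model //body/*[1]//*[@href]: the first child itself is never tested."""
    first = first_child_of_body(xml_text)
    return [e.tag for e in first.findall(".//*[@href]")]


def self_or_descendants_with_href(xml_text):
    """Model //body/*[1]/descendant-or-self::node()[@href]."""
    first = first_child_of_body(xml_text)
    # Element.iter() yields the element itself first, then all its descendants.
    return [e.tag for e in first.iter() if "href" in e.attrib]


print(strict_descendants_with_href(FLAT))     # []
print(self_or_descendants_with_href(FLAT))    # ['a']
print(strict_descendants_with_href(NESTED))   # ['a']
print(self_or_descendants_with_href(NESTED))  # ['a']
```

The two models disagree only on the flat document, where the element carrying href is the first child of body itself.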

Answer 2

I think the problem is in your description, which does not seem to match your example!

Given the input:

<html>
    <body>
        <a href="http://www.google.be"/>
    </body>
</html>

and the requirement statement:

"All elements that have an href attribute from the first child of the body element"

your XPath formulation:

//body/*[1]//*[@href]

matches your requirement statement. However, the expected output is then an empty sequence, as you discovered, and not your suggested output:

<a href="http://www.google.be"/>

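To get the suggested output, your requirement statement would probably have to be:

"The first child element of the body element that has an href attribute", which leads to the XPath: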

//*[@href][parent::body][1]
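A rough Python sketch of the difference between "the first child of body" and "the first child of body that has an href attribute" follows. It assumes the standard-library xml.etree.ElementTree module, which cannot evaluate the parent:: axis, so the selection is done on the body element directly. The document DOC and its second link are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the first child of body has no href attribute.
DOC = (
    '<html><body>'
    '<p/>'
    '<a href="http://www.google.be"/>'
    '<a href="http://www.w3.org"/>'
    '</body></html>'
)

body = ET.fromstring(DOC).find(".//body")

# The first child of body, regardless of its attributes.
first_child = body[0]

# Children of body that carry href, in document order; taking the first
# one approximates //*[@href][parent::body][1] for a single body element.
href_children = body.findall("*[@href]")
first_href_child = href_children[0]

print(first_child.tag)               # p
print(first_href_child.get("href"))  # http://www.google.be
```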

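From your requirement statement and the mismatched example, it is hard to tell exactly what you mean. So perhaps your requirement statement is:

"The first element in the body that has an href attribute"

If that is the case, I would suggest the XPath: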

($input//*[@href][ancestor::body])[1]

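Note that the sequence constructor, i.e. '(' and ')', flattens the sequence of descendants, so that [1] applies to the sequence as a whole and you can address each selected descendant by position, much like indexing an array.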
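The effect of the parentheses can be approximated in Python. This is a sketch, not an XQuery evaluation: xml.etree.ElementTree has no parenthesized expressions, so indexing the Python list returned by findall() stands in for ( ... )[1], while the [1] inside the path is applied per parent. The document and the href values u1, u2, u3 are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical input with links under two different parents.
DOC = (
    '<html><body>'
    '<p><a href="u1"/><a href="u2"/></p>'
    '<p><a href="u3"/></p>'
    '</body></html>'
)

root = ET.fromstring(DOC)

# Like //a[1]: the position is counted among the a children of each parent.
per_parent = [a.get("href") for a in root.findall(".//a[1]")]

# Like (//a)[1]: collect every match first, then take the first one overall.
overall = [a.get("href") for a in root.findall(".//a")[:1]]

print(per_parent)  # ['u1', 'u3']
print(overall)     # ['u1']
```

Without the parentheses, a positional predicate can return one element per parent; with them, it returns at most one element in total.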