Question

Given this markup:

<p>
  <code>foo</code><code>bar</code>
  <code>jim</code> and then <code>jam</code>
</p>

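I need to select the first three <code> elements, but not the last one. The logic is: "select every code element that has a preceding or following sibling element that is also a code, unless there are one or more text nodes with non-whitespace content between them."

Since I'm using Nokogiri (which uses libxml2), I can only use XPath 1.0 expressions.

A tricky XPath expression is what I'm hoping for, but Ruby code/iteration that does the same thing on a Nokogiri document is also acceptable.

Note that the CSS adjacent sibling selector ignores non-element nodes, so nokodoc.css('code + code') would incorrectly select the last <code> block: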

Nokogiri.XML('<r><a/><b/> and <c/></r>').css('* + *').map(&:name)
#=> ["b", "c"]

Edit: more test cases, for clarity:

<section><ul>
  <li>Go to <code>N</code> and
      then <code>Y</code><code>Y</code><code>Y</code>.
  </li>
  <li>If you see <code>N</code> or <code>N</code> then…</li>
</ul>
<p>Elsewhere there might be: <code>N</code></p>
<p><code>N</code> across parents.</p>
<p>Then: <code>Y</code> <code>Y</code><code>Y</code> and <code>N</code>.</p>
<p><code>N</code><br/><code>N</code> elements interrupt, too.</p>
</section>

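All of the Y elements above should be selected; none of the N elements should be. The content of each <code> is only there to indicate which ones should be selected: you may not use that content to decide whether to select an element.

The context elements in which <code> appears are irrelevant. They may appear in <li>, they may appear in <p>, they may appear in something else.

I want to select all consecutive runs of <code> in one go. The space character in the middle of one of the runs of Y is not a mistake.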

Answer 1

Use:

//code
   [preceding-sibling::node()[1][self::code]
   or
    preceding-sibling::node()[1]
       [self::text()[not(normalize-space())]]
    and
     preceding-sibling::node()[2][self::code]
   or
    following-sibling::node()[1][self::code]
   or
    following-sibling::node()[1]
       [self::text()[not(normalize-space())]]
    and
     following-sibling::node()[2][self::code]
   ]

XSLT-based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select=
  "//code
     [preceding-sibling::node()[1][self::code]
     or
      preceding-sibling::node()[1]
         [self::text()[not(normalize-space())]]
      and
       preceding-sibling::node()[2][self::code]
     or
      following-sibling::node()[1][self::code]
     or
      following-sibling::node()[1]
         [self::text()[not(normalize-space())]]
      and
       following-sibling::node()[2][self::code]
     ]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<section><ul>
  <li>Go to <code>N</code> and
      then <code>Y</code><code>Y</code><code>Y</code>.
  </li>
  <li>If you see <code>N</code> or <code>N</code> then…</li>
</ul>
<p>Elsewhere there might be: <code>N</code></p>
<p><code>N</code> across parents.</p>
<p>Then: <code>Y</code> <code>Y</code><code>Y</code> and <code>N</code>.</p>
<p><code>N</code><br/><code>N</code> elements interrupt, too.</p>
</section>

the contained XPath expression is evaluated and the selected nodes are copied to the output:

<code>Y</code> <code>Y</code> <code>Y</code> <code>Y</code> <code>Y</code> <code>Y</code>

Answer 2

//code[
  (
    following-sibling::node()[1][self::code]
    or (
      following-sibling::node()[1][self::text() and normalize-space() = ""]
      and
      following-sibling::node()[2][self::code]
    )
  )
  or (
    preceding-sibling::node()[1][self::code]
    or (
      preceding-sibling::node()[1][self::text() and normalize-space() = ""]
      and
      preceding-sibling::node()[2][self::code]
    )
  )
]

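I think this does what you want, although I won't claim you'd actually want to use it.

I'm assuming that text nodes are always merged together, so that there are never two adjacent to each other. I believe that is generally the case, but it might not be if you have been doing DOM manipulation beforehand. I'm also assuming that there won't be any other elements between the code elements, or that if there are, they prevent selection just as non-whitespace text does.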

Answer 3

I think this is what you want:

/p/code[not(preceding-sibling::text()[not(normalize-space(.)="")])]

Select adjacent sibling elements without intervening non-whitespace text nodes
