Question

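I am struggling to find an XPath expression that selects descendants of a particular type at the nth depth.

Broken down, the problem is: find all <section> or <article> elements at depth 2, ignoring any other elements along the way. In other words, only section and article tags count toward the depth.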

<body>
  <main>

    <section>
      <div>

        <article>this is to be selected
          <div>
            <section></section>
          </div>
        </article>

      </div>
    </section>

    <article>
      <div>
        <div>

          <section>this is to be selected
            <div>
              <section></section>
            </div>
          </section>

        </div>
      </div>
    </article>

  </main>
</body>

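All my poor and confused attempts got nowhere near solving the problem, so they are not worth showing. Is there an expression that does what I need?

Handling both article and section would be ideal, but a solution that only works for documents restricted to section elements would already be a first step. I could not get close even to that.

Alternative solutions using PHP are welcome. I know how to traverse an XML document, but I am looking for a short expression.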

Answer 1

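If I understand you correctly, you are looking for an expression built from these parts:

//* any element at any depth
[self::article or self::section] that is an article or a section
[*/*] that has a child element which itself has a child element
[not(*/*/*)] and that has no child whose child has a child element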

Combined, this selects articles and sections that are grandparents but not great-grandparents:

//*[(self::article or self::section) and */* and not(*/*/*)]

Example

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$query = '//*[(self::article or self::section) and */* and not(*/*/*)]';

foreach ($xpath->query($query) as $node) {
    echo $dom->saveXML($node), "\n";
}

Output:

<article>this is to be selected
          <div>
            <section/>
          </div>
        </article>
<section>this is to be selected
            <div>
              <section/>
            </div>
          </section>

To extend this to descendants n levels deep, generate the XPath expression dynamically:

$descendants_depth = 2;
$xfrag = rtrim(str_repeat("*/", $descendants_depth), "/");
$query = "//*[(self::article or self::section) and $xfrag and not($xfrag/*)]";
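The fragment above can be wrapped in a small helper. This is a minimal sketch rather than part of the original answer: the function name buildDescendantDepthQuery and the reduced sample document are assumptions made for illustration.

```php
<?php
// Hypothetical helper: builds the expression from this answer for an arbitrary depth.
function buildDescendantDepthQuery(int $depth): string
{
    if ($depth < 1) {
        throw new InvalidArgumentException('Depth must be at least 1.');
    }
    // Yields "*/*" for depth 2, "*/*/*" for depth 3, and so on.
    $xfrag = implode('/', array_fill(0, $depth, '*'));

    return "//*[(self::article or self::section) and $xfrag and not($xfrag/*)]";
}

// Assumed sample input, reduced from the document in the question.
$xml = '<main><section><div><article><div><section/></div></article></div></section></main>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Prints the name of every article/section that has descendants exactly 2 levels deep.
foreach ($xpath->query(buildDescendantDepthQuery(2)) as $node) {
    echo $node->nodeName, "\n";
}
```

With this reduced input, only the article element matches: the outer section has descendants three levels down, and the inner section has no children at all.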

Answer 2

If I take you literally, you want to find any section or article that has exactly one ancestor that is a section or article.

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);
$expression = 
  '//*[
    (self::article or self::section) and 
    count(ancestor::*[self::article or self::section]) = 1
   ]';

foreach ($xpath->evaluate($expression) as $node) {
  echo $document->saveXML($node), "\n";
}

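The XPath expression

Fetch any element node
//*
that is an article or section, tested on the self axis (the self axis contains the current node itself)
//*[self::article or self::section]
and has exactly one ancestor element node
//*[(self::article or self::section) and count(ancestor::*) = 1]
counting only ancestors that are themselves an article or section
//*[(self::article or self::section) and count(ancestor::*[self::article or self::section]) = 1]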

Axes define the initial node set that a location step selects from. The default axis is child, so article is actually short for child::article.
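To illustrate the default axis, the sketch below evaluates both spellings against the same context node and compares the results. The sample document is an assumption chosen for brevity.

```php
<?php
// Assumed sample input for illustration.
$xml = '<main><article/><section><article/></section></main>';

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// Both expressions are evaluated relative to the document element <main>.
$context = $document->documentElement;

$abbreviated = $xpath->evaluate('article', $context);
$explicit    = $xpath->evaluate('child::article', $context);

// Each list holds only the direct <article> child of <main>;
// the nested <article> is not on the child axis of <main>.
var_dump($abbreviated->length === $explicit->length);
var_dump($abbreviated->item(0)->isSameNode($explicit->item(0)));
```

Both comparisons should report true, since the abbreviated form is only shorthand for the explicit child axis.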

The same approach can also be used to get the level of a specific node.

foreach ($xpath->evaluate('//*[self::article or self::section]') as $node) {
  $level = $xpath->evaluate('count(ancestor::*[self::article or self::section])', $node);
  echo $node->localName, ', level: ', $level, "\n";
}

Output:

section, level: 0
article, level: 1
section, level: 2
article, level: 0
section, level: 1
section, level: 2
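Since the level is just the number of article or section ancestors, the same count can be parameterized to select an arbitrary depth. The following is a minimal sketch under that assumption: the helper name buildNestingDepthExpression, the 1-based depth convention, and the id attributes added to the sample document are all hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper: selects article/section elements at the given 1-based depth,
// where only article and section ancestors count toward the depth.
function buildNestingDepthExpression(int $depth): string
{
    if ($depth < 1) {
        throw new InvalidArgumentException('Depth must be at least 1.');
    }
    $test = 'self::article or self::section';

    // Depth n means exactly n - 1 matching ancestors.
    return sprintf('//*[(%1$s) and count(ancestor::*[%1$s]) = %2$d]', $test, $depth - 1);
}

// Assumed sample input: the question's document, condensed, with ids added for readability.
$xml = '<body><main>'
     . '<section><div><article id="a"><div><section/></div></article></div></section>'
     . '<article><div><div><section id="b"><div><section/></div></section></div></div></article>'
     . '</main></body>';

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// Prints the element name and id of every match at depth 2.
foreach ($xpath->evaluate(buildNestingDepthExpression(2)) as $node) {
    echo $node->nodeName, '#', $node->getAttribute('id'), "\n";
}
```

For depth 2 this produces the same expression as above (count equal to 1) and matches the two elements marked "this is to be selected" in the question.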

How to form an expression that selects all nth-depth descendants of a specific type
