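Get all values of an attribute (e.g. all classes) for nested DOM elements

Time: 2018-02-23 11:47:04

Tags: php domdocument

Using DOMDocument, how can I get all the classes of a DOMElement and of the elements nested inside it?

E.g. $this->xmlComponent contains: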

<span class="one two"><a class="three" href="#">test</a></span>

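This should result in ["one","two","three"].

2 answers:

Answer 0 (score: 1)

You can use DOMElement::getElementsByTagName() with the argument * to get all descendant tags of a node, then loop over them and read the class attribute of each one: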

<?php
$html = '<span class="one two"><a class="three four" href="#"><b class="five">test</b></a></span>';
$dom = new DOMDocument;
$dom->loadHTML($html);
$spans = $dom->getElementsByTagName("span");
$values = [];
foreach ($spans as $span) {
    $values[] = $span->getAttribute("class");
    $values[] = getAllValues($span);
}

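function getAllValues($node)
{
    $values = [];
    // '*' matches every descendant of $node (not only direct children),
    // so a single loop covers the whole subtree and no recursion is needed.
    $children = $node->getElementsByTagName('*');
    foreach ($children as $child) {
        $values[] = $child->getAttribute("class");
    }
    return $values;
}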

var_dump($values);

Result:

array (size=2)
  0 => string 'one two' (length=7)
  1 => 
    array (size=2)
      0 => string 'three four' (length=10)
      1 => string 'five' (length=4)

DOMElement::getElementsByTagName()
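
To make the behaviour of the * argument concrete, here is a minimal sketch (reusing the sample markup from above; the variable names are only illustrative) that lists which elements getElementsByTagName('*') returns for the span. It suggests that both the direct child and the grandchild are matched, which is why one loop is enough to reach every nested element:

```php
<?php
// Sample markup taken from the answer above.
$html = '<span class="one two"><a class="three four" href="#"><b class="five">test</b></a></span>';
$dom = new DOMDocument;
$dom->loadHTML($html);

$span = $dom->getElementsByTagName('span')->item(0);

$tagNames = [];
foreach ($span->getElementsByTagName('*') as $descendant) {
    // The direct child <a> and the grandchild <b> both appear here.
    $tagNames[] = $descendant->tagName;
}

print_r($tagNames);
```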

To get each class as its own element, just explode() the attribute value on a space:

foreach ($spans as $span) {
    $values[] = explode(" ", $span->getAttribute("class"));
    $values[] = getAllValues($span);
}

function getAllValues($node)
{
    $values = [];
    // '*' already returns every descendant, so no recursive call is needed.
    $children = $node->getElementsByTagName('*');
    foreach ($children as $child) {
        $values[] = explode(" ", $child->getAttribute("class"));
    }
    return $values;
}

Result:

array (size=2)
  0 => 
    array (size=2)
      0 => string 'one' (length=3)
      1 => string 'two' (length=3)
  1 => 
    array (size=2)
      0 => 
        array (size=2)
          0 => string 'three' (length=5)
          1 => string 'four' (length=4)
      1 => 
        array (size=1)
          0 => string 'five' (length=4)
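
One caveat with explode(" ", ...) is that it returns an array holding a single empty string when the attribute is missing or empty, and it produces empty entries for repeated spaces. A hedged alternative, not part of the original answer, is to split on any run of whitespace with preg_split() and drop empty tokens. The helper name splitTokens is hypothetical:

```php
<?php
// Hypothetical helper (not from the answer): split an attribute value on
// any run of whitespace and discard empty tokens.
function splitTokens(string $value): array
{
    return preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY);
}

$tokensNormal = splitTokens('one two');            // two clean tokens
$tokensMessy  = splitTokens("  three \t four ");   // extra whitespace is ignored
$tokensEmpty  = splitTokens('');                   // no tokens at all
$explodeEmpty = explode(' ', '');                  // for comparison: one empty string

var_dump($tokensNormal, $tokensMessy, $tokensEmpty, $explodeEmpty);
```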

Finally, to get everything in one flat array, pass the initial $values array to getAllValues() by reference and merge each element's classes into it. Because getElementsByTagName('*') already returns every descendant, a single loop is enough; recursing as well would collect the deeper elements more than once:

foreach ($spans as $span) {
    $values = array_merge($values, explode(" ", $span->getAttribute("class")));
    getAllValues($span, $values);
}

function getAllValues($node, &$values)
{
    // '*' matches every descendant of $node, so no recursive call is needed.
    foreach ($node->getElementsByTagName('*') as $child) {
        $values = array_merge($values, explode(" ", $child->getAttribute("class")));
    }
}

Result:

array (size=5)
  0 => string 'one' (length=3)
  1 => string 'two' (length=3)
  2 => string 'three' (length=5)
  3 => string 'four' (length=4)
  4 => string 'five' (length=4)
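
If the same value can be collected more than once (for instance because several elements share a class), the repeats can be removed afterwards. A minimal sketch, assuming the values have already been gathered into a flat list of strings; the sample data is made up for illustration:

```php
<?php
// Made-up sample: a flat list in which some class names repeat.
$collected = ['one', 'two', 'three', 'two', 'one'];

// array_unique() keeps the first occurrence of each value;
// array_values() then reindexes the result from 0.
$unique = array_values(array_unique($collected));

print_r($unique);
```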

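And (sorry for the long post) as a final answer, since you mentioned needing "some" attribute, you can wrap this in a function that takes the attribute name as a parameter (class by default) and returns an array with all of its values: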

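function getAllAttributesForNode($nodes, $attribute = "class")
{
    $values = [];
    foreach ($nodes as $node) {
        getAllValues($node, $values, $attribute);
    }

    return $values;
}

function getAllValues($node, &$values, $attribute)
{
    // Start with the node itself, skipping it if the attribute is absent.
    if ($node->hasAttribute($attribute)) {
        $values = array_merge($values, explode(" ", $node->getAttribute($attribute)));
    }

    // '*' already matches every descendant, so recursing would duplicate values.
    foreach ($node->getElementsByTagName('*') as $child) {
        if ($child->hasAttribute($attribute)) {
            $values = array_merge($values, explode(" ", $child->getAttribute($attribute)));
        }
    }
}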

$spans = $dom->getElementsByTagName("span");
$values = getAllAttributesForNode($spans);
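
As an alternative to looping over getElementsByTagName('*'), the same collection can be sketched with DOMXPath, which is a different technique from the one in this answer. The helper name collectAttributeValues is hypothetical, and the sketch assumes $attribute is a plain attribute name that is safe to place in an XPath expression:

```php
<?php
// Hypothetical helper: gather the whitespace-separated values of $attribute
// from $context and from every element nested inside it.
function collectAttributeValues(DOMElement $context, string $attribute = 'class'): array
{
    $xpath = new DOMXPath($context->ownerDocument);
    $values = [];
    // descendant-or-self::* visits the context element and all elements below it;
    // the predicate keeps only those that actually carry the attribute.
    // Assumption: $attribute is a simple name such as "class" or "href".
    foreach ($xpath->query("descendant-or-self::*[@{$attribute}]", $context) as $element) {
        $tokens = preg_split('/\s+/', $element->getAttribute($attribute), -1, PREG_SPLIT_NO_EMPTY);
        $values = array_merge($values, $tokens);
    }
    return $values;
}

$dom = new DOMDocument;
$dom->loadHTML('<span class="one two"><a class="three four" href="#"><b class="five">test</b></a></span>');
$span = $dom->getElementsByTagName('span')->item(0);

$classes = collectAttributeValues($span);          // default attribute: class
$hrefs   = collectAttributeValues($span, 'href');  // any other attribute name

print_r($classes);
print_r($hrefs);
```

Filtering in the XPath predicate means elements without the attribute never contribute empty strings to the result.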

Answer 1 (score: 0)

Thanks to ishegg for the suggested approach. Here is the condensed code I ended up using:

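function getAllAttributes($nodes, $values, $attribute = 'class') {
    foreach ($nodes as $node) {
        $values = array_merge($values, explode(" ", $node->getAttribute($attribute)));
        // '*' already matches every descendant of $node, so one loop is enough;
        // recursing here would collect deeper elements more than once.
        foreach ($node->getElementsByTagName('*') as $child) {
            $values = array_merge($values, explode(" ", $child->getAttribute($attribute)));
        }
    }
    return $values;
}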

$html = '<span class="one two"><a class="three" href="#">test</a></span>';
$dom = new DOMDocument;
$dom->loadHTML($html);
$spans = $dom->getElementsByTagName("span");

$classes = getAllAttributes($spans, []);
print_r($classes);
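
If a genuinely recursive walk is preferred, one option is to recurse over direct children only (via childNodes) so that each element is visited exactly once. This is a minimal sketch rather than the code from either answer; the function name collectClassesRecursively is hypothetical, and the markup is the deeper three-level sample used in the first answer:

```php
<?php
// Hypothetical helper: depth-first walk that descends through direct
// children only, so no element is visited twice.
function collectClassesRecursively(DOMElement $element): array
{
    // A missing class attribute yields an empty string and therefore no tokens.
    $values = preg_split('/\s+/', $element->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);

    foreach ($element->childNodes as $child) {
        // childNodes also contains text nodes; only elements can carry attributes.
        if ($child instanceof DOMElement) {
            $values = array_merge($values, collectClassesRecursively($child));
        }
    }
    return $values;
}

$dom = new DOMDocument;
$dom->loadHTML('<span class="one two"><a class="three four" href="#"><b class="five">test</b></a></span>');
$span = $dom->getElementsByTagName('span')->item(0);

$recursiveClasses = collectClassesRecursively($span);
print_r($recursiveClasses);
```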