How to preserve child node markup in Symfony2 DomCrawler

Asked: 2013-09-25 15:07:23

Tags: php dom symfony domdocument web-crawler

I am using the Symfony2 DomCrawler to process specific nodes.

I have a DOMDocument with some HTML inside it. Basically, I am searching for <p> tags that have a specific class name.

Say I have this HTML in the $dom object:

<p class="one">class one</p>
<p class="two">class two is the <b>good</b> class</p>
<p class="tree">class tree</p>
<p class="four">class four</p>

I am using:

$crawler    = new Crawler($dom);
$class_name = 'two';
$paragraphs = $crawler->filterXPath('//p');

foreach ($paragraphs as $paragraph) {
    if ($paragraph->hasAttribute('class') === false) {
        continue;
    }

    $class = $paragraph->getAttribute('class');

    if ($class == $class_name) {
        // nodeValue returns only the text content of the node
        $node_value = $paragraph->nodeValue;
    }
}

The problem is that this gives me

class two is the good class

I would like to get

class two is the <b>good</b> class

How can I keep the <b></b> tags in the result?

1 Answer:

Answer 0 (score: 2)

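This happens because <b></b> is a child node, and ->nodeValue returns only the text content, with the markup of child elements dropped. You need to serialize the child nodes themselves, as mentioned in another question.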
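
To make the difference concrete, here is a minimal sketch that uses only PHP's built-in DOMDocument (no Symfony classes) on a single paragraph from the question. It assumes PHP 5.3.6 or later, where saveHTML() accepts a node argument; the variable names are illustrative only.

```php
<?php
// Plain DOMDocument sketch; DomCrawler is not needed to show the difference.
$doc = new DOMDocument();
$doc->loadHTML('<p class="two">class two is the <b>good</b> class</p>');

$p = $doc->getElementsByTagName('p')->item(0);

// nodeValue concatenates the text of all descendants, so the <b> tags vanish.
$text = $p->nodeValue;

// Serializing each child node on its own keeps the markup of child elements.
$html = '';
foreach ($p->childNodes as $child) {
    $html .= $doc->saveHTML($child);
}

echo $text, PHP_EOL; // class two is the good class
echo $html, PHP_EOL; // class two is the <b>good</b> class
```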

The following example works for your case:

use Symfony\Component\DomCrawler\Crawler;

$dom = <<<'STR'
<p class="one">class one</p>
<p class="two">class two is the <b>good</b> class</p>
<p class="tree">class tree</p>
<p class="four">class four</p>
STR;

$crawler    = new Crawler($dom);
$class_name = 'two';
$paragraphs = $crawler->filterXPath('//p');

foreach ($paragraphs as $paragraph) {
    if (false === $paragraph->hasAttribute('class')) {
        continue;
    }

    $class = $paragraph->getAttribute('class');

    if ($class == $class_name) {
        $value = '';

        foreach ($paragraph->childNodes as $child) {
            $value .= $paragraph->ownerDocument->saveHTML($child);
        }
    }
}

echo $value; // class two is the <b>good</b> class
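
As a possible variation, the class check in the loop above can be moved into the XPath expression itself. The sketch below uses plain DOMDocument and DOMXPath so that it runs without Symfony; innerHtmlByClass is a hypothetical helper name, not part of any library. It assumes PHP 7.1 or later (for the nullable return type) and a class name without double quotes, since the value is interpolated into the expression unescaped.

```php
<?php
/**
 * Returns the inner HTML of the first <p> whose class attribute equals
 * $className exactly, or null when no paragraph matches.
 * (Hypothetical helper, written for illustration.)
 */
function innerHtmlByClass(DOMDocument $doc, string $className): ?string
{
    $xpath = new DOMXPath($doc);

    // Exact attribute match, equivalent to the == comparison in the loop.
    // Assumption: $className contains no double quotes.
    $nodes = $xpath->query(sprintf('//p[@class="%s"]', $className));

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Serialize every child node so that nested tags such as <b> survive.
    $html = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $html .= $doc->saveHTML($child);
    }

    return $html;
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<p class="one">class one</p>'
    . '<p class="two">class two is the <b>good</b> class</p>'
);

echo innerHtmlByClass($doc, 'two'), PHP_EOL; // class two is the <b>good</b> class
```

With DomCrawler, an expression of the same form can be passed to filterXPath(), for example '//p[@class="two"]', which leaves only the child serialization step in the loop.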