DOM parser: removing tags with empty text nodes vs. the <br/> tag

Asked: 2011-07-24 16:01:13

Tags: php html xml xpath

I previously posted a question about removing HTML tags that contain only empty text nodes. This is the code I ended up with:

$dom = new DOMDocument();
$dom->loadHtml(
    '<p><strong><a href="http://xx.org.uk/dartmoor-arts">test</a></strong></p>
    <p><strong><a href="http://xx.org.uk/depw"></a></strong></p>
    <p><strong><a href="http://xx.org.uk/devon-guild-of-craftsmen"></a></strong></p>
    <p>this line has a <br/>break</p>
    '
);

$xpath = new DOMXPath($dom);


while(($nodeList = $xpath->query('//*[not(text()) and not(node())]')) && $nodeList->length > 0) {
    foreach ($nodeList as $node) {
        $node->parentNode->removeChild($node);
    }
}


echo $dom->saveHtml();

It works perfectly, but I don't want it to remove the <br/> tag. How can I keep it?

3 answers:

Answer 0 (score: 7)

Use this XPath expression (it excludes br nodes):

//*[not(text() or node() or self::br)]
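
As a minimal sketch of how this expression slots into the question's loop (the sample markup is shortened from the question, and the variable names $html, $query and $result are only illustrative):

```php
<?php
// Sample markup shortened from the question: one filled link,
// one empty link, and one paragraph containing a <br/>.
$html = '<p><strong><a href="http://xx.org.uk/dartmoor-arts">test</a></strong></p>'
      . '<p><strong><a href="http://xx.org.uk/depw"></a></strong></p>'
      . '<p>this line has a <br/>break</p>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Matches every element that has no child nodes and is not a <br>.
// (node() already covers text(), so the text() test is redundant but harmless.)
$query = '//*[not(text() or node() or self::br)]';

// Repeat until nothing matches: removing the empty <a> leaves its
// parent <strong> empty, which the next pass removes, and so on.
while (($nodeList = $xpath->query($query)) && $nodeList->length > 0) {
    foreach ($nodeList as $node) {
        $node->parentNode->removeChild($node);
    }
}

$result = $dom->saveHTML();
echo $result;
```

Because br elements never match the query, the loop still terminates once all other empty elements are gone. Note that saveHTML() serializes the kept element as <br> rather than <br/>.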

Answer 1 (score: 3)

Test $node before removing it, for example:

if (!in_array($node->nodeName, array('br'))) {  // add further nodes to keep
  $node->parentNode->removeChild($node);
}
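
One caveat when combining this check with the question's while loop as written: the kept br keeps matching the query, so $nodeList->length never drops to 0 and the loop does not terminate. A possible sketch that counts actual removals instead (the $keep list, the sample markup, and the names $removed and $result are assumptions for illustration):

```php
<?php
// Illustrative keep-list; extend it with any other elements to preserve.
$keep = array('br', 'hr', 'img');

$dom = new DOMDocument();
$dom->loadHTML('<p>line<br/>break</p><p><img src="x.png"/></p><p><em></em></p>');
$xpath = new DOMXPath($dom);

do {
    $removed = 0;
    // Select every element without child nodes, then filter in PHP.
    foreach ($xpath->query('//*[not(node())]') as $node) {
        if (!in_array($node->nodeName, $keep)) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }
    // Stop as soon as a full pass removes nothing, even if kept
    // elements such as <br> still match the query.
} while ($removed > 0);

$result = $dom->saveHTML();
echo $result;
```

Keeping the list in PHP rather than in the XPath string makes it easy to extend, at the cost of re-selecting the kept elements on every pass.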

Answer 2 (score: -3)

Try replacing the <br/> tags with something like [br/], then restore them afterwards.

A simple enough trick :)
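
For completeness, a rough sketch of that placeholder idea, assuming the input spells the tag exactly as <br/> and never contains the literal text [br/] (both are assumptions, as are the sample markup and variable names):

```php
<?php
$html = '<p>this line has a <br/>break</p><p><strong></strong></p>';

// Swap each <br/> for a text placeholder so the parser sees plain text.
$dom = new DOMDocument();
$dom->loadHTML(str_replace('<br/>', '[br/]', $html));
$xpath = new DOMXPath($dom);

// The question's original loop: remove childless elements until none remain.
while (($nodeList = $xpath->query('//*[not(node())]')) && $nodeList->length > 0) {
    foreach ($nodeList as $node) {
        $node->parentNode->removeChild($node);
    }
}

// Restore the placeholders after serializing.
$result = str_replace('[br/]', '<br/>', $dom->saveHTML());
echo $result;
```

This is fragile: other spellings such as <br> or <br /> are missed, and an element containing only a line break now holds text, so it is no longer treated as empty.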