How do I get only the head and tail text of certain DOMNodes?

Time: 2017-12-07 10:23:36

Tags: php xml domdocument

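How can I get only the head and tail text of a node in DOMDocument?

For example, in the sample code below I do not want to see the content of the nested <b> tag, yet both nodeValue and textContent include it: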

$dom = new DOMDocument();
$dom->loadXML('<?xml version="1.0" encoding="UTF-8"?>
<root>
  <test>head text <b>some bold text</b> tail text</test>
</root>
');

foreach ($dom->getElementsByTagName('test') as $node) {
    echo 'nodeValue: '.$node->nodeValue."\n";
    echo 'textContent:'.$node->textContent."\n";
}

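2 answers:

Answer 0: (score: 1)

You have to iterate over the child nodes of each node and pick out only the text (DOMText) children; any other child nodes can be ignored...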

$dom = new DOMDocument();
$dom->loadXML('<?xml version="1.0" encoding="UTF-8"?>
<root>
  <test>head text <b>some bold text</b> tail text</test>
</root>
');

foreach ($dom->getElementsByTagName('test') as $node) {
    // Walk only the direct children of <test>
    foreach ($node->childNodes as $sub) {
        // Keep text nodes; element children such as <b> are skipped
        if ($sub instanceof DOMText) {
            echo 'nodeValue: '.$sub->nodeValue."\n";
            echo 'textContent:'.$sub->textContent."\n";
        }
    }
}

This gives you...

nodeValue: head text 
textContent:head text 
nodeValue:  tail text
textContent: tail text
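
The output above still carries the surrounding spaces of each text node. As a minimal sketch of the same loop wrapped in a reusable function, the code below collects the direct text children and trims them. The helper name directTextParts is hypothetical and not part of the answer, and it assumes whitespace-only text nodes should be dropped.

```php
<?php
// Hypothetical helper (not from the answer): returns the trimmed text of
// every direct DOMText child of $node, skipping whitespace-only nodes.
function directTextParts(DOMNode $node): array
{
    $parts = [];
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMText) {
            $text = trim($child->nodeValue);
            if ($text !== '') {
                $parts[] = $text;
            }
        }
    }
    return $parts;
}

$dom = new DOMDocument();
$dom->loadXML('<root><test>head text <b>some bold text</b> tail text</test></root>');

$parts = directTextParts($dom->getElementsByTagName('test')->item(0));
print_r($parts); // two entries: "head text" and "tail text"
```

Returning an array keeps the head and tail parts separate, so the caller can decide whether to join them or use them individually.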

Answer 1: (score: 0)

As an alternative, you can also use DOMXPath with an XPath expression to get the text:

$dom = new DOMDocument();
$dom->loadXML('<?xml version="1.0" encoding="UTF-8"?>
<root>
  <test>head text <b>some bold text</b> tail text</test>
</root>
');

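$xpath = new DOMXPath($dom);
// text() selects only the text nodes that are direct children of <test>
$elements = $xpath->query('/root/test/text()');

foreach ($elements as $element) {
    // No newline is echoed, so the pieces are printed back to back
    echo $element->nodeValue;
}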
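
If the head and tail are needed as separate values, one possible variation is to evaluate XPath string expressions relative to each matched element. The sketch below is not from the answer: it assumes the head is the first direct text node and the tail is the last one, and it uses normalize-space() to strip the surrounding whitespace. The variable names are illustrative only.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<root><test>head text <b>some bold text</b> tail text</test></root>');

$xpath = new DOMXPath($dom);
$results = [];

foreach ($xpath->query('/root/test') as $test) {
    // Both expressions are evaluated with $test as the context node.
    // Assumption: first direct text node = head, last direct text node = tail.
    $head = $xpath->evaluate('normalize-space(text()[1])', $test);
    $tail = $xpath->evaluate('normalize-space(text()[last()])', $test);
    $results[] = ['head' => $head, 'tail' => $tail];
}

print_r($results); // one entry with head "head text" and tail "tail text"
```

Note that when an element has only one direct text node, text()[1] and text()[last()] refer to the same node, so head and tail come out identical; whether that is acceptable depends on the data.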
