Question

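I tried all the solutions posted on this question. It is similar to my problem, but its solutions did not work for me.

I am trying to get the plain text that sits outside <b> but inside <div id="maindiv">.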

<div id=maindiv>
     <b>I don't want this text</b>
     I want this text
</div>

$part is the object that contains <div id="maindiv">. I tried this:

$part->find('!b')->innertext;

The code above does not work. When I tried this:

$part->plaintext;

it returned all of the plain text, like this:

I don't want this text I want this text

I read the official documentation, but I could not find anything that solves this.

Answer 1

Query:

$selector->query('//div[@id="maindiv"]/text()[2]')

Explanation:

//               - selects nodes regardless of their position in tree

div              - selects elements whose node name is 'div'

[@id="maindiv"]  - selects only those divs having the attribute id="maindiv"

/                - sets focus to the div element

text()           - selects only the text nodes that are direct children of the div

[2]              - selects the second text node (the first is the whitespace before <b>)

                   Note! The actual position of the text element may depend on
                   your preserveWhitespace setting.

                   Manual: http://www.php.net/manual/de/class.domdocument.php#domdocument.props.preservewhitespace

Example:

$html = <<<EOF
<div id="maindiv">
     <b>I dont want this text</b>
     I want this text
</div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($html);

$selector = new DOMXPath($doc);

$node = $selector->query('//div[@id="maindiv"]/text()[2]')->item(0);
echo trim($node->nodeValue); // I want this text
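
Because the [2] index depends on how whitespace is handled, a position-independent variant may be more robust. The following is only a sketch, assuming the same sample markup as above: it filters out whitespace-only text nodes with the XPath function normalize-space() and joins whatever remains. The variable names $xpath, $parts and $text are illustrative, not part of the original answer.

```php
<?php
$html = <<<EOF
<div id="maindiv">
     <b>I don't want this text</b>
     I want this text
</div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);

// Select every text node that is a direct child of the div
// and is not made up of whitespace only.
$nodes = $xpath->query('//div[@id="maindiv"]/text()[normalize-space()]');

$parts = [];
foreach ($nodes as $node) {
    // Strip the surrounding indentation and line breaks.
    $parts[] = trim($node->nodeValue);
}

$text = implode(' ', $parts);
echo $text; // I want this text
```

This also picks up text that appears before the <b> element or between several child elements, which the fixed [2] index would miss.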

Answer 2

Remove the <b> first:

$part->find('b', 0)->outertext = '';
echo $part->innertext; // I want this text
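
For readers who do not have Simple HTML DOM available, the same remove-then-read idea can be sketched with PHP's built-in DOM extension. This is an assumption-laden illustration rather than part of the original answer: the sample markup is copied from the question, and $div and $text are hypothetical names.

```php
<?php
$html = <<<EOF
<div id="maindiv">
     <b>I don't want this text</b>
     I want this text
</div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($html);

// Locate the div via XPath.
$xpath = new DOMXPath($doc);
$div = $xpath->query('//div[@id="maindiv"]')->item(0);

// Copy the live node list into an array first, so that removing
// nodes does not disturb the iteration.
foreach (iterator_to_array($div->getElementsByTagName('b')) as $b) {
    $b->parentNode->removeChild($b);
}

// With the <b> elements gone, only the wanted text remains.
$text = trim($div->textContent);
echo $text; // I want this text
```

Note that this modifies the loaded document, just as assigning an empty outertext does in the snippet above.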

Simple HTML DOM Parser - get all plaintext rather than the text of certain elements
