Question

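I'm using PHP Simple HTML DOM to fetch some HTML. I have the HTML shown below, and I need the plain text inside the div while skipping the p tags and their contents (so that only 111111 is returned). Can anyone help? Thanks in advance!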

<div>
    <p>00000000</p>
    111111
    <p>22222222</p>
</div>

Answer 1

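It depends on what you mean by "avoiding the p tags".

If you only want to remove the tags themselves, running strip_tags() on it should do what you need.

If you really want just "111111" returned (i.e. the tags stripped together with their contents), that is not a viable solution. For that, something like this may work: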

$myDiv = $html->find('div', 0); // whichever div you end up with; index 0 returns one element instead of an array
$children = $myDiv->children; // get an array of child elements
foreach ($children as $child) {
    $child->outertext = ''; // This removes the element, but MAY NOT remove it from the original $myDiv
}
echo $myDiv->innertext;
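
If blanking the children does not carry over to the div as hoped, another route is to collect only the text nodes that sit directly inside the div. The following is a minimal sketch that swaps Simple HTML DOM for PHP's bundled DOM extension (DOMDocument and DOMXPath), so it is not the approach from the answer above. The function name directText is hypothetical, and the sample markup is copied from the question.

```php
<?php
// Sketch: return the text directly inside the first <div>,
// skipping child elements such as <p> together with their contents.
function directText(string $html): string
{
    $doc = new DOMDocument();
    // Suppress parser warnings from imperfect markup.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    $text = '';
    // text() selects only text-node children of the div,
    // so text nested inside <p> is never visited.
    foreach ($xpath->query('(//div)[1]/text()') as $node) {
        $text .= $node->nodeValue;
    }
    // The text nodes carry the surrounding indentation; trim it off.
    return trim($text);
}

$html = "<div>\n    <p>00000000</p>\n    111111\n    <p>22222222</p>\n</div>";
echo directText($html);
```

This leaves the parsed document untouched, which may be preferable when the same tree is needed again later.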

Answer 2

If your text is always in the same position, try this:

$html->find('text', 2)->plaintext; // should return 111111

Answer 3

Here is my solution.

I wanted to get only the main text part.

 $title_obj = $article->find(".ofr-descptxt",0); //Store the Original Tree  ie) h3 tag
 $title_obj->children(0)->outertext = ""; //Unset <br/>
 $title_obj->children(1)->outertext = "";  //Unset the last Span
 echo $title_obj; //It has only first element

Edit: if you get a PHP error, try wrapping the calls in an if/else, or try my lazy code:

   ($title_obj->children(0))?$title_obj->children(0)->outertext="":"";
   ($title_obj->children(1))?$title_obj->children(1)->outertext = "":"";

Official Documentation

Answer 4

$wordlist = array("<p>", "</p>");

foreach($wordlist as $word)
     $string = str_replace($word, "", $string);
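
Note that the loop above removes only the tag markers and keeps the text between them, so 00000000 and 22222222 would remain. If the paragraph contents should go as well, a regular expression is one possibility. This is a rough sketch, assuming the p elements are not nested and the markup is as simple as in the question; regular expressions are fragile on arbitrary HTML. The function name stripParagraphs is hypothetical.

```php
<?php
// Sketch: delete every <p>...</p> pair including its contents,
// then strip whatever tags remain.
function stripParagraphs(string $html): string
{
    // 'i' = case-insensitive, 's' = let '.' match newlines,
    // '.*?' = non-greedy so each paragraph is matched separately.
    $withoutP = preg_replace('#<p\b[^>]*>.*?</p>#is', '', $html);
    // Remove the remaining tags (e.g. the wrapping <div>) and trim whitespace.
    return trim(strip_tags($withoutP));
}

$string = "<div>\n    <p>00000000</p>\n    111111\n    <p>22222222</p>\n</div>";
echo stripParagraphs($string);
```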

PHP Simple Html Dom: get the plain text of a div but avoid all other tags
