Question

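I need to extract the text between divs ("The third of four...") using the Simple HTML Dom PHP library.

I have tried everything I can think of! next_sibling() returns the comment, and next_sibling()->next_sibling() returns the <br/> tag. Ideally, I would like to get all the text from the end of the first comment up to the next </div> tag.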

<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society's Morning Melodies series features...<a href='index.php?page=tickets&month=20140201'>&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->

The code below prints the comment tag.

// Find the content that follows the div with class "float". There is a comment in between.
$div_float = $html->find("div.float");
$betweendivs = $div_float[0]->next_sibling();
$actual_content = $betweendivs->outertext;
echo $actual_content;

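My next step would be to get the innertext of div.left and then remove all the divs inside it, but that seems like a lot of hassle. Is there anything easier I can do?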

Answer 1

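Use find('text') to get all text blocks, or find('text', $index) to get a single one, where $index is the zero-based index of the text block you want...

So in this case, it is: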

echo $html->find('text', 3);

// OUTPUT:
The third of four performances in the Society's Morning Melodies series features...

You can read more in the Manual.

Edit

Here is working code:

// The path to the library is an assumption; adjust it as needed
include 'simple_html_dom.php';

$input = '<div class="left"> Bla-bla.. <div class="float">Bla-bla... </div> <br />The third of four performances in the Society\'s Morning Melodies series features...<a href="index.php?page=tickets&month=20140201"><< Back to full event listing</a> </div>';

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load($input);

// Using $index
echo $html->find('text', 3);
echo "<hr>";

// Or, it's the 3rd element starting from the end
$text = $html->find('text');
echo $text[count($text)-3];

// Clear DOM object
$html->clear();
unset($html);

// OUTPUT:
The third of four performances in the Society's Morning Melodies series features...
The third of four performances in the Society's Morning Melodies series features...

Working DEMO

Answer 2

Why not use ->plaintext on div.left? It outputs the text as needed.

// find() without an index returns an array, so pass 0 to get the first match
echo $html->find("div[class=left]", 0)->plaintext;


Answer 3

I actually don't think Simple HTML Dom provides a tool for doing this, since there is no "get before" or "get after" type of command. Please let me know if I'm wrong.
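For what it's worth, if switching libraries is an option, XPath has a "get after" of sorts: the following-sibling axis. Below is a minimal sketch that uses PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of Simple HTML Dom, so it is a different technique from the one asked about. The variable names are just for illustration, and the markup is copied from the question.

```php
<?php
// Sketch only: uses PHP's built-in DOM extension, NOT Simple HTML Dom.
$input = <<<'HTML'
<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society's Morning Melodies series features...<a href='index.php?page=tickets&month=20140201'>&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->
HTML;

// Collect parser warnings (e.g. the bare "&" in the href) instead of printing them
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($input);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select every text node that is a later sibling of div.float.
// Comments, <br> and <a> are not text nodes, so they are skipped,
// and the link text is excluded because it lives inside <a>.
$nodes = $xpath->query('//div[@class="float"]/following-sibling::text()');

$between = '';
foreach ($nodes as $node) {
    $between .= $node->nodeValue;
}

// Drop the surrounding whitespace left over from the markup's indentation
$between = trim($between);

echo $between, "\n";
```

The query only looks at direct siblings of div.float, so it stops at the closing </div> of div.left without any extra work.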

Simple HTML Dom - find text between divs

3 answers: