PHP Simple HTML DOM parsing

Asked: 2015-01-13 12:19:06

Tags: php html simple-html-dom

I want to use a DOM parser to extract some information from a piece of HTML, but I am stuck.

<div id="posts">
    <div class="post">
        <div class="user">me:</div>
        <div class="post">I am an apple</div>
    </div>
    <div class="post">
        <div class="user">you:</div>
        <div class="post">I am a banana</div>
    </div>
    <div class="post">
        <div class="user">we:</div>
        <div class="post">We are fruits</div>
    </div>
</div>

This prints the users:

$users= $html->find('div[class=user]');
foreach($users as $user)
    echo $user->innertext;

This prints the posts:

$posts = $html->find('div[class=post]');
foreach($posts as $post)
    echo $post->innertext;

I want to print them together rather than separately, like this:

me:
I am an apple
you:
I am a banana
we:
We are fruits

How can I do this with the parser?

3 answers:

Answer 0: (score: 1)

Assuming you are using Simple HTML DOM Parser, you can pass find() a comma-separated list of selectors. Try this:

$posts = $html->find('div.post');
foreach($posts as $post){
  $children = $post->find('div.user,div.post');
  foreach($children as $child){
    echo $child->class.' -- ';
    echo $child->innerText(); echo '<br>';
  }
}

Output:

user -- me:
post -- I am an apple
user -- you:
post -- I am a banana
user -- we:
post -- We are fruits

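Answer 1: (score: 1)

With the markup you provided, you can select the main div (div#posts) and loop over its children. Then, for each child, take its first and second child elements: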

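// Each direct child of div#posts is a wrapper holding a user div and a message div.
foreach ($html->find('div#posts', 0)->children() as $wrapper) {
    $user = $wrapper->children(0)->innertext; // first child: div.user
    $post = $wrapper->children(1)->innertext; // second child: inner div.post
    echo $user . '<br/>' . $post . '<hr/>';
}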

That said, I would really recommend using DOMDocument instead:

$dom = new DOMDocument;
$dom->loadHTML($html_markup);
$xpath = new DOMXpath($dom);
$elements = $xpath->query('//div[@id="posts"]/div[@class="post"]');
foreach($elements as $posts) {
    $user = $xpath->evaluate('string(./div[@class="user"])', $posts);
    $post = $xpath->evaluate('string(./div[@class="post"])', $posts);
    echo $user . '<br/>' . $post . '<hr/>';
}
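
To try the DOMDocument approach without any third-party library, here is a minimal self-contained sketch built on the markup from the question. The helper name extractPosts() and the array keys 'user' and 'post' are made up for illustration, and the libxml error handling is an addition that the snippet above does not include.

```php
<?php
// Sketch: pair each user with their message using PHP's built-in DOM extension.
// extractPosts() is a hypothetical helper name, not part of any library.
function extractPosts(string $markup): array
{
    $dom = new DOMDocument();
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $pairs = [];
    // Only the direct children of #posts are wrappers; the inner div.post
    // (the message itself) sits one level deeper.
    foreach ($xpath->query('//div[@id="posts"]/div[@class="post"]') as $wrapper) {
        $pairs[] = [
            'user' => trim($xpath->evaluate('string(./div[@class="user"])', $wrapper)),
            'post' => trim($xpath->evaluate('string(./div[@class="post"])', $wrapper)),
        ];
    }
    return $pairs;
}

$markup = <<<HTML
<div id="posts">
    <div class="post">
        <div class="user">me:</div>
        <div class="post">I am an apple</div>
    </div>
    <div class="post">
        <div class="user">you:</div>
        <div class="post">I am a banana</div>
    </div>
    <div class="post">
        <div class="user">we:</div>
        <div class="post">We are fruits</div>
    </div>
</div>
HTML;

// Print each user followed by their message, one per line.
foreach (extractPosts($markup) as $pair) {
    echo $pair['user'], "\n", $pair['post'], "\n";
}
```

Returning the pairs as an array rather than echoing inside the loop keeps the extraction separate from the output format, so the same data can be rendered with <br/> tags, newlines, or anything else.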

Answer 2: (score: 0)

Use the following code:

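// Select only the inner message divs. A bare div[class=post] would also match
// the three wrapper divs, so the indexes would no longer line up with $users.
$users = $html->find('div[class=user]');
$posts = $html->find('div[class=post] div[class=post]');
foreach ($users as $i => $user) {
    echo $user->innertext . "<br>";
    echo $posts[$i]->innertext . "<br>";
}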

Hope this helps.
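
One thing to watch when pairing results by index: in the question's markup both the wrapper and the message carry class="post", so a plain class match returns more elements than there are users. The sketch below checks this with PHP's built-in DOMXPath as a stand-in for the Simple HTML DOM selectors. The helper name countMatches() is hypothetical, and the XPath expressions are my own equivalents of the selectors, not something taken from the answers.

```php
<?php
// Sketch: count how many elements an XPath query matches in an HTML fragment.
// countMatches() is a hypothetical helper name used only for this illustration.
function countMatches(string $markup, string $query): int
{
    $dom = new DOMDocument();
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return (new DOMXPath($dom))->query($query)->length;
}

// Same structure as the markup in the question, written on fewer lines.
$markup = '<div id="posts">'
    . '<div class="post"><div class="user">me:</div><div class="post">I am an apple</div></div>'
    . '<div class="post"><div class="user">you:</div><div class="post">I am a banana</div></div>'
    . '<div class="post"><div class="user">we:</div><div class="post">We are fruits</div></div>'
    . '</div>';

// Users, every div with class "post", and only the nested message divs.
echo countMatches($markup, '//div[@class="user"]'), "\n";
echo countMatches($markup, '//div[@class="post"]'), "\n";
echo countMatches($markup, '//div[@class="post"]/div[@class="post"]'), "\n";
```

The first and third counts agree with each other, while the second is twice as large because it includes the wrappers. Index-based pairing is only safe when the selector is restricted to the nested message divs.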