Simple HTML DOM Parser question

Asked: 2014-04-22 23:08:20

Tags: php html-parsing simpledom

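I'm trying to understand how simple_html_dom works and pull some data from a specific category, so I ran some tests on a WordPress site.

I took the code from the official site and modified it slightly to fit: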

// Create DOM from URL
$html = file_get_html('http://localhost/mydemosite/category/sports/');

// Find all article blocks
foreach($html->find('div.article') as $article) {
    $item['title']   = $article->find('div.post-title', 0)->plaintext;
    $item['thumb']   = $article->find('div.post-thumbnail', 0)->plaintext;
    $item['details'] = $article->find('div.entry', 0)->plaintext;
    $articles[] = $item;
}

print_r($articles);

When I run it, I get an error:

Notice: Undefined variable: articles in C:\xampp\htdocs\mydemosite\test.php on line 28

Line 28 is print_r($articles);

The markup I'm parsing looks like this:

<article class="item-list item_1">
<h2 class="post-title"><a href="http://localhost/mydemosite/category/sports/demo-post" title="mydemo post" rel="bookmark">my demo post 1</a></h2>
<p class="post-meta">
<span class="tie-date">2 mins ago</span>    
<span class="post-comments">
<a href="http://localhost/mydemosite/category/sports/demo-post/#disqus_thread" title="my demo post 1" data-disqus-identifier="1 http://localhost/mydemosite/category/sports/?p=1"></a></span>
</p>
<div class="post-thumbnail">
<a href="http://localhost/mydemosite/category/sports/demo-post/" title="my demo post 1" rel="bookmark">
<img width="300" height="160" src="http://localhost/mydemosite/wp-content/uploads/demo-post-300x160.jpg" class="attachment-tie-large wp-post-image" alt="my demo post 1">
</a>
</div>
<!-- post-thumbnail /-->
<div class="entry">
<p>Hello world... this is a demo post description, so if you want to read more...</p>
<a class="more-link" href="http://localhost/mydemosite/category/sports/demo-post">Read More »</a>
</div>
<div class="clear"></div>
</article>

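1 answer:

Answer 0 (score: 1)

$articles is only ever assigned inside the foreach loop. If find() returns no matches, the loop body never runs, so the variable does not exist when print_r() is called, hence the notice. (With the markup you posted, 'div.article' matches nothing, because the blocks are <article class="item-list"> elements.) Initialize the array before the loop: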

// Create DOM from URL
$html = file_get_html('myurl');
$articles = array(); // defined up front, so it exists even if nothing matches
// Find all article blocks
foreach($html->find('div.article') as $article) {
    $item['title']   = $article->find('div.post-title', 0)->plaintext;
    $item['thumb']   = $article->find('div.post-thumbnail', 0)->plaintext;
    $item['details'] = $article->find('div.entry', 0)->plaintext;
    $articles[] = $item; // append each item instead of overwriting the array
}

print_r($articles);
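
Initializing the array gets rid of the notice, but with the markup shown in the question the result would probably still be empty, since the selectors don't line up with it: the blocks are article elements with the class item-list, and the title is an h2 rather than a div. Below is a rough sketch under those assumptions. collect_articles() and first_text() are hypothetical helper names, not part of simple_html_dom, and reading the img src for the thumbnail is my guess at what is wanted there (the thumbnail div has no text of its own).

```php
<?php
// Sketch only. $html is assumed to be the object returned by
// simple_html_dom's file_get_html() or str_get_html().

// Hypothetical helper: plain text of the first match, or null if none.
function first_text($node, $selector)
{
    $match = $node->find($selector, 0); // null when nothing matches
    return $match ? trim($match->plaintext) : null;
}

// Hypothetical helper: one array entry per <article class="item-list ...">.
function collect_articles($html)
{
    $articles = array(); // defined even when no block matches

    foreach ($html->find('article.item-list') as $article) {
        $img = $article->find('div.post-thumbnail img', 0);
        $articles[] = array(
            'title'   => first_text($article, 'h2.post-title'),
            'thumb'   => $img ? $img->src : null, // assumption: image URL wanted
            'details' => first_text($article, 'div.entry'),
        );
    }

    return $articles;
}

// Usage, assuming simple_html_dom.php has already been included:
// $html = file_get_html('http://localhost/mydemosite/category/sports/');
// print_r(collect_articles($html));
```

The null checks matter because find($selector, 0) returns null when nothing matches, and reading ->plaintext on null would raise another notice.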