Get HTML code using DOM.

Time: 2014-05-19 04:49:02

Tags: php dom

I am collecting the user name, title, video play count, and comment count from an HTML source. Consider this link

Here is my code:

function getParameter($url)
{
    $html = file_get_html($url);
    if($html)
    {
            $containers1 = $html->find('div.v div.v-link v-meta-entry');
            foreach($containers1 as $container)
            {               
                $plays = $container->find('v-num'); // get nos of time video played
                $item = new stdClass();
                foreach($plays as $play)
                {
                    $nos = $play->plaintext; 
                }
                //echo $address;
            }
             $containers2 = $html->find('div.v div.v-link v-meta va a'); //get user name
            foreach($containers2 as $username)
            {
                $user = $username->plaintext;
            }
             $containers3 = $html->find('div.v div.v-link a'); //get video title
            foreach($containers3 as $title)
            {
                $title = $title->plaintext;
            }
            $commentcontainers = $html->find('div.ico-statcomment span'); //get nos of comments
            foreach($commentcontainer as $cont)
            {
                $comments = $cont->plaintext;
            }
            return $data;               
    }
}

But it gives this error:

Warning: Invalid argument supplied for foreach() in /var/www/html/scrap/yelp/yoku.php on line 41 

Any hints?

Here is the relevant snippet of the page source:

<div class="v" >

    <div class="v-thumb">
        <img src="http://r1.ykimg.com/0515000052B7B3D46714C0318106EA36" alt="和小姚晨约会激动抽筋" />
                            <div class="v-thumb-tagrb"><span class="v-time">08:56</span></div>
            </div>
    <div class="v-link">
        <a href="http://v.youku.com/v_show/id_XNjQ0MTAyMzQ0.html" target="video" title="和小姚晨约会激动抽筋"></a>
    </div>
        <div class="v-meta va">
                            <div class="v-meta-neck">
                        <a class="v-useravatar" href="http://i.youku.com/u/UMTQxNDg3NzA4"  target="_blank"><img title="嘻哈四重奏" src="http://g3.ykimg.com/0130391F455211D40E180C021BBB97D37C4057-26F4-2F53-A8CD-4A4139415701"></a>
                        <span class="v-status">&nbsp;</span>
                        <a class="v-username" href="http://i.youku.com/u/UMTQxNDg3NzA4"  target="_blank">嘻哈四重奏</a>
                    </div>
                        <div class="v-meta-title"><a href="http://v.youku.com/v_show/id_XNjQ0MTAyMzQ0.html" target="video">和小姚晨约会激动抽筋</a></div>
                                <div class="v-meta-entry">
            <i class="ico-statplay" title="播放"></i><span class="v-num">588万</span>&nbsp;<i title="评论" class="ico-statcomment"></i><span class="v-num">1,290</span>      </div>
                        <div class="v-meta-overlay"></div>
            </div>
    </div>

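1 answer:

Answer 0 (score: 1)

Check your selectors against the simplehtmldom documentation: a class needs a leading dot ('.v-num', not 'v-num'), and an element with two classes such as class="v-meta va" cannot be written as 'v-meta va'. The warning itself most likely comes from the foreach over $commentcontainer, which is never assigned (the result is stored in $commentcontainers). As an example, I made some changes to the selectors.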

function getParameter($url)
{
    $data = array();
    $html = file_get_html($url);
    if ($html)
    {
        // Iterate over every 'div.v' and read the fields of each video separately.
        foreach ($html->find('div.v') as $div)
        {
            $item = new stdClass();

            // Both counters are span.v-num inside div.v-meta-entry:
            // the first is the play count, the second is the comment count.
            $nums = $div->find('div.v-meta-entry span.v-num');
            $item->plays    = isset($nums[0]) ? trim($nums[0]->plaintext) : null;
            $item->comments = isset($nums[1]) ? trim($nums[1]->plaintext) : null;

            // User name: the a.v-username link, not the avatar link next to it.
            $user = $div->find('a.v-username', 0);
            $item->user = $user ? trim($user->plaintext) : null;

            // Video title: the link in div.v-meta-title carries the text;
            // the link in div.v-link is empty and only has a title attribute.
            $title = $div->find('div.v-meta-title a', 0);
            $item->title = $title ? trim($title->plaintext) : null;

            $data[] = $item;
        }
    }

    // Always return an array so the caller can safely foreach over the result.
    return $data;
}
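
For comparison, here is a minimal sketch of the same extraction using PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of simple_html_dom. It is not part of the original answer. The function names hasClass and parseVideoBlocks, the array keys, and the cut-down sample markup are assumptions made for illustration; the class names come from the snippet in the question.

```php
<?php
// Hypothetical helper: builds an XPath test that matches one class name
// inside a space-separated class attribute (XPath 1.0 has no class selector).
function hasClass($name)
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' $name ')";
}

// Hypothetical function: returns one associative array per div.v block.
function parseVideoBlocks($markup)
{
    libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
    $doc = new DOMDocument();
    // The XML prolog is a common trick to make loadHTML() treat the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $markup);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $items = array();
    foreach ($xpath->query('//div[' . hasClass('v') . ']') as $div)
    {
        // First span.v-num is the play count, second is the comment count.
        $nums  = $xpath->query('.//div[' . hasClass('v-meta-entry') . ']/span[' . hasClass('v-num') . ']', $div);
        $user  = $xpath->query('.//a[' . hasClass('v-username') . ']', $div);
        $title = $xpath->query('.//div[' . hasClass('v-meta-title') . ']/a', $div);

        $items[] = array(
            'user'     => $user->length      ? trim($user->item(0)->textContent)  : null,
            'title'    => $title->length     ? trim($title->item(0)->textContent) : null,
            'plays'    => $nums->length > 0  ? trim($nums->item(0)->textContent)  : null,
            'comments' => $nums->length > 1  ? trim($nums->item(1)->textContent)  : null,
        );
    }
    return $items;
}

// Usage with a cut-down, ASCII-only stand-in for the snippet from the question.
$sample = '<div class="v">'
    . '<div class="v-meta va">'
    . '<div class="v-meta-neck"><a class="v-username" href="#">someuser</a></div>'
    . '<div class="v-meta-title"><a href="#">Some title</a></div>'
    . '<div class="v-meta-entry"><span class="v-num">588</span><span class="v-num">1,290</span></div>'
    . '</div></div>';

print_r(parseVideoBlocks($sample));
```

Because every lookup is checked before it is read, a block that lacks one of the elements yields null for that field rather than a warning.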