How to get content from a div container with two class names (IMDb)

Asked: 2018-10-24 17:34:30

Tags: php xpath domdocument

Sorry for my English.

Hi everyone,

When I try to query the content of a DIV container on a page fetched by URL, I get a blank page.

$html = file_get_contents('https://www.imdb.com/search/title?title_type=feature,tv_movie&release_date=,2018'); //get the html returned from the following url

$doc = new DOMDocument();

libxml_use_internal_errors(TRUE); //disable libxml errors

if(!empty($html)){ //if any html is actually returned

    $doc->loadHTML($html);
    libxml_clear_errors(); //remove errors for yucky html

    $xpath = new DOMXPath($doc);

    // select the <a> inside every div that carries both class names
    $rows = $xpath->query("//div[contains(@class, 'lister-item-image') and contains(@class, 'float-left')]/a");

    if($rows->length > 0){
        foreach($rows as $row){
            // nodeValue returns text content only, not markup or attributes
            echo $row->nodeValue . "<br/>";
        }
    }
}

The content can be found in this DIV:

<div class="lister-item-image float-left">


<a href="/title/tt1502407/?ref_=adv_li_i"
> <img alt="Halloween"
class="loadlate"
loadlate="https://m.media-amazon.com/images/M/MV5BMmMzNjJhYjUtNzFkZi00MWQ4LWJiMDEtYWM0NTAzNGZjMTI3XkEyXkFqcGdeQXVyOTE2OTMwNDk@._V1_UX67_CR0,0,67,98_AL_.jpg"
data-tconst="tt1502407"
height="98"
src="https://m.media-amazon.com/images/G/01/imdb/images/nopicture/large/film-184890147._CB470041630_.png"
width="67" />
</a>        </div>
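
Note that the `<a>` in this DIV contains only an `<img>` and no text, so its `nodeValue` is just whitespace, which would explain the blank page. The following is a minimal sketch, assuming the markup shown above, that reads the `href` and `alt` attributes instead. The `$sample` string is a shortened stand-in for the downloaded page, and `$items` is a name introduced here for illustration.

```php
<?php
// Shortened stand-in for the IMDb markup shown above (assumption: same structure).
$sample = '<div class="lister-item-image float-left">
<a href="/title/tt1502407/?ref_=adv_li_i"> <img alt="Halloween" class="loadlate" data-tconst="tt1502407" height="98" width="67" />
</a></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($sample);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$links = $xpath->query("//div[contains(@class, 'lister-item-image') and contains(@class, 'float-left')]/a");

$items = [];
foreach ($links as $link) {
    // The anchor has no useful text, so read attributes instead of nodeValue.
    $img = $xpath->query('./img', $link)->item(0);
    $items[] = [
        'href'  => $link->getAttribute('href'),
        'title' => $img !== null ? $img->getAttribute('alt') : '',
    ];
}

foreach ($items as $item) {
    echo $item['title'] . ' => ' . $item['href'] . "<br/>";
}
```

With the real page, `$sample` would be replaced by the string returned from `file_get_contents()`.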

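I mainly want to extract the title, link, genre and runtime. At most 50 entries should be displayed, and the next 50 should be fetched by following the "Next" link.

Thanks in advance for your help.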

1 Answer:

Answer 0: (score: 0)

Working version:

Thanks to Mohammad.

$html = file_get_contents('https://www.imdb.com/search/title?title_type=feature,tv_movie&release_date=,2018'); //get the html returned from the following url

$doc = new DOMDocument();

libxml_use_internal_errors(TRUE); //disable libxml errors

if(!empty($html)){ //if any html is actually returned

    $doc->loadHTML($html);
    libxml_clear_errors(); //remove errors for yucky html

    $xpath = new DOMXPath($doc);

    // select every div that carries both class names
    $rows = $xpath->query("//div[contains(@class, 'lister-item-image') and contains(@class, 'float-left')]");

    if($rows->length > 0){
        foreach($rows as $row){
            // saveHTML() serializes the node with its markup, so the <a> and <img> are printed
            echo $doc->saveHTML($row) . "<br/>";
        }
    }
}
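
The question also asks for title, link, genre and runtime plus the "Next" link. The following is a rough sketch of how that could look, not a verified solution for the live site. The class names `lister-item`, `lister-item-header`, `genre`, `runtime` and `lister-page-next` are assumptions that do not appear in the question, the sample markup and its values are made up, and `hasClass()` is a hypothetical helper. The helper matches a whole class token, because a plain `contains(@class, 'lister-item')` would also match `lister-item-image`.

```php
<?php
// Made-up sample markup; all class names except lister-item-image/float-left are assumptions.
$sample = <<<'HTML'
<div class="lister-item mode-advanced">
  <div class="lister-item-image float-left">
    <a href="/title/example/"><img alt="Example Movie" /></a>
  </div>
  <div class="lister-item-content">
    <h3 class="lister-item-header"><a href="/title/example/">Example Movie</a></h3>
    <span class="runtime">90 min</span>
    <span class="genre"> Drama </span>
  </div>
</div>
<a class="lister-page-next next-page" href="/search/title?start=51">Next</a>
HTML;

// Hypothetical helper: XPath 1.0 test that @class contains $name as a whole token.
function hasClass(string $name): string {
    return "contains(concat(' ', normalize-space(@class), ' '), ' $name ')";
}

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($sample);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$movies = [];
foreach ($xpath->query('//div[' . hasClass('lister-item') . ']') as $item) {
    // Look up the title link relative to the current list item.
    $link = $xpath->query('.//h3[' . hasClass('lister-item-header') . ']/a', $item)->item(0);
    if ($link === null) {
        continue;   // skip entries without a title link
    }
    $movies[] = [
        'title'   => trim($link->textContent),
        'link'    => $link->getAttribute('href'),
        'genre'   => trim($xpath->evaluate('string(.//span[' . hasClass('genre') . '])', $item)),
        'runtime' => trim($xpath->evaluate('string(.//span[' . hasClass('runtime') . '])', $item)),
    ];
}

// Relative URL of the next result page, or an empty string if there is none.
$next = $xpath->evaluate('string(//a[' . hasClass('lister-page-next') . ']/@href)');

foreach ($movies as $movie) {
    echo implode(' | ', $movie) . "<br/>";
}
echo 'Next: ' . $next . "<br/>";
```

To page through results, the relative `$next` value would need the site's scheme and host prepended before it is passed to `file_get_contents()` again, and the loop would stop once `$next` is empty.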