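I am trying to scrape a movie site that uses pagination. I want to parse every movie item on page 1 and, once that is done, have the parser move on to the next page. I wrote a parser that runs, but it does not parse all the movie items on the page and it does not continue to another page. I want to detect when one result has been parsed and move on to the next item, then detect when all movie items have been parsed and move on to the next page. When I run the parser, it should display each movie's title, year, and so on, one by one, and then continue with the next page. At the moment it only displays/parses one movie item from page 1 and then stops. Here is my code and an example:
Parsing example: http://minerbitco.in/parse/parse.php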
<?php
include_once 'simple_html_dom.php';
$page = (!isset($_GET['page'])) ? 1 : $_GET['page'];
echo '<br> Parsing Page #'.$page.'<br><br>';
$html = file_get_html('https://srulad.com/movies/type/movie#page-'.$page);
$obj = $html->find('div.movie_item');
$datas = [];
if($obj){
    foreach ($obj as $key => $data) {
        $movie_url = 'https://srulad.com/'.$data->find('div.poster a', 0)->href;
        $html2 = file_get_html($movie_url);
        $item['url'] = $movie_url;
        $item['year'] = $html2->find('#movie_content > div', 0)->children(2)->find('div', 0)->children(0)->children(1)->plaintext;
        $item['genre'] = $html2->find('#movie_content > div', 0)->children(1)->find('span', 0)->plaintext;
        $item['description'] = $html2->find('#movie_content > div', 0)->children(1)->find('div.plot', 0)->plaintext;
        $item['imdb_rating'] = $html2->find('#movie_content > div', 0)->children(2)->find('div', 0)->children(1)->children(1)->find('span', 0)->plaintext;
        $item['englishtitle'] = $html2->find('#movie_content > div', 0)->children(1)->find('h2.newmt', 0)->plaintext;
        $item['geotitle'] = $html2->find('#movie_content > div', 0)->children(1)->find('h3.newmt', 0)->plaintext;
        $item['poster'] = $html2->find('#movie_content > div', 0)->children(0)->find('img', 0)->src;
        $url = $item['url'];
        $year = $item['year'];
        $desc = $item['description'];
        $rating = $item['imdb_rating'];
        $poster = $item['poster'];
        $engtitle = $item['englishtitle'];
        $geotitle = $item['geotitle'];
        $genre = $item['genre'];
    }
}
if ($data === end($obj)) {
    echo '<META http-equiv="refresh" content="10;URL=#page-'.($page+1).'">';
}
else {
    echo "dasrulebulia.";
}
echo 'URL: '.$url.'<br>';
echo 'პოსტერის URL: '.$poster.'<br>';
echo 'სათაური ინგლისურად: '.$engtitle.'<br>';
echo 'სათაური ქართულად: '.$geotitle.'<br>';
echo 'წელი:'.$year.'<br>';
echo 'ჟანრი:'.$genre.'<br>';
echo 'აღწერა:'.$desc.'<br>';
echo 'რეიტინგი:'.$rating.'<br>';
?>
Answer 0 (score: 0)
You can try a parser I wrote:
https://github.com/sachinsinghshekhawat/simple-html-dom-parser-php
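If you would rather keep your own script, two things in the posted code look like the likely cause. First, the echo statements sit after the foreach loop, so only the values of the last movie are ever printed. Second, the page number is placed in a URL fragment (#page-N); a fragment is never sent to the server, so every request fetches the same listing, and the meta refresh to #page-N never changes $_GET['page']. Below is a minimal sketch of both changes, not a drop-in fix. The helper names (buildListingUrl, renderMovie, nextPageTag), the ?page= query parameter, and the sample data are assumptions for illustration only; the real paginated URL has to be read from the requests the site itself makes when you switch pages.

```php
<?php
// Hypothetical helper: builds the listing URL for a page number.
// ASSUMPTION: the site accepts a "?page=N" query parameter. A "#page-N"
// fragment is stripped before the HTTP request, so it cannot select a page.
function buildListingUrl(string $baseUrl, int $page): string
{
    return $baseUrl . '?' . http_build_query(['page' => $page]);
}

// Hypothetical helper: formats ONE movie as HTML.
// It is meant to be called inside the loop, once per item.
function renderMovie(array $item): string
{
    $out = '';
    foreach ($item as $label => $value) {
        $out .= htmlspecialchars((string) $label) . ': '
              . htmlspecialchars(trim((string) $value)) . '<br>';
    }
    return $out . '<hr>';
}

// Hypothetical helper: meta refresh that reloads this script with the
// next page number in the query string, so $_GET['page'] really changes.
function nextPageTag(int $page, int $delaySeconds = 10): string
{
    return '<meta http-equiv="refresh" content="' . $delaySeconds
         . ';URL=?page=' . ($page + 1) . '">';
}

$page = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;

// Placeholder data standing in for the scraped results. In the real script
// each entry would be filled from file_get_html() on the movie page,
// exactly as in the question.
$items = [
    ['URL' => 'https://example.com/movie/1', 'Year' => '2000'],
    ['URL' => 'https://example.com/movie/2', 'Year' => '2001'],
];

foreach ($items as $item) {
    echo renderMovie($item);   // output happens per item, inside the loop
}

// Move on only when the page produced items; otherwise report the end.
echo $items ? nextPageTag($page) : 'dasrulebulia.';
```

In the original script this would mean moving the block of echo lines inside the foreach, fetching buildListingUrl(...) instead of the #page- URL, and emitting the refresh tag once the loop has finished. If the listing is filled in by JavaScript, file_get_html will not see the items at all, and the request the page makes in the background would have to be fetched instead.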