请问我的英语。
我使用Rollingcurl抓取各种页面。
滚动卷曲:https://github.com/LionsAd/rolling-curl
我的课:
<?php
class Imdb
{
private $release;
public function __construct()
{
$this->release = "";
}
// SEARCH
public static function most_popular($response, $info)
{
$doc = new DOMDocument();
libxml_use_internal_errors(true); //disable libxml errors
if (!empty($response)) {
//if any html is actually returned
$doc->loadHTML($response);
libxml_clear_errors(); //remove errors for yucky html
$xpath = new DOMXPath($doc);
//get all the h2's with an id
$row = $xpath->query("//div[contains(@class, 'lister-item-image') and contains(@class, 'float-left')]/a/@href");
$nexts = $xpath->query("//a[contains(@class, 'lister-page-next') and contains(@class, 'next-page')]");
$names = $xpath->query('//img[@class="loadlate"]');
// NEXT URL - ONE TIME
$Count = 0;
$next_url = "";
foreach ($nexts as $next) {
$Count++;
if ($Count == 1) {
/*echo "Next URL: " . $next->getAttribute('href') . "<br/>";*/
$next_link = $next->getAttribute('href');
}
}
// RELEASE NAME
$rls_name = "";
foreach ($names as $name) {
$rls_name .= $name->getAttribute('alt');
}
// IMDB TT0000000 RLEASE
if ($row->length > 0) {
$link = "";
foreach ($row as $row) {
$tt_info .= @get_match('/tt\\d{7}/is', $doc->saveHtml($row), 0);
}
}
}
$array = array(
$next_link,
$rls_name,
$tt_info,
);
return ($array);
}
}
输出/返回:
$array = array(
$next_link,
$rls_name,
$tt_info,
);
return ($array);
致电:
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
function get_match($regex, $content, $pos = 1)
{
/* do your job */
preg_match($regex, $content, $matches);
/* return our result */
return $matches[intval($pos)];
}
require "RollingCurl.php";
require "imdb_class.php";
$imdb = new Imdb;
if (isset($_GET['action']) || isset($_POST['action'])) {
$action = (isset($_GET['action'])) ? $_GET['action'] : $_POST['action'];
} else {
$action = "";
}
echo " 2222<br /><br />";
if ($action == "most_popular") {
$popular = '&num_votes=1000,&production_status=released&groups=top_1000&sort=moviemeter,asc&count=40&start=1';
if (isset($_GET['date'])) {
$link = "https://www.imdb.com/search/title?title_type=feature,tv_movie&release_date=,".$_GET['date'].$popular;
} else {
$link = "https://www.imdb.com/search/title?title_type=feature,tv_movie&release_date=,2018".$popular;
}
$urls = array($link);
$rc = new RollingCurl([$imdb, 'most_popular']); //[$imdb, 'most_popular']
$rc->window_size = 20;
foreach ($urls as $url) {
$request = new RollingCurlRequest($url);
$rc->add($request);
}
$stream = $rc->execute();
}
如果我在课堂上将所有内容输出为“ echo”,则还将显示所有内容。但是,我想单独调用所有内容。
如果我现在尝试像这样输出它,它将不起作用。
$stream[0]
$stream[1]
$stream[3]
有人知道这可能如何工作吗? 预先非常感谢。
答案 0 :(得分:1)
RollingCurl
对回调的返回值不做任何事情,也不将其返回给调用者。有回调函数时,$rc->execute()
仅返回true
。如果要保存任何内容,则需要在回调函数本身中进行保存。
您应该使most_popular
为非静态函数,并为其提供一个属性$results
,该属性将在构造函数中初始化为[]
。然后它可以执行以下操作:
$this->results[] = $array;
完成之后
$rc->execute();
您可以这样做:
foreach ($imdb->results as $result) {
echo "Release name: $result[1]<br>TT Info: $result[2]<br>";
}
最好将从文档中提取的数据放入数组中,而不要使用串联的字符串,例如
$this->$rls_names = [];
foreach ($names as $name) {
$this->$rls_names[] = $name->getAttribute('alt');
}
$this->$tt_infos = [];
foreach ($rows as $row) {
$this->$tt_infos[] = @get_match('/tt\\d{7}/is', $doc->saveHtml($row), 0);
}
$this->next_link = $next[0]->getAttribute('href'); // no need for a loop to get the first element of an array