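I'm scraping a website whose content is loaded by JavaScript. When my script runs, it only collects the data that is already present in the response; it doesn't wait for the whole page to finish loading.
I'm trying to collect the data shown in a certain div on trivago's site, which displays a loader while it fetches results, so that I can compute statistics in my backend. To do that, I run the following code: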
include_once('simple_html_dom.php');
function getHTML($url,$timeout){
    $ch = curl_init($url); // initialize curl with the given url
    curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]); // set useragent (the key is HTTP_USER_AGENT)
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // max. seconds to wait while connecting
    curl_setopt($ch, CURLOPT_FAILONERROR, true); // fail on HTTP status codes >= 400
    return curl_exec($ch); // response body as a string, or false on failure
}
$html = file_get_html("https://www.trivago.es/?aDateRange%5Barr%5D=13-04-2019&aDateRange%5Bdep%5D=14-04-2019&iPathId=82650&iGeoDistanceItem=0&aCategoryRange=0%2C1%2C2%2C3%2C4%2C5&aOverallLiking=1%2C2%2C3%2C4%2C5&sOrderBy=relevance%20desc&iRoomType=7&cpt=8265003&iViewType=0&bIsSeoPage=false&bIsSitemap=false&");
I then try to collect the data with the find() function:
foreach($html->find("div.item__flex-column") as $seccion) {
    echo "<tr>";
    echo "<td>";
    echo $seccion->find("h3",0)->plaintext;
    echo "</td>";
    echo "<td>";
    echo $seccion->find("p.details__paragraph",0)->plaintext;
    echo "</td>";
    echo "<td>";
    echo $seccion->find("strong.item__best-price",0)->plaintext;
    echo "</td>";
    echo "<td style='text-decoration:line-through;'>";
    echo $seccion->find($fmp,0)->plaintext;
    echo "</td>";
    echo "</tr>";
}
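The error I get:
Fatal error: Uncaught Error: Call to a member function find() on string
Is there a way to make the PHP program wait until the whole page has loaded?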
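For context, cURL only downloads the HTML the server sends and does not execute JavaScript, so pausing the PHP script would not make the missing content appear. One possible workaround, sketched here under assumptions, is to let a headless browser render the page and read back the resulting DOM. The helpers buildDumpDomCommand() and fetchRenderedHtml() are hypothetical, the binary name google-chrome is an assumption about the local install, and content that the page fetches long after the initial load may still be missing from the dump.

```php
<?php
// Hypothetical helper: builds a headless-Chrome command that prints the
// serialized DOM of the page. The binary name is an assumption and may be
// "chromium" or a full path on other systems.
function buildDumpDomCommand(string $url, string $chromeBinary = 'google-chrome'): string
{
    return escapeshellcmd($chromeBinary)
        . ' --headless --disable-gpu --dump-dom '
        . escapeshellarg($url);
}

// Hypothetical helper: runs the command and returns the rendered HTML,
// or null when the browser is missing or produced no output.
function fetchRenderedHtml(string $url, string $chromeBinary = 'google-chrome'): ?string
{
    $output = shell_exec(buildDumpDomCommand($url, $chromeBinary));
    return is_string($output) && $output !== '' ? $output : null;
}

// Only print the command here; calling fetchRenderedHtml() needs a browser.
echo buildDumpDomCommand('https://www.trivago.es/'), "\n";
```

If fetchRenderedHtml() returns a string, it could then be handed to an HTML parser in place of the raw cURL response.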
Answer 0 (score: 0)
Try
echo $seccion->find("h3")[0]->plaintext;
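The fatal error says find() was called on a string, so changing how the first match is indexed does not change what find() is being called on. Below is a minimal sketch of parsing a raw HTML string into a queryable object before searching it, and of guarding against missing nodes. It swaps in PHP's built-in DOMDocument and DOMXPath for simple_html_dom so that it runs without extra files; parseHtmlString() and firstText() are hypothetical helpers, and the sample markup is made up for illustration.

```php
<?php
// Hypothetical helper: turn a raw HTML string into an object that can be queried.
function parseHtmlString(string $raw): DOMXPath
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world markup is rarely perfectly valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($raw);
    libxml_clear_errors();
    return new DOMXPath($doc);
}

// Hypothetical helper: text of the first match, or '' when nothing matches,
// so a missing element does not trigger a fatal error.
function firstText(DOMXPath $xpath, string $query, ?DOMNode $context = null): string
{
    $nodes = $xpath->query($query, $context);
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }
    return trim($nodes->item(0)->textContent);
}

// Made-up sample markup standing in for the fetched page.
$raw = '<div class="item__flex-column"><h3>Hotel A</h3>'
     . '<strong class="item__best-price">80 EUR</strong></div>';

$xpath = parseHtmlString($raw);
foreach ($xpath->query('//div[contains(@class, "item__flex-column")]') as $section) {
    echo firstText($xpath, './/h3', $section), ' | ',
         firstText($xpath, './/strong[contains(@class, "item__best-price")]', $section), "\n";
}
```

With simple_html_dom itself, the analogous step would be to pass the string returned by getHTML() through str_get_html() and to check the result before calling find() on it.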