How can I wait for scripts to finish loading a page when using PHP Simple HTML DOM Parser?

Time: 2019-04-13 17:15:24

Tags: php html

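I am scraping a website that loads its content with JavaScript. When my script runs, it only collects the data that is already present; it does not wait for the whole page to finish loading.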

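I am trying to collect the data shown in a particular div on trivago's website so that I can compute statistics in my backend. The site displays a loader while the content loads. To do this, I run the following code: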

include_once('simple_html_dom.php');

function getHTML($url,$timeout){
  $ch = curl_init($url); // initialize curl with given url
  curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]); // set the user agent
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // write the response to a variable
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // max. seconds to wait while connecting
  curl_setopt($ch, CURLOPT_FAILONERROR, 1); // stop when it encounters an error
  return @curl_exec($ch);
}

$html = file_get_html("https://www.trivago.es/?aDateRange%5Barr%5D=13-04-2019&aDateRange%5Bdep%5D=14-04-2019&iPathId=82650&iGeoDistanceItem=0&aCategoryRange=0%2C1%2C2%2C3%2C4%2C5&aOverallLiking=1%2C2%2C3%2C4%2C5&sOrderBy=relevance%20desc&iRoomType=7&cpt=8265003&iViewType=0&bIsSeoPage=false&bIsSitemap=false&");

I then try to collect the data with the find() function:

foreach($html->find("div.item__flex-column") as $seccion) {
  echo "<tr>";
    echo "<td>";
      echo $seccion->find("h3",0)->plaintext;
    echo "</td>";
    echo "<td>";
      echo $seccion->find("p.details__paragraph",0)->plaintext;
    echo "</td>";
    echo "<td>";
      echo $seccion->find("strong.item__best-price",0)->plaintext; 
    echo "</td>";
    echo "<td style='text-decoration:line-through;'>";
      echo $seccion->find($fmp,0)->plaintext; 
    echo "</td>";
  echo "</tr>";
}

The error I get:

Fatal error: Uncaught Error: Call to a member function find() on string
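
The message says that find() was invoked on a plain string rather than on a parsed DOM object. That can happen, for example, when the raw string returned by a cURL helper such as getHTML() is used where the parser object is expected. The following is a minimal sketch of a guard that fails early with a clearer message. requireDom is a hypothetical helper name, not part of Simple HTML DOM, and the demonstration uses a literal string so that it runs without the library.

```php
<?php
/**
 * Hypothetical guard (not part of Simple HTML DOM): returns $html unchanged
 * when it is an object exposing find(), otherwise throws.
 */
function requireDom($html)
{
    if (!is_object($html) || !method_exists($html, 'find')) {
        throw new RuntimeException(
            'Expected a parsed DOM object, got ' . gettype($html)
        );
    }
    return $html;
}

// Demonstration with a raw string, which is what curl_exec() returns
// when CURLOPT_RETURNTRANSFER is enabled.
try {
    requireDom('<html><body></body></html>');
    $message = 'no error';
} catch (RuntimeException $e) {
    $message = $e->getMessage();
}
echo $message, "\n"; // prints "Expected a parsed DOM object, got string"
```

With the library loaded, the guard could wrap the parser call, for instance requireDom(str_get_html(getHTML($url, 10))), assuming the 10-second timeout suits the target site.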

Is there a way to make the PHP program wait until the whole page has loaded?
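
Some context for this question: cURL and Simple HTML DOM only fetch and parse the markup the server sends. Neither executes JavaScript, so waiting longer in PHP does not make script-rendered content appear. The sketch below illustrates this with PHP's built-in DOMDocument, swapped in for Simple HTML DOM so that it runs without extra files. The markup and the results id are made up; only the item__flex-column class comes from the question.

```php
<?php
// Hypothetical server response: an empty container plus a script that
// would fill it in a browser. A static parser never runs the script.
$serverHtml = <<<'HTML'
<!DOCTYPE html>
<html>
<body>
  <div id="results"></div>
  <script>
    var item = document.createElement("div");
    item.className = "item__flex-column";
    item.textContent = "Hotel Example";
    document.getElementById("results").appendChild(item);
  </script>
</body>
</html>
HTML;

// Parse the markup exactly as received; suppress libxml warnings.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($serverHtml);
libxml_clear_errors();

// Look for the elements the script would have created in a browser.
$xpath = new DOMXPath($doc);
$items = $xpath->query('//div[contains(@class, "item__flex-column")]');
$matchCount = $items->length;

echo "Items visible to a static parser: {$matchCount}\n"; // prints 0
```

Content that only appears after scripts run would have to come from something that executes them, such as a headless browser, or from whatever data endpoint the page itself requests, where the site's terms permit that.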

1 Answer:

Answer 0: (score: 0)

Try

echo $seccion->find("h3")[0]->plaintext;
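
Both this form and find("h3", 0) assume that the selector matched at least one element. In Simple HTML DOM, find() with an index returns null when nothing matches, and indexing the array form at [0] then also yields null, so reading ->plaintext fails either way. A minimal sketch of a null-safe accessor follows. plaintextOrDefault is a hypothetical helper name, and the stand-in objects are only there so the example runs without the library.

```php
<?php
/**
 * Hypothetical helper (not part of Simple HTML DOM): returns the trimmed
 * plaintext of a node, or $default when the lookup produced no node.
 */
function plaintextOrDefault($node, string $default = ''): string
{
    return is_object($node) ? trim((string) $node->plaintext) : $default;
}

// Stand-in object so the sketch runs without the library; real nodes
// returned by find() expose ->plaintext in the same way.
$found = new stdClass();
$found->plaintext = '  Hotel Example  ';

echo plaintextOrDefault($found), "\n";      // prints "Hotel Example"
echo plaintextOrDefault(null, 'n/a'), "\n"; // prints "n/a"
```

Inside the loop from the question this would read, for example, echo plaintextOrDefault($seccion->find("h3", 0), 'n/a');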