Question

I am trying to get all the links from this page: https://www.supremecourt.uk/cases/search-results.html?q=affidavit

I am using the following code:

// Collect libxml parse warnings internally instead of printing them
libxml_use_internal_errors(true);

$html = file_get_contents("https://www.supremecourt.uk/cases/search-results.html?q=affidavit");

$docs = new DOMDocument;
$docs->loadHTML($html);

$anchors = $docs->getElementsByTagName('a');

$links = array();

foreach ($anchors as $anchor) {
    // Store each href and print it on its own line
    $href = $anchor->getAttribute('href');
    $links[] = $href;
    echo $href;
    echo '<br>';
}

But the links it returns do not include the links from the search results. Why is that, and how can I fix it?

Answer 1

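The search results on this site are served by Google CSE through a JSONP request. file_get_contents only fetches the initial HTML, and the results are inserted by JavaScript afterwards, so DOMDocument never sees those anchors. The results are probably impossible to obtain from PHP, or from anything else that does not involve a headless browser able to execute all of the JS (PhantomJS, CasperJS, Selenium). I am not certain, because I have never tried to "break" CSE, but Google signs the requests, so the task is definitely not easy.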
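To illustrate the point, here is a minimal sketch that parses a small HTML string instead of the real page. The markup, the /static-link.html and /injected-by-js.html paths, and the "results" id are all made up for the example; they are not taken from the Supreme Court site. It only shows that DOMDocument reads the markup as delivered and does not run scripts.

```php
<?php
// Hypothetical page source: one static link, plus a script that would add
// a second link in a real browser. This is illustrative markup only.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<body>
  <a href="/static-link.html">Static link</a>
  <div id="results"></div>
  <script>
    var a = document.createElement('a');
    a.href = '/injected-by-js.html';
    document.getElementById('results').appendChild(a);
  </script>
</body>
</html>
HTML;

// Collect parser warnings internally instead of printing them
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// DOMDocument does not execute the <script>, so only anchors that are
// present in the raw markup are found here.
$links = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

print_r($links);
```

Only /static-link.html ends up in $links. The link the script would create never appears, which is the same situation as with the CSE results on the real page.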

