I am using the Goutte scraper to scrape data, and I am getting an error like InvalidArgumentException - The current node list is empty.
Here is the code I am using:
$string = $crawler->filter('div#links.results')->html();
if (empty($string)) {
    return false;
}
$dom = new \DOMDocument;
$state = libxml_use_internal_errors(true);
$dom->loadHTML($string);
libxml_use_internal_errors($state);
$xp = new \DOMXPath($dom);
$divNodeList = $xp->query('//div[contains(@class, "results_links_deep")]
[contains(@class, "web-result")]
/div[contains(@class, "links_main")]
[contains(@class, "links_deep")]
[contains(@class, "result__body")]');
$results = [];
if (count($divNodeList) > 0) {
    foreach ($divNodeList as $divNode) {
        $results[] = [
            trim($xp->evaluate('string(./h2/a[@class="result__a"])', $divNode)),
            trim($xp->evaluate('string(.//a[@class="result__snippet"])', $divNode)),
            trim($xp->evaluate('string(.//a[@class="result__url"])', $divNode))
        ];
    }
}
I tried the following fix:
if ($crawler->filter('div#links.results')->count() > 0) {
    $string = $crawler->filter('div#links.results')->html();
}
Then it started throwing a different error: DOMDocument::loadHTML(): Empty string supplied as input
Any suggestions?
Answer 0 (score: 0)
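Your filter is not returning any results, so calling html() on the empty node list throws the exception. That is why it crashes. This is how I handled it, by wrapping the call in a try/catch: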
try {
    $string = $crawler->filter('div#links.results')->html();
} catch (\InvalidArgumentException $e) {
    // Handle the "current node list is empty" case, e.g. return early
    // so that an empty string is never passed to DOMDocument::loadHTML().
}
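Note that catching the exception alone does not fix the second error from the question: if the filter matches nothing, $string is never assigned, and DOMDocument::loadHTML() is still called with an empty value. Below is a minimal sketch of one way to guard both steps. The helper name extractResults() is hypothetical (it is not part of Goutte or of the original post), and the XPath is shortened to the result__body class on the assumption that this is enough to select the result blocks.

```php
<?php
// Hypothetical helper: parse the result blocks out of an HTML fragment.
// Returns an empty array instead of failing when there is nothing to parse.
function extractResults(?string $html): array
{
    // Guard: never hand an empty string to DOMDocument::loadHTML().
    if ($html === null || trim($html) === '') {
        return [];
    }

    $dom = new DOMDocument();
    $state = libxml_use_internal_errors(true); // silence warnings about sloppy markup
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($state);

    $xp = new DOMXPath($dom);
    $results = [];

    // Assumption: matching on "result__body" alone selects the result blocks.
    foreach ($xp->query('//div[contains(@class, "result__body")]') as $divNode) {
        $results[] = [
            trim($xp->evaluate('string(./h2/a[@class="result__a"])', $divNode)),
            trim($xp->evaluate('string(.//a[@class="result__snippet"])', $divNode)),
            trim($xp->evaluate('string(.//a[@class="result__url"])', $divNode)),
        ];
    }

    return $results;
}

// Usage with the crawler from the question (needs Goutte, so left as a comment):
// $nodes   = $crawler->filter('div#links.results');
// $results = extractResults($nodes->count() > 0 ? $nodes->html() : null);
```

Checking count() once and passing null when nothing matched keeps the empty case in a single place, so neither html() nor loadHTML() is ever called on empty input.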