I'm using Goutte (which uses Guzzle) to scrape content, but my script terminates with an error even though the request runs inside a try/catch:
Error: Client error: `GET http://example.com/C42C9CA3` resulted in a `403 Forbidden` response:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"htt (truncated...)
This is what I have:
use Goutte\Client;

$HTTPconfig = [
    "curl" => [
        CURLOPT_TIMEOUT => 60,
        CURLOPT_CONNECTTIMEOUT => 60,
        CURLOPT_SSL_VERIFYPEER => false,
    ],
    ['http_errors' => false]
];

$HTTPclient = new \Goutte\Client;
$HTTPclient->setClient(new \GuzzleHttp\Client($HTTPconfig));
$HTTPclient->setHeader('user-agent', 'Mozilla/5.0 (Windows NT 6.2; rv:20.0) Gecko/20121202 Firefox/20.0');

try {
    $crawler = $HTTPclient->request('GET', $url);
    $doc = $crawler->html();
} catch (Exception $e) {
    write($e->getMessage());
    continue;
}
Answer 0 (score: 3)
Try catching the fully qualified class:
} catch (\Exception $e) {
instead of:
} catch (Exception $e) {
Edit:
If you are using PHP 7, you can try catching Throwable instead, again with the leading backslash, like this:
} catch (\Throwable $e) {
Hope this helps.
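The backslash only matters if the script lives inside a namespace, which the question does not show, so treat that as an assumption. Inside a namespace, an unqualified `Exception` resolves to a class in the current namespace, and a catch clause naming a class that does not exist simply never matches. A minimal sketch of that behavior follows; the namespace `App\Scraper` and the functions `fetchPage()` and `catchDemo()` are hypothetical, and a plain `\RuntimeException` stands in for the exception Guzzle throws on a 403:

```php
<?php
// Hypothetical namespace, standing in for whatever namespace the script uses.
namespace App\Scraper;

// Hypothetical stand-in for the Goutte request; it always fails like a 403 would.
function fetchPage(): string
{
    throw new \RuntimeException('403 Forbidden');
}

function catchDemo(): string
{
    try {
        return fetchPage();
    } catch (Exception $e) {
        // Resolves to App\Scraper\Exception, which does not exist,
        // so this clause never matches.
        return 'unqualified: ' . $e->getMessage();
    } catch (\Exception $e) {
        // The global Exception class; RuntimeException extends it.
        return 'qualified: ' . $e->getMessage();
    }
}

echo catchDemo(), PHP_EOL; // prints "qualified: 403 Forbidden"
```

With only the unqualified clause present, the exception would propagate uncaught, which matches the symptom described in the question.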
Answer 1 (score: 0)
Remove the ['http_errors' => false] option. It defaults to true, which results in exceptions being thrown for 4xx/5xx response codes.
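A side note on where the option sits, assuming Guzzle 6 or later, where `http_errors` is a request option: in the question's config the `['http_errors' => false]` entry is a nested list element stored under numeric key 0, not a top-level key, so Guzzle most likely never sees it and keeps the default of true. The sketch below only compares the two array layouts and needs no Guzzle install. The helper `hasTopLevelOption()` and the variable names are hypothetical, and the `curl` sub-array is left empty for brevity:

```php
<?php
// Hypothetical helper: is the option a top-level key of the config array?
function hasTopLevelOption(array $config, string $name): bool
{
    return array_key_exists($name, $config);
}

// Layout from the question: the option is wrapped in its own array (key 0).
$nestedConfig = [
    'curl' => [],
    ['http_errors' => false],
];

// Assumed intent: 'http_errors' as a sibling of 'curl'.
$flatConfig = [
    'curl' => [],
    'http_errors' => false,
];

var_dump(hasTopLevelOption($nestedConfig, 'http_errors')); // bool(false)
var_dump(hasTopLevelOption($flatConfig, 'http_errors'));   // bool(true)
```

With the flat layout and `http_errors` set to false, a 403 should come back as an ordinary response rather than an exception, so the status code has to be checked explicitly.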