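I'm new to web scraping in PHP, and I'm using Goutte and Guzzle's HTTP client (\GuzzleHttp\Client) for it. Most sites respond with 200, but https://www.leboncoin.fr/ returns 403 Forbidden. I've tried many of the suggested solutions, but the response is still 403.
Here is my final code: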
<?php
require 'vendor/autoload.php';
$goutteclient = new \Goutte\Client();
$guzzleClient = new \GuzzleHttp\Client();
$resource = $guzzleClient->request('GET', 'https://www.leboncoin.fr/', [
    // 'referer' is a sub-option of 'allow_redirects', not a top-level request option
    'allow_redirects' => ['referer' => true],
    // Return 4xx/5xx responses instead of throwing, so the status code can be read below
    'http_errors' => false,
    'headers' => [
        'User-Agent' => 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36',
        'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding' => 'gzip, deflate, br',
    ],
]);
// Prints 403 (Forbidden)
echo $resource->getStatusCode();