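OK, I have some websites that I need to parse... First, I press F12 in Firefox to open the developer tools, switch to the Network tab, load the target site, and look at the first root GET request, for example:
Domain => website.com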
File => /
I take all of the request headers and write them by hand into a PHP array, and then in my code I call
curl_setopt($curl, CURLOPT_HTTPHEADER, $headerArray);
along with the other options, and then call
curl_exec($curl);
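When I check the Network tab in Firefox, the request headers look like the browser defaults, and the specific headers I wrote into the array by hand are not sent. There is a similar problem with CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR: cookies are written to the cookie file on the server, but the next request actually carries a different cookie, not the one previously saved in the cookie file.
Actual request headers in the browser inspector: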
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3
Cache-Control: max-age=0
Connection: keep-alive
Cookie: _ga=GA1.1.1951751996.1563984714; _gid=GA1.1.1564173251.1563984714; _userGUID=0:jyhg490v:AIQdD2Qpm9rmbla1U93mK2a45CFRe49c; jv_enter_ts_2VumZAPpbr=1563984717382; jv_visits_count_2VumZAPpbr=1; .....
Host: localhost
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0
PHP code:
<?php
$headers = ['Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3',
'Cache-Control: max-age=0',
'Connection: keep-alive',
'Cookie: visid_incap_1987259....',
'Host: website.com',
'TE: Trailers',
'Upgrade-Insecure-Requests: 1',
'User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'];
$curl = curl_init("https://www.website.com/");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
curl_setopt($curl, CURLOPT_COOKIEFILE, dirname(__FILE__)."/cookies.txt");
curl_setopt($curl, CURLOPT_COOKIEJAR, dirname(__FILE__)."/cookies.txt");
echo curl_exec($curl);
?>
Answer 0 (score: 0)
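You will not be able to see the headers sent by cURL in the browser developer tools. The cURL request is executed on the server side, so the Network tab only shows the request your browser makes to your own PHP script (note the Host: localhost above). The headers are sent successfully. You can check it like this: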
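// Enable request-header tracking before executing the request.
curl_setopt($curl, CURLINFO_HEADER_OUT, true);
$response = curl_exec($curl);
// Read the raw request headers after the request has completed.
$sentHeaders = curl_getinfo($curl, CURLINFO_HEADER_OUT);
print_r($sentHeaders);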
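To compare what was actually sent against the array from the question, it can help to turn that raw string into a map. The following is only a minimal sketch: `parseSentHeaders` and `fetchWithSentHeaders` are hypothetical helper names that do not appear in the original post, and the wrapper assumes a plain GET with no other options.

```php
<?php
// Hypothetical helper (not from the original post): split the raw request
// text returned by CURLINFO_HEADER_OUT into a request line and a header map.
function parseSentHeaders(string $raw): array
{
    $lines = preg_split('/\r\n|\n/', trim($raw));
    $requestLine = array_shift($lines);
    $headers = [];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not "Name: value"
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[trim($name)] = trim($value);
    }
    return ['request_line' => $requestLine, 'headers' => $headers];
}

// Hypothetical wrapper: perform a GET and report what cURL actually sent.
function fetchWithSentHeaders(string $url, array $headers): array
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
    // Must be enabled before curl_exec(), otherwise nothing is recorded.
    curl_setopt($curl, CURLINFO_HEADER_OUT, true);
    $body = curl_exec($curl);
    $raw = curl_getinfo($curl, CURLINFO_HEADER_OUT);
    return [
        'body' => $body,
        'sent' => parseSentHeaders(is_string($raw) ? $raw : ''),
    ];
}
```

Calling `fetchWithSentHeaders("https://www.website.com/", $headers)` and dumping the `sent` entry with `print_r` should list every header cURL put on the wire, including the `Cookie` line, which is probably the quickest way to see whether the manual `Cookie` header or the cookie file is the one being used.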