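The website I am logged in to shows my username in the page title, which indicates that I am logged in.
However, when I scrape that page with cURL and display the result on my machine, the page title reads "Login", which means the request is being treated as not logged in.
I think my scraping code is missing some cookie information that I need to take into account.
Is there a way I can read the cookies?
cURL code: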
function getString( $url ) {
    $ch = curl_init();
    curl_setopt( $ch, CURLOPT_URL, $url );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
    curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
    curl_setopt( $ch, CURLOPT_COOKIESESSION, true );
    curl_setopt( $ch, CURLOPT_COOKIEJAR, 'cookie.txt' );
    $response = curl_exec( $ch );
    curl_close( $ch );
    return $response;
}
Answer 0 (score: 1)
Your code does not work because of the cookie path: use the full path to the cookie file and make sure cookie.txt is writable. Try:
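var_dump(getString("http://google.com"));

function getString($url) {
    $ch = curl_init();

    // Use an absolute path so the file location does not depend on the working directory.
    $cookie = __DIR__ . '/cookie.txt';
    touch($cookie);
    if (!is_writable($cookie)) {
        die("Can't write to cookie file");
    }

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    // Marks a new cookie session: libcurl ignores session cookies loaded from the file.
    // Remove this line if a login session must survive between runs.
    curl_setopt($ch, CURLOPT_COOKIESESSION, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // COOKIEJAR is where cookies are written on close; COOKIEFILE is where they are read from.
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);

    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}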
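For the logged-in page described in the question, saving cookies is only half of it: the script also has to log in through cURL first, so that the server issues a session cookie to this client. Below is a minimal sketch of that flow, not a definitive implementation. The function name fetchAsLoggedInUser, the login URL, and the form field names are all assumptions; they must be replaced with whatever the target site's login form really uses, and a site that requires a CSRF token or JavaScript will need extra steps.

```php
<?php
// Sketch only: fetchAsLoggedInUser is a hypothetical helper, and the URLs
// and form field names passed to it are placeholders for the real site's values.
function fetchAsLoggedInUser(string $loginUrl, array $credentials, string $pageUrl): string
{
    $cookie = __DIR__ . '/cookie.txt';

    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookie, // cookies are written here when the handle is closed
        CURLOPT_COOKIEFILE     => $cookie, // enables the cookie engine and loads saved cookies
    ]);

    // Step 1: submit the login form so the server sets its session cookie.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($credentials));
    if (curl_exec($ch) === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("Login request failed: $error");
    }

    // Step 2: reuse the same handle, which still holds the cookies in memory.
    curl_setopt($ch, CURLOPT_URL, $pageUrl);
    curl_setopt($ch, CURLOPT_HTTPGET, true); // switch back from POST to GET
    $html = curl_exec($ch);
    if ($html === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("Page request failed: $error");
    }

    curl_close($ch);
    return $html;
}
```

A call would look something like fetchAsLoggedInUser('https://example.com/login', ['username' => '...', 'password' => '...'], 'https://example.com/profile'), where example.com and the field names stand in for the real ones. Reusing one handle for both requests keeps the session cookie available even before it is flushed to cookie.txt.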
Contents of cookie.txt after the google.com request:
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.
.google.com TRUE / FALSE 1411737249 PREF ID=ff7979720d6a1237:FF=0:TM=1348665249:LM=1348665249:S=bRYSIBSW9Cd7PKOr
#HttpOnly_.google.com TRUE / FALSE 1364476449 NID 64=tcm3RUM8R_1ch9eD6tuFi4lObBjSNdxqwMHbpchYCQoUpghIjZbiNw8AdAm0buTAVF0SqUsZsYEs7PAWhJdhutO11EQ9y8iXwuQ9dsPmdWlt86BAa7hxRqQcjSoX9Bep
.google.com.ng TRUE / FALSE 1411737252 PREF ID=9428863ec2e741f5:FF=0:TM=1348665252:LM=1348665252:S=s7wtyWMM9OnRYoE4
#HttpOnly_.google.com.ng TRUE / FALSE 1364476452 NID 64=Gyszb-4_10nzvSU6kGzBj5UQRTnB7purbAH0reBytKi_pn9m3R-0BXGBEjrkmMBmOYfFpfIQOYLaCgi5LfKOIcnPCrTpTpV9LVld-Xf9pq7U7W5QaZ63a_yHIG9Vmcir
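To answer the "can I read the cookies" part directly: the file shown above is plain text in the Netscape cookie format, with seven tab-separated fields per line (domain, subdomain flag, path, secure flag, expiry timestamp, name, value), and libcurl prefixes HttpOnly cookies with #HttpOnly_. A rough sketch of a parser follows. The function name parseCookieJar and the array keys it returns are my own choices, not part of cURL.

```php
<?php
// Hypothetical helper: turn the text of a Netscape-format cookie file
// (the format libcurl writes) into a list of associative arrays.
function parseCookieJar(string $contents): array
{
    $cookies = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        $httpOnly = false;
        // libcurl marks HttpOnly cookies with a "#HttpOnly_" prefix.
        if (strpos($line, '#HttpOnly_') === 0) {
            $line = substr($line, strlen('#HttpOnly_'));
            $httpOnly = true;
        }
        // Skip blank lines and ordinary comment lines.
        if (trim($line) === '' || $line[0] === '#') {
            continue;
        }
        $fields = explode("\t", $line);
        if (count($fields) < 6) {
            continue; // not a cookie line
        }
        $cookies[] = [
            'domain'            => $fields[0],
            'includeSubdomains' => $fields[1] === 'TRUE',
            'path'              => $fields[2],
            'secure'            => $fields[3] === 'TRUE',
            'expires'           => (int) $fields[4],
            'name'              => $fields[5],
            'value'             => $fields[6] ?? '',
            'httpOnly'          => $httpOnly,
        ];
    }
    return $cookies;
}
```

It could then be used as print_r(parseCookieJar(file_get_contents(__DIR__ . '/cookie.txt'))); after the handle has been closed, since libcurl only writes the jar at that point.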