I am using the following code to log in to a website:
$postData = array("login" => "Prijava", "loginEmail" => "****@****.***", "password" => "*****t", "signonForwardAction" => "/press/cm/si.press.viasat.tv?cc=si&lc=si");
$URL = "http://si.press.viasat.tv/press/cm/1.167?cc=si&lc=si";
$connection = curl_init();
curl_setopt($connection, CURLOPT_URL, $URL);
curl_setopt($connection, CURLOPT_POST, 1);
curl_setopt($connection, CURLOPT_POSTFIELDS, $postData);
curl_setopt($connection, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($connection, CURLOPT_REFERER, "http://si.press.viasat.tv");
curl_setopt($connection, CURLOPT_AUTOREFERER, true);
curl_setopt($connection, CURLOPT_HEADER, true);
curl_setopt($connection, CURLOPT_POSTREDIR, 2);
curl_setopt($connection,CURLOPT_COOKIEJAR, "C:\@DEV\TextALG\cookie.txt");
curl_setopt($connection,CURLOPT_COOKIEFILE, "C:\@DEV\TextALG\cookie.txt");
curl_exec($connection);
curl_close($connection);
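The problem is that (as seen in Firebug) the site responds to the login with a redirect to another URL (response: 302). As a result, I end up back at the login screen.
The cookie file I get looks like this: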
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.
si.press.viasat.tv FALSE /press FALSE 0 JSESSIONID 00FC3DCA4CFD806CDEBE2CAA7E999463
Any ideas?
Answer 0 (score: 2)
I tried a different approach (mimicking what a browser does), and it works now!
$ch = curl_init();
$randnum = rand(1,5000);
curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookiejar-$randnum");
curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookiejar-$randnum");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_URL,$URL);
$page = curl_exec($ch);
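// [^"]* stops at the first closing quote; a greedy .* could run on to a later attribute on the same line
preg_match("/action=\"([^\"]*)\"/", $page, $action);
preg_match("/signonForwardAction\" type=\"hidden\" value=\"([^\"]*)\"/", $page, $signonFA);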
$action = $action[1];
$signonFA = $signonFA[1];
$postData['signonForwardAction'] = $signonFA;
curl_setopt($ch, CURLOPT_URL,$URL.$action);
curl_setopt($ch, CURLOPT_REFERER, $URL);
curl_setopt($ch, CURLOPT_HTTPHEADER, Array("Content-Type: application/x-www-form-urlencoded"));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,http_build_query($postData));
$page = curl_exec($ch);
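The basic idea is to visit the site first so that the session cookie gets set, then POST the data to the form's target (obtained via $action), and then carry on scraping the site! The POST data must be a string, not an array: when CURLOPT_POSTFIELDS is given an array, cURL sends the body as multipart/form-data, whereas a string is sent as application/x-www-form-urlencoded.
Answer 1 (score: 0)
Try adding the following options: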
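The two preparation steps (reading the hidden form values, then building a string body) could be sketched without regular expressions as below. This is only a sketch: extractLoginForm() is a hypothetical helper, the HTML is made-up sample markup standing in for the real login page, and the /login action, the e-mail address and the password are placeholders. It assumes the DOM extension is enabled.

```php
<?php
// Hypothetical helper: returns the action URL and all named <input> values
// of the first <form> found in an HTML page.
function extractLoginForm(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings silently.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $form = $doc->getElementsByTagName('form')->item(0);
    if ($form === null) {
        return array('action' => null, 'fields' => array());
    }

    $fields = array();
    foreach ($form->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }

    return array('action' => $form->getAttribute('action'), 'fields' => $fields);
}

// Made-up sample markup (assumption), not the real page.
$sampleHtml = '<html><body><form action="/login" method="post">'
    . '<input name="signonForwardAction" type="hidden" value="/press/cm/si.press.viasat.tv?cc=si&amp;lc=si">'
    . '<input name="loginEmail" type="text" value="">'
    . '</form></body></html>';

$form = extractLoginForm($sampleHtml);

// Keep the hidden values from the page and add the credentials (placeholders).
$postData = array_merge($form['fields'], array(
    'loginEmail' => 'user@example.com',
    'password'   => 'secret',
));

// A string body is sent as application/x-www-form-urlencoded.
$body = http_build_query($postData);
echo $body, "\n";
```

The resulting $body is what would be passed to CURLOPT_POSTFIELDS in the second request.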
CURLOPT_FOLLOWLOCATION => TRUE, // follow redirects
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
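For context, these options could be applied in one call with curl_setopt_array(), and curl_getinfo() can report how many redirects were actually followed. The following is a rough sketch, not a tested fix for this site: fetchFollowingRedirects() is a hypothetical helper name, and the URL in the usage comment is the one from the question.

```php
<?php
// Hypothetical helper: fetch a URL, follow up to $maxRedirects redirects,
// and report where the request ended up.
function fetchFollowingRedirects(string $url, int $maxRedirects = 10): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_FOLLOWLOCATION => true,          // follow redirects
        CURLOPT_MAXREDIRS      => $maxRedirects, // stop after this many redirects
        CURLOPT_RETURNTRANSFER => true,          // return the body instead of printing it
    ));

    $body = curl_exec($ch);

    return array(
        'body'      => $body, // false if the request failed
        'redirects' => curl_getinfo($ch, CURLINFO_REDIRECT_COUNT),
        'finalUrl'  => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
    );
}

// Usage (performs a network request, so it is left commented out):
// $result = fetchFollowingRedirects("http://si.press.viasat.tv/press/cm/1.167?cc=si&lc=si");
// echo $result['redirects'], " redirect(s), ended at ", $result['finalUrl'], "\n";
```

Note that the code in the question already sets CURLOPT_FOLLOWLOCATION, so checking the redirect count and final URL is mainly useful for confirming where the login request lands.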