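Good evening. I'm trying to scrape a reserved (login-protected) JSP area with PHP. Using Chrome's "Advanced REST Client", I can log in and view the page content as follows:
1) GET request to www.website.com. The response headers give me the JSESSIONID that will be used for subsequent requests.
2) GET request to www.website.com/login.jsp
3) POST request to www.website.com/i_security_check. POST variables: j_username, j_password, submit, utente
4) GET request to www.website.com/reserved_area.jsp
If I try to run this "algorithm" with curl in PHP, I get the following error at the third step: HTTP Status 400 - Invalid direct reference to form login page
Here is the PHP code: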
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"https://www.website.com");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)');
$server_output = curl_exec ($ch);
curl_close ($ch);
$step1 = explode("JSESSIONID=", $server_output);
$step2 = explode("; ", $step1[1]);
$JSESSIONID = $step2[0];
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"https://www.website.com/login.jsp");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie: JSESSIONID=".$JSESSIONID));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)');
$server_output = curl_exec ($ch);
curl_close ($ch);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.website.com/i_security_check");
curl_setopt($ch, CURLOPT_POST, 1);
$data = array("j_password" => "PASSWORD", "j_username" => "USERNAME", "submit" => "Entra", "utente" =>"USERNAME");
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie: JSESSIONID=".$JSESSIONID, "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3", "Connection: keep-alive", "Host: www.website.com", "Referer: https://www.website.com/login.jsp", "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0"));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$server_output = curl_exec ($ch);
curl_close ($ch);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "www.website.com/reserved_area.jsp");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie: JSESSIONID=".$JSESSIONID, "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3", "Connection: keep-alive", "Host: www.website.com", "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0"));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$server_output = curl_exec ($ch);
curl_close ($ch);
?>
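Any ideas? Thanks.
Answer 0 (score: 0)
You should not add JSESSIONID by hand. Instead, let cURL handle it with CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE. You only need to remove every line that sets the cookie manually (CURLOPT_HTTPHEADER) and use the same file in all requests: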
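// cURL expects a file path here, so create a temporary file and keep its name
$cookie_jar = tempnam(sys_get_temp_dir(), 'cookie');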
// first request, where you **receive** cookie
....
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_jar);
// all calls where you need to already have cookie:
...
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_jar);
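Note that the two options do different things: CURLOPT_COOKIEJAR names the file where received cookies are stored, and CURLOPT_COOKIEFILE names the file from which cookies are read and sent. You can safely set both on every request: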
// each request
....
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_jar);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_jar);
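The snippets above could be combined into something like the following. This is only a minimal sketch, not a tested fix for the site in question: the helper names `new_cookie_jar` and `fetch` are made up for illustration, and the URLs and form fields are simply copied from the question. It also assumes the server expects a URL-encoded form, so the POST body is built with `http_build_query()`; when an array is passed directly to CURLOPT_POSTFIELDS, PHP sends it as multipart/form-data instead.

```php
<?php
// Hypothetical helper: create an empty temporary file and return its path.
function new_cookie_jar(): string
{
    $path = tempnam(sys_get_temp_dir(), 'cookie');
    if ($path === false) {
        throw new RuntimeException('Could not create a cookie jar file');
    }
    return $path;
}

// Hypothetical helper: one request that reads from and writes to the shared jar.
// Passing $postFields turns the request into a URL-encoded form POST.
function fetch(string $url, string $cookieJar, ?array $postFields = null): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJar);   // store received cookies
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieJar);  // send stored cookies
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($ch);
    // Releasing the handle is what makes cURL write the cookies to the jar.
    unset($ch);
    return $body === false ? '' : $body;
}

// Usage, following the four steps from the question (not executed here):
// $jar = new_cookie_jar();
// fetch('https://www.website.com/', $jar);
// fetch('https://www.website.com/login.jsp', $jar);
// fetch('https://www.website.com/i_security_check', $jar, [
//     'j_username' => 'USERNAME', 'j_password' => 'PASSWORD',
//     'submit' => 'Entra', 'utente' => 'USERNAME',
// ]);
// echo fetch('https://www.website.com/reserved_area.jsp', $jar);
```

Because each call releases its handle before the next one starts, the jar file is up to date whenever the following request reads it.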
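Documentation: http://php.net/manual/en/function.curl-setopt.php
PS: HTTP code 400 means Bad Request, whereas 401 (Unauthorized) would be the expected code here. However, the server side may simply be built to work this way.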
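To see which step the server actually rejects, it may help to record the status code after every request. The sketch below assumes two hypothetical helpers, `describe_status` and `exec_with_status`, which are not part of the original answer; `curl_getinfo()` with `CURLINFO_HTTP_CODE` is the standard way to read the last response code from a handle.

```php
<?php
// Hypothetical helper: label the status codes discussed in this thread.
function describe_status(int $code): string
{
    switch ($code) {
        case 200: return 'OK';
        case 302: return 'Found (redirect)';
        case 400: return 'Bad Request';
        case 401: return 'Unauthorized';
        default:  return 'Other';
    }
}

// Hypothetical helper: execute an already configured cURL handle and
// return the status code, its label, and the response body.
function exec_with_status($ch): array
{
    $body = curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return ['code' => $code, 'label' => describe_status($code), 'body' => $body];
}
```

Calling `exec_with_status($ch)` in place of `curl_exec($ch)` at each step shows whether the 400 comes from the login POST itself or from an earlier request.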