My code:
<?php
$url='Search.jsp';
// disguises the curl using fake headers and a fake user agent.
function disguise_curl($url)
{
$curl = curl_init();
// Setup headers - I used the same headers from Firefox version 2.0.0.6
// below was split up because php.net said the line was too long. :/
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, 'https://lalpacweb.blackpool.gov.uk/protected/wca/publicRegisterVehicleSearch.jsp');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_COOKIESESSION, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_COOKIEJAR, "cookies.txt");
curl_setopt($curl, CURLOPT_COOKIEFILE, "cookies.txt");
curl_setopt($curl, CURLOPT_HEADER, 1);
curl_setopt( $curl, CURLOPT_POST, 1);
curl_setopt ($curl, CURLOPT_POSTFIELDS, 'search.licenceTypeID=34&search.licenceLinkFileID=2&search.vehicleRegNumber=5&publicRegisterVehicle=Search');
$html = curl_exec($curl); // execute the curl command
echo curl_getinfo($curl, CURLINFO_HTTP_CODE);
curl_close($curl); // close the connection
return $html; // and finally, return $html
}
// uses the function and displays the text off the website
$text = disguise_curl($url);
echo $text;
?>
It returns the page with the form filled in, but the form is not posted. The curl_getinfo response I get is:
200HTTP/1.1 200 OK
Pragma: no-cache
Cache-Control: no-cache,no-store,must-revalidate
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/html; charset=ISO-8859-1
Content-Language: en-GB
Content-Length: 5901
Date: Sun, 19 Feb 2012 12:24:08 GMT
Server: Apache
Any ideas?
Thanks for your help.
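Answer 0 (score: 3)
There are a few things you may want to try. First, I believe it works more reliably across operating systems if you give the cookie jar an absolute path: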
curl_setopt($curl, CURLOPT_COOKIEJAR, dirname(__FILE__) . "/cookies.txt");
curl_setopt($curl, CURLOPT_COOKIEFILE, dirname(__FILE__) . "/cookies.txt");
You can also have the script visit the home page first to obtain a session cookie:
disguise_curl("https://lalpacweb.blackpool.gov.uk");
Then you can post the form to https://lalpacweb.blackpool.gov.uk/protected/actions/PublicRegister.action
(make sure cookies.txt exists):
<?php
// disguises the curl using fake headers and a fake user agent.
function disguise_curl($url, $post = false)
{
$curl = curl_init();
// Setup headers - I used the same headers from Firefox version 2.0.0.6
// below was split up because php.net said the line was too long. :/
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, 'https://lalpacweb.blackpool.gov.uk/protected/wca/publicRegisterVehicleSearch.jsp');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_COOKIESESSION, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_COOKIEJAR, dirname(__FILE__) . "/cookies.txt");
curl_setopt($curl, CURLOPT_COOKIEFILE, dirname(__FILE__) . "/cookies.txt");
curl_setopt($curl, CURLOPT_HEADER, 1);
if ($post)
{
curl_setopt( $curl, CURLOPT_POST, 1);
curl_setopt ($curl, CURLOPT_POSTFIELDS, 'search.licenceTypeID=34&search.licenceLinkFileID=2&search.vehicleRegNumber=5&publicRegisterVehicle=Search');
}
$html = curl_exec($curl); // execute the curl command
//echo curl_getinfo($curl, CURLINFO_HTTP_CODE);
curl_close($curl); // close the connection
return $html; // and finally, return $html
}
// Visit the home-page first to get the session cookie
disguise_curl("https://lalpacweb.blackpool.gov.uk");
// uses the function and displays the text off the website
$url = 'https://lalpacweb.blackpool.gov.uk/protected/actions/PublicRegister.action';
$text = disguise_curl($url, true);
echo $text;
?>
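The "make sure cookies.txt exists" step can be checked in code before the first request. The following is only a minimal sketch: ensure_cookie_jar is a hypothetical helper name, and the demo writes to the system temp directory rather than next to the script, which is an assumption made so the snippet runs anywhere.

```php
<?php
// Hypothetical helper (not part of the answer's code): create the cookie jar
// if it is missing and fail early if PHP cannot write to it.
function ensure_cookie_jar($path)
{
    if (!file_exists($path) && @touch($path) === false) {
        throw new RuntimeException("Cannot create cookie jar: $path");
    }
    if (!is_writable($path)) {
        throw new RuntimeException("Cookie jar is not writable: $path");
    }
    // Return the absolute path so it can be passed to CURLOPT_COOKIEJAR
    // and CURLOPT_COOKIEFILE.
    return realpath($path);
}

// Assumption: a temp-dir location for the demo; the answer uses
// dirname(__FILE__) . "/cookies.txt" instead.
$jar = ensure_cookie_jar(sys_get_temp_dir() . '/cookies_demo.txt');
```

The returned path would then be used for both cookie options inside disguise_curl, so that the home-page request and the POST share one jar.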
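Answer 1 (score: 1)
When I open https://lalpacweb.blackpool.gov.uk/protected/wca/publicRegisterVehicleSearch.jsp in a browser, I am redirected to https://lalpacweb.blackpool.gov.uk/sessiontimeout.jsp with a "session timeout" error. You may need to make two requests: one to log in (and possibly obtain a session cookie) and one to actually perform the search. With CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE set, as in your code, curl should automatically send the cookies it received from earlier requests in the same session. Otherwise, set the cookie yourself with curl_setopt($curl, CURLOPT_COOKIE, 'CookieName=CookieValue');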
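If the cookie has to be set by hand, CURLOPT_COOKIE expects a single string of name=value pairs separated by "; ". A minimal sketch of building that string follows; build_cookie_header is a hypothetical helper, and the cookie names and values (JSESSIONID, lang) are made-up placeholders, not values taken from the site.

```php
<?php
// Hypothetical helper: join name => value pairs into the "a=1; b=2" form
// that CURLOPT_COOKIE expects. Values are passed through as-is.
function build_cookie_header(array $cookies)
{
    $pairs = array();
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('; ', $pairs);
}

// Placeholder cookies for illustration only.
$cookie = build_cookie_header(array(
    'JSESSIONID' => 'abc123',
    'lang'       => 'en-GB',
));

// With an open handle this would be applied as:
// curl_setopt($curl, CURLOPT_COOKIE, $cookie);
```

The real cookie name and value would have to be read from the Set-Cookie header of the first response.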
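Answer 2 (score: 0)
Build the POST body from an array so that each field name and value is URL-encoded individually (calling urlencode() on the whole query string would also escape the "&" and "=" separators):
$post = array(
    'search.licenceTypeID' => 34,
    'search.licenceLinkFileID' => 2,
    'search.vehicleRegNumber' => 5,
    'publicRegisterVehicle' => 'Search'
);
curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post));
or pass the array directly, in which case cURL sends the request as multipart/form-data:
curl_setopt($curl, CURLOPT_POSTFIELDS, $post);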
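To see what the encoding choices actually produce, here is a small sketch that runs without any network access. The field names come from the question; the variable names ($fields, $body, $encodedWhole) are illustrative only.

```php
<?php
// Form fields taken from the question's POST string.
$fields = array(
    'search.licenceTypeID'     => 34,
    'search.licenceLinkFileID' => 2,
    'search.vehicleRegNumber'  => 5,
    'publicRegisterVehicle'    => 'Search',
);

// Encodes each name and value separately and joins them with "&".
// The separator is passed explicitly so the result does not depend on
// the arg_separator.output ini setting.
$body = http_build_query($fields, '', '&');

// For comparison: urlencode() on an already assembled query string
// escapes the "=" and "&" separators too, so the server would see
// one long, meaningless field name.
$encodedWhole = urlencode('search.licenceTypeID=34&publicRegisterVehicle=Search');

echo $body, "\n";
echo $encodedWhole, "\n";
```

Passing $body as a string to CURLOPT_POSTFIELDS yields an application/x-www-form-urlencoded request, which is what a plain HTML form like this one would normally send.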