使用Curl PHP模仿ajax调用

时间:2013-10-06 18:24:39

标签: php ajax curl

我使用curl(通过PHP)抓取网站,我想要的一些信息是产品列表,默认情况下只显示前几个产品。其余部分在用户单击按钮以获取完整产品列表时传递给用户,这会触发ajax调用以返回该列表。

简而言之,他们使用的是JS:

headers['__RequestVerificationToken'] = token;
$.ajax({
type: "post",
url: "/ajax/getProductList",
dataType: 'html',
data: JSON.stringify({ historyPageIndex: 1, displayPeriod: 0, productsType: All }),
contentType: 'application/json; charset=utf-8',
success: function (result) {
    $(target).html("");
    $(target).html(result);
},
beforeSend: function (XMLHttpRequest) {
    if (headers['__RequestVerificationToken']) {
        XMLHttpRequest.setRequestHeader("__RequestVerificationToken", headers['__RequestVerificationToken']);
    }
}
});

这是我的PHP脚本:

curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieLocation);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieLocation);
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/Applications/ViewProducts');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/');
$webpage = curl_exec($ch);
$productsType = trim(find_by_pattren($webpage, '<input id="productsType" name="productsType" type="hidden" value="(.*?)"'));
$token = trim(find_by_pattren($webpage, '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'));

$postVariables = 'productsType='.$productsType.
'&historyPageIndex=1
&displayPeriod=0
&__RequestVerificationToken='.$token;
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postVariables);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/ajax/getProductList');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/Applications/ViewProducts');
$webpage = curl_exec($ch);

这会生成一个包含该站点的错误页面。我认为主要原因可能是:

  • 他们检查是否是ajax请求(不知道如何解决)

  • 令牌需要位于标题中,而不是在帖子变量中

有什么想法吗?

编辑:这是工作代码:

curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieLocation);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieLocation);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/Applications/ViewProducts');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/');
$webpage = curl_exec($ch);
$productsType = trim(find_by_pattren($webpage, '<input id="productsType" name="productsType" type="hidden" value="(.*?)"'));
$token = trim(find_by_pattren($webpage, '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'));

$postVariables = json_encode(array('productsType' => $productsType,
'historyPageIndex' => 1,
'displayPeriod' => 0));
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Content-Type: application/json; charset=utf-8", "__RequestVerificationToken: $token"));
curl_setopt($ch, CURLOPT_POSTFIELDS, $postVariables);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/ajax/getProductList');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/Applications/ViewProducts');
$webpage = curl_exec($ch);

1 个答案:

答案 0 :(得分:12)

要将请求验证令牌设置为标头,更接近地模仿AJAX请求,并将content-type设置为JSON,请使用CURLOPT_HEADER。

curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Content-Type: application/json; charset=utf-8", "__RequestVerificationToken: $token"));

我还注意到您在代码的第7行上多次将CURLOPT_POST设置为false,并且您发送的帖子数据不是JSON格式。你应该:

$postVariables = '{"historyPageIndex":1,"displayPeriod":0,"productsType":"All"}';