Unable to preserve the session when making multiple requests with cURL

Time: 2018-03-02 09:05:33

Tags: php curl captcha

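I am trying to use cURL to open an HTML page, extract the captcha image URL from it, and save the image as a PNG. I can do both, but the image displayed on screen and the image saved to the file are different. How can I fix this?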

//Get page contents first
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"https://www.gstsearch.in/track-provisional-id.html");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);

curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookiefile.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookiefile.txt");


$pageContent = curl_exec ($ch);
$errNo = curl_errno($ch); //CURL error code
curl_close ($ch);


if($errNo == 0) {
    $imgURL = getCaptcha($pageContent); //Get captcha image
    saveCaptcha($imgURL); //Save the captcha image as PNG
}
else {
    $errorMsg = curl_strerror($errNo);
    echo "CURL error ({$errNo}):\n {$errorMsg}";
}




function getCaptcha($html) {
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    $captchaImg = $dom->getElementById('captchacode');
    $imgSrc = $captchaImg->getAttribute('data-src');

    //URL of the current captcha image
    $imgURL = "https://www.gstsearch.in/{$imgSrc}";
    echo "<img src={$imgURL}>";

    return $imgURL;
}

function saveCaptcha($url) {
    $fp = fopen ("captcha.png", 'w+');

    $sc = curl_init();
    curl_setopt($sc, CURLOPT_URL, $url);
    curl_setopt($sc, CURLOPT_SSL_VERIFYPEER, FALSE);

    curl_setopt($sc, CURLOPT_COOKIEFILE, "cookiefile.txt");
    curl_setopt($sc, CURLOPT_COOKIEJAR, "cookiefile.txt");

    curl_setopt($sc, CURLOPT_FILE, $fp);
    curl_setopt($sc, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($sc, CURLOPT_USERAGENT, 'Mozilla/5.0');
    curl_exec($sc);
    curl_close($sc);
    fclose($fp);
}

Update: I have updated the code based on the suggestions, but the same thing still happens. What am I missing?

1 answer:

Answer 0 (score: 1)

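I agree with @jeroen: the remote site thinks there are two different users, one posting the information and the other retrieving the CAPTCHA :)

You can store (and reuse) the session_id with something like the following: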

//read cookies saved by earlier requests, so the `session_id` is passed along
curl_setopt($ch, CURLOPT_COOKIEFILE, $some_path . 'cookie.txt');
//write cookies back to the file when the handle is closed, so future requests can retain your session
curl_setopt($ch, CURLOPT_COOKIEJAR, $some_path . 'cookie.txt');

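You should set these for both requests. That way the site will treat you as one and the same user, rather than as two different users (which is what it thinks now).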
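As a rough illustration of "use the same cookies for both requests", here is a minimal sketch that sends the page request and the image request through one cURL handle. The helper names (`sessionCookiePath`, `newSessionHandle`, `fetchWithSession`, `downloadCaptcha`) are made up for this example, and keeping the cookie file in the system temp directory is just an assumption to get an absolute, writable path; adjust it to your setup.

```php
<?php
// Sketch only: all function names below are hypothetical helpers, not part of cURL.

// Absolute path to the cookie file, so every request resolves the same file
// regardless of the script's working directory (temp dir is an assumption).
function sessionCookiePath(string $name = 'cookie.txt'): string
{
    return sys_get_temp_dir() . DIRECTORY_SEPARATOR . $name;
}

// Create one cURL handle that loads and saves cookies through $cookiePath.
function newSessionHandle(string $cookiePath)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_USERAGENT      => 'Mozilla/5.0',
        CURLOPT_COOKIEFILE     => $cookiePath, // enable cookie handling, load saved cookies
        CURLOPT_COOKIEJAR      => $cookiePath, // save cookies when the handle is released
    ]);
    return $ch;
}

// Fetch a URL with an existing handle; cookies received earlier are sent again.
function fetchWithSession($ch, string $url): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Request the page, then the captcha image, inside the same session.
// $extractImageUrl is any callable that turns the page HTML into the image URL.
function downloadCaptcha(string $pageUrl, callable $extractImageUrl, string $outFile): void
{
    $ch = newSessionHandle(sessionCookiePath());
    $html = fetchWithSession($ch, $pageUrl);   // request 1: the site sets the session cookie
    $imgUrl = $extractImageUrl($html);
    $png = fetchWithSession($ch, $imgUrl);     // request 2: same handle, same cookie
    if (file_put_contents($outFile, $png) === false) {
        throw new RuntimeException("Could not write {$outFile}");
    }
    unset($ch); // releasing the handle lets libcurl write the cookie jar
}
```

With the `getCaptcha()` function from the question, a call could look like `downloadCaptcha('https://www.gstsearch.in/track-provisional-id.html', 'getCaptcha', 'captcha.png');`. Because both requests go through the same handle, the session cookie stays in memory between them; the cookie file only matters if you want to carry the session over to a later script run.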