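PHP cURL: logging in to a website and moving on to a different page

Date: 2016-05-23 21:03:30

Tags: php curl

I'm having a bit of a problem. I have the following code, which uses cURL to log in to a website, posting all the required data: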

<?php

$url = "https://someurl/login.aspx";
$ckfile = tempnam("/tmp", "CURLCOOKIE");
$useragent = $_SERVER['HTTP_USER_AGENT'];

$username = "someuser@hotmail.co.uk";
$password = "somepassword";

$f = fopen('log.txt', 'w'); 

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);

$html = curl_exec($ch);

curl_close($ch);

preg_match('~<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="(.*?)" />~', $html, $viewstate);
preg_match('~<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="(.*?)" />~', $html, $eventValidation);

$viewstate = $viewstate[1];
$eventValidation = $eventValidation[1];

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile);
curl_setopt($ch, CURLOPT_HEADER, FALSE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_STDERR, $f);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);

$postfields = array();
$postfields['__EVENTTARGET'] = "";
$postfields['__EVENTARGUMENT'] = "";
$postfields['__VIEWSTATE'] = $viewstate;
$postfields['__EVENTVALIDATION'] = $eventValidation;
$postfields['btnLogin'] = "Login";
$postfields['txtPassword'] = $password;
$postfields['txtUserName'] = $username;

curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields);
$ret = curl_exec($ch);

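So the code above works fine. I haven't closed the cURL handle yet because I need to keep the cookies alive. Anyway, once I'm logged in using the above, I then do this: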

if($ret) {
    curl_setopt($ch, CURLOPT_URL, 'https://someurl.com/financial/quote.aspx?id=12345');
    curl_setopt($ch, CURLOPT_POST, 0);

    $data = curl_exec($ch);
    var_dump($data);
}

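I can see from the output that I am now on the correct page. However, in the example above I am not posting anything. The page I go to also has a button on it. Looking in Firebug at what this button posts, I see this: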

__EVENTARGUMENT 
__EVENTTARGET   
__EVENTVALIDATION   fsudifhsiudgfiusgdf
__VIEWSTATE 
__VIEWSTATE_GUID    0f26cc24-ef59-4bc7-87c0-141833df148b
ctl00$PageContent$btn2  Accepted

So I tried to replicate pressing this button by doing the following:

if($ret) {
    curl_setopt($ch, CURLOPT_URL, 'https://someurl.com/financial/quote.aspx?id=12345');
    $postfieldsInner = array();
    $postfieldsInner['__EVENTTARGET'] = "";
    $postfieldsInner['__EVENTARGUMENT'] = "";
    $postfieldsInner['__VIEWSTATE'] = "";
    $postfieldsInner['__VIEWSTATE_GUID'] = $viewstate;
    $postfieldsInner['__EVENTVALIDATION'] = $eventValidation;
    $postfieldsInner['ctl00$PageContent$btn2'] = "Accepted";

    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postfieldsInner);

    $content = curl_exec($ch);

    if (!$content) {
        echo 'An error has occurred: ' . curl_error($ch);
    } else {
        var_dump($content);
    }
}

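However, this time the button action does not seem to take place. Am I missing something when making the second request?

Thanks

1 Answer:

Answer 0: (score: 1)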

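So, a few things jump out at me:

  • You are assuming that the magic values of __VIEWSTATE and __EVENTVALIDATION do not change. That is unlikely to be the case. You should extract these values again after fetching the data page.

  • In the initial login you pass the $viewstate variable as the value of __VIEWSTATE, but in the subsequent post you leave it blank and pass $viewstate as __VIEWSTATE_GUID instead. Not sure if that is intentional.

  • You are using an array for CURLOPT_POSTFIELDS, which can cause problems. The documentation says:

      

    Passing an array to CURLOPT_POSTFIELDS will encode the data as multipart/form-data, while passing a URL-encoded string will encode the data as application/x-www-form-urlencoded.

  • Very importantly, do not disable certificate verification; fix your server setup instead.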
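
The first and third points can be sketched together as follows. This is only an illustration under stated assumptions: extract_hidden_fields() is a hypothetical helper (it is not part of cURL or PHP), it relies on PHP's DOM extension being installed, and the $html string is a made-up stand-in for the page that curl_exec() would return, with an invented __EVENTVALIDATION value. It collects whatever hidden inputs the freshly fetched page contains, adds the button field, and encodes everything as a URL-encoded string rather than an array.

```php
<?php

/**
 * Collect every <input type="hidden"> name/value pair from an HTML page.
 * Hypothetical helper for illustration; requires the DOM extension.
 */
function extract_hidden_fields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Stand-in for the HTML of the data page (assumed markup, invented values).
$html = '<html><body><form>'
    . '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="" />'
    . '<input type="hidden" name="__VIEWSTATE_GUID" id="__VIEWSTATE_GUID"'
    . ' value="0f26cc24-ef59-4bc7-87c0-141833df148b" />'
    . '<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION"'
    . ' value="abc/def+123=" />'
    . '<input type="submit" name="ctl00$PageContent$btn2" value="Accepted" />'
    . '</form></body></html>';

// Re-extract the hidden fields from the page that was just fetched.
$fields = extract_hidden_fields($html);

// Add the button that is being "pressed"; submit inputs are not hidden fields.
$fields['ctl00$PageContent$btn2'] = 'Accepted';

// A string body makes cURL send application/x-www-form-urlencoded.
$body = http_build_query($fields);
echo $body, "\n";
```

The resulting $body could then be passed to CURLOPT_POSTFIELDS. Taking every hidden input from the page, instead of matching two fixed patterns, also picks up fields such as __VIEWSTATE_GUID without having to guess which variable belongs in them.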

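Some other suggestions, perhaps more matters of style than substance:

  • Passing an empty string as CURLOPT_COOKIEFILE enables session handling without saving cookies to a file.

  • You don't need to call curl_close() and curl_init() several times in one script; just reuse the existing handle. That way you don't have to redefine the options, and the session cookies are reused.

  • Using curl_setopt_array() makes for cleaner code.

  • curl_exec() returns false on error, and you should check for that explicitly.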
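
A minimal sketch of these suggestions, assuming CURLOPT_RETURNTRANSFER is enabled so that curl_exec() returns the body as a string. Both curl_exec_strict() and fetch_pages() are hypothetical helper names introduced here for illustration. The strict comparison matters because an empty response body is an empty string, which a loose check such as !$content would mistake for a failure.

```php
<?php

/**
 * Run the transfer on an existing handle and turn a failure into an exception.
 * Hypothetical helper; assumes CURLOPT_RETURNTRANSFER is set on the handle.
 */
function curl_exec_strict($ch): string
{
    $result = curl_exec($ch);
    // Compare strictly: "" is a valid (empty) body, only false is an error.
    if ($result === false) {
        throw new RuntimeException(
            'cURL error ' . curl_errno($ch) . ': ' . curl_error($ch)
        );
    }
    return $result;
}

/**
 * Fetch several URLs in order on ONE handle, so options are set once and
 * session cookies carry over from each request to the next.
 */
function fetch_pages(array $urls): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_COOKIEFILE     => "",   // in-memory cookie handling, no file
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
    ]);

    $bodies = [];
    foreach ($urls as $url) {
        curl_setopt($ch, CURLOPT_URL, $url);
        $bodies[] = curl_exec_strict($ch);
    }
    // The handle is released when $ch goes out of scope here.
    return $bodies;
}
```

Calling fetch_pages() with the login URL followed by the quote URL, inside a try/catch for RuntimeException, would replace the nested if/else error checks with a single error path.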

Here is how I would clean up the code:

<?php

$url = "https://someurl/login.aspx";
$ckfile = tempnam("/tmp", "CURLCOOKIE");
$useragent = $_SERVER['HTTP_USER_AGENT'];

$username = "someuser@hotmail.co.uk";
$password = "somepassword";

$viewstate_pattern = '~<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="(.*?)" />~';
$eventval_pattern = '~<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="(.*?)" />~';

$ch = curl_init();
curl_setopt_array($ch, [
    CURLOPT_URL            => $url,
    CURLOPT_COOKIEFILE     => "",
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_USERAGENT      => $useragent,
]);

// Getting the login form
$html = curl_exec($ch);

if ($html !== false) {
    preg_match($viewstate_pattern, $html, $viewstate);
    preg_match($eventval_pattern, $html, $eventValidation);
    $viewstate = $viewstate[1];
    $eventValidation = $eventValidation[1];

    $postfields = http_build_query([
        "__EVENTTARGET"=>"",
        "__EVENTARGUMENT"=>"",
        "__VIEWSTATE"=>$viewstate,
        "__EVENTVALIDATION"=>$eventValidation,
        "btnLogin"=>"Login",
        "txtPassword"=>$password,
        "txtUserName"=>$username,
    ]);

    curl_setopt_array($ch, [
        CURLOPT_REFERER=>$url,
        CURLOPT_POST=>true,
        CURLOPT_POSTFIELDS=>$postfields,
    ]);
    // Submitting the login form
    $html = curl_exec($ch);

    if ($html !== false) {
        curl_setopt_array($ch, [
            CURLOPT_URL=>'https://someurl.com/financial/quote.aspx?id=12345',
            CURLOPT_POST=>false,
        ]);

        // Getting the data page
        $html = curl_exec($ch);
        if ($html !== false) {
            preg_match($viewstate_pattern, $html, $viewstate);
            preg_match($eventval_pattern, $html, $eventValidation);
            $viewstate = $viewstate[1];
            $eventValidation = $eventValidation[1];

            $postfieldsInner = http_build_query([
                "__EVENTTARGET"=>"",
                "__EVENTARGUMENT"=>"",
                // Should this be empty?
                "__VIEWSTATE"=>"",
                "__VIEWSTATE_GUID"=>$viewstate,
                "__EVENTVALIDATION"=>$eventValidation,
                'ctl00$PageContent$btn2'=>"Accepted",
            ]);

            curl_setopt_array($ch, [
                CURLOPT_POST=>true,
                CURLOPT_POSTFIELDS=>$postfieldsInner,
            ]);
            // Posting the data page
            $html = curl_exec($ch);

            if ($html === false) {
                echo 'An error has occurred: ' . curl_error($ch);
            } else {
                var_dump($html);
            }
        } else {
            // Error getting the data page
        }
    } else {
        // Error submitting the login page
    }
} else {
    // Error getting the login page
}
curl_close($ch);