Using cURL to retrieve a website's HTML page, including the current session and cookie data, from a secured page

Date: 2016-02-14 11:11:36

Tags: php session curl cookies file-get-contents

Question fully revised on: February 19

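What I want (in short):

I want to use cURL to fetch an HTML page that is protected by a user login (at the time of the cURL request the user is logged in and has permission to view the page).

In more detail:

The situation is that the user is on a page such as index.php?w=2344&y=lalala&x=something, which is protected (by the security script class.Firewizz.Security.php). That page has a "Print as PDF" button, which sends the user to getPDF.php. That page looks at where the request came from, fetches the referring page with cURL, and sends the output to the browser as a PDF.

For now, though, I have hard-coded the page variable in getPDF.php, so it does not check the referrer, and I am 100% sure that the page it tries to fetch is the correct one.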
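Once the hard-coded URL is replaced by the referrer again, the referrer should probably be checked before it is handed to cURL. The following is only a minimal sketch of such a check, assuming the site is served from www.website.nl; the function name trustedReferer and its parameters are hypothetical and not part of the original scripts.

```php
<?php
/**
 * Return the referring URL only if it is an http(s) URL on the expected
 * host, otherwise null. Hypothetical helper for illustration.
 */
function trustedReferer(?string $referer, string $allowedHost): ?string
{
    if ($referer === null || $referer === '') {
        return null;
    }
    $parts = parse_url($referer);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }
    // Only plain web URLs are acceptable targets for the cURL request.
    if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return null;
    }
    // Host names are case-insensitive, so compare them that way.
    if (strcasecmp($parts['host'], $allowedHost) !== 0) {
        return null;
    }
    return $referer;
}

// 'www.website.nl' is an assumed host name taken from the example URL.
$pageToFetch = trustedReferer($_SERVER['HTTP_REFERER'] ?? null, 'www.website.nl');
```

If $pageToFetch is null, getPDF.php would refuse the request instead of fetching anything.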

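Also, the output is simply echoed as-is and is not yet converted to PDF, so that the conversion cannot interfere with the problem.

The expected output is therefore the same as what the user sees when visiting the page directly. That is not what happens, though: the user gets nothing.

What do we know? We know that the $_SESSION data is not being sent with the cURL request. I know this because I echoed the $_SESSION data in the fetched page's output, and it turned out to be empty.

After a lot of attempts we still have no solution and still no $_SESSION data.

I do not want to weaken the security script in any way, so a solution such as "remove ini_set('session.use_only_cookies', 1);" is not what I am looking for.

On request (for those dedicated to helping) I can send the complete script files, but I will post the relevant snippets below.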

class.Firewizz.Security.php

<?php

/*
 * Firewizz UserLogin
 */

namespace Firewizz;



class Security
{

     // Start the session, with Cookie data
    public function Start_Secure_Session()
    {
        // Forces sessions to only use cookies.
        ini_set('session.use_only_cookies', 1);

        // Gets current cookies params
        $cookieParams = session_get_cookie_params();

        // Set Cookie Params
        session_set_cookie_params($cookieParams["lifetime"], $cookieParams["path"], $cookieParams["domain"], $this->isHTTPS, $this->deny_java_session_id);
        // Sets the session name
        session_name($this->session_name);

        // Start the php session
        session_start();

        // If new session or expired, generate new id
        if (!isset($_SESSION['new_session']))
        {
            $_SESSION['new_session'] = "true";

            // regenerate the session, delete the old one.
            session_regenerate_id(true);
        }
    }

    // Check if user is logged in to current session, return true or false;
    public function LOGGED_IN()
    {
        return $this->_login_check();
    }

    public function LOGOUT()
    {
    // Unset all session values
        $_SESSION = array();

        // get session parameters
        $params = session_get_cookie_params();

        // Delete the actual cookie.
        setcookie(session_name(), '', time() - 42000, $params["path"], $params["domain"], $params["secure"], $params["httponly"]);
        // Destroy session
        session_destroy();
        if (!headers_sent())
        {
            header("Location: " . $this->login_string, true);
        }
        else
        {
            echo '<script>window.location="/"</script>';
        }
    }

    // Must pass variables or send to login page!
    public function BORDER_PATROL($user_has_to_be_logged_in, $page_loaded_from_index)
    {
        $pass_border_partrol = true;

        if (!$this->LOGGED_IN() && $user_has_to_be_logged_in)
        {
            $pass_border_partrol = false;
        }
        if (filter_input(INPUT_SERVER, "PHP_SELF") != "/index.php" && $page_loaded_from_index)
        {
            $pass_border_partrol = false;
        }

        // Kick to login on fail
        if (!$pass_border_partrol)
        {
            $this->LOGOUT();
            exit();
        }

    }

    // Catch login, returns fail string or false if no errors
    public function CATCH_LOGIN()
    {
        if (filter_input(INPUT_POST, "id") == "login" && filter_input(INPUT_POST, "Verzenden") == "Verzenden")
        {
            // Variables from form.
            $email = filter_input(INPUT_POST, "email");
            $sha512Pass = filter_input(INPUT_POST, "p");

            // Database variables
            $db_accounts = mysqli_connect($this->mySQL_accounts_host, $this->mySQL_accounts_username, $this->mySQL_accounts_password, $this->mySQL_accounts_database);

            // Prepare SQL
            if ($stmt = $db_accounts->prepare("SELECT account_id, verified, blocked ,login_email, login_password, login_salt, user_voornaam, user_tussenvoegsel, user_achternaam FROM accounts WHERE login_email = ? LIMIT 1"))
            {
                $stmt->bind_param('s', $email); // Bind "$email" to parameter.
                $stmt->execute(); // Execute the prepared query.
                $stmt->store_result();

                $stmt->bind_result($user_id, $verified, $blocked, $email, $db_password, $salt, $voornaam, $tussenvoegsel, $achternaam); // get variables from result.

                $stmt->fetch();
                $password = hash('sha512', $sha512Pass . $salt); // hash the password with the unique salt.
                $tussen = ' ';
                if ($tussenvoegsel != "")
                {
                    $tussen = " " . $tussenvoegsel . " ";
                }
                $username = $voornaam . $tussen . $achternaam;



                if ($stmt->num_rows == 1)
                { // If the user exists
                    // Check blocked
                    if ($blocked == "1")
                    {
                        return 'Deze acount is geblokkeerd, neem contact met ons op.';
                    }

                    // We check if the account is locked from too many login attempts
                    if ($this->_checkBrute($user_id, $db_accounts) == true)
                    {
                        // Account is locked
                        // Send an email to user saying their account is locked
                        return "Te vaak fout ingelogd,<br />uw account is voor " . $this->blockout_time . " minuten geblokkerd.";
                    }
                    else
                    {
                        if ($db_password == $password && $verified == 1)
                        {
                            // Password is correct!, update lastLogin
                            if ($stmt = $db_accounts->prepare("UPDATE accounts SET date_lastLogin=? WHERE account_id=?"))
                            {
                                $lastlogin = date("Y-m-d H:i:s");

                                $stmt->bind_param('ss', $lastlogin, $user_id); // Bind the last-login time and the account id.
                                $stmt->execute();
                                $stmt->close();
                            }

                            $ip_address = $_SERVER['REMOTE_ADDR']; // Get the IP address of the user.
                            $user_browser = $_SERVER['HTTP_USER_AGENT']; // Get the user-agent string of the user.

                            $user_id = preg_replace("/[^0-9]+/", "", $user_id); // XSS protection as we might print this value
                            $_SESSION['user_id'] = $user_id;
                            $username = $username; // XSS protection as we might print this value
                            $_SESSION['username'] = $username;
                            $_SESSION['login_string'] = hash('sha512', $password . $ip_address . $user_browser);
                            // Login successful.

                            if ($this->MailOnLogin != FALSE)
                            {
                                mail($this->MailOnLogin, 'SECUREPLAY - LOGIN', $username . ' logged in to the secureplay platform..');
                            }
                            return false;
                        }
                        else
                        {
                            // Password is not correct
                            // We record this attempt in the database
                            $now = time();
                            $db_accounts->query("INSERT INTO login_attempts (userID, timestamp) VALUES (" . $user_id . ", " . $now . ")");

                            return "Onbekende gebruikersnaam en/of wachtwoord.";
                        }
                    }
                }
                else
                {
                    return "Onbekende gebruikersnaam en/of wachtwoord.";
                }
            }
            else
            {
                return 'SQL FAIL! ' . mysqli_error($db_accounts);
            }
            return "Onbekende fout!";
        }


        return false;
    }

    private function _checkBrute($user_id, $db_accounts)
    {
        // Get timestamp of current time
        $now = time();
        // Only login attempts within the last $this->blockout_time minutes are counted.
        $valid_attempts = $now - ($this->blockout_time * 60);

        if ($stmt = $db_accounts->prepare("SELECT timestamp FROM login_attempts WHERE userID = ? AND timestamp > $valid_attempts"))
        {
            $stmt->bind_param('i', $user_id);
            // Execute the prepared query.
            $stmt->execute();
            $stmt->store_result();
            // If there have been more failed logins than the configured maximum
            if ($stmt->num_rows > $this->max_login_fails)
            {
                return true;
            }
            else
            {
                return false;
            }
        }
        else
        {
            return true;
        }
    }

    // Login Check if user is logged in correctly
    private function _login_check()
    {
        // Database variables
        $db_accounts = mysqli_connect($this->mySQL_accounts_host, $this->mySQL_accounts_username, $this->mySQL_accounts_password, $this->mySQL_accounts_database);

        // Check if all session variables are set
        if (isset($_SESSION['user_id'], $_SESSION['username'], $_SESSION['login_string']))
        {
            $user_id = $_SESSION['user_id'];
            $login_string = $_SESSION['login_string'];
            $username = $_SESSION['username'];
            $ip_address = $_SERVER['REMOTE_ADDR']; // Get the IP address of the user.
            $user_browser = $_SERVER['HTTP_USER_AGENT']; // Get the user-agent string of the user.

            if ($stmt = $db_accounts->prepare("SELECT login_password FROM accounts WHERE account_id = ? LIMIT 1"))
            {
                $stmt->bind_param('i', $user_id); // Bind "$user_id" to parameter.
                $stmt->execute(); // Execute the prepared query.
                $stmt->store_result();

                if ($stmt->num_rows == 1)
                { // If the user exists
                    $stmt->bind_result($password); // get variables from result.
                    $stmt->fetch();
                    $login_check = hash('sha512', $password . $ip_address . $user_browser);
                    if ($login_check == $login_string)
                    {
                        // Logged In!!!!
                        return $user_id;
                    }
                    else
                    {
                        // Not logged in
                        return false;
                    }
                }
                else
                {
                    // Not logged in
                    return false;
                }
            }
            else
            {
                // Not logged in
                //die("f3");
                return false;
            }
        }
        else
        {
            // Not logged in
            return false;
        }
    }

}

secured_page

<?php
require_once 'assets/class.Firewizz.Security.php';

if (!isset($SECURITY))
{
    $SECURITY = new Firewizz\Security();
}

// Check if user is logged in or redirect to login page;
$SECURITY->BORDER_PATROL(true, true);


// CONTENT bla bla

?>

getPDF.php

<?php
// Requires
require_once 'assets/class.FirePDF.php';
require_once 'assets/class.Firewizz.Security.php';
$SECURITY = new \Firewizz\Security();
$SECURITY->Start_Secure_Session();

// HTML file to scrape; if this works, replace it with the referer so that the requesting page gets printed (guarded by the security script so it can only be done from securePlay).
$html_file = 'http://www.website.nl/?p=overzichten&sort=someSort&s=67';

// Output pdf filename
$pdf_fileName = 'Test_Pdf.pdf';

/*
 * cURL part
 */

// create curl resource
$ch = curl_init();

// set source url
curl_setopt($ch, CURLOPT_URL, $html_file);

// set cookies
$cookiesIn = "user_id=" . $_SESSION['user_id'] . "; username=" . $_SESSION['username'] . "; login_string=" . $_SESSION['login_string'] . ";";

// set cURL Options
$tmp = tempnam("/tmp", "CURLCOOKIE");
if ($tmp === FALSE)
{
    die('Could not generate a temporary cookie jar.');
}

$options = array(
    CURLOPT_RETURNTRANSFER => true, // return web page
    //CURLOPT_HEADER => true, //return headers in addition to content
    CURLOPT_ENCODING => "", // handle all encodings
    CURLOPT_AUTOREFERER => true, // set referer on redirect
    CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
    CURLOPT_TIMEOUT => 120, // timeout on response
    CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
    CURLINFO_HEADER_OUT => true,
    CURLOPT_SSL_VERIFYPEER => false, // Disabled SSL Cert checks
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_COOKIEJAR => $tmp,
    //CURLOPT_COOKIEFILE => $tmp,
    CURLOPT_COOKIE => $cookiesIn
);

// $output contains the output string
curl_setopt_array($ch, $options);
$output = curl_exec($ch);

// close curl resource to free up system resources
curl_close($ch);

// output the cURL
echo $output;
?>

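How we test

The current test is done by logging the user in, navigating to the page we want to fetch through cURL, and verifying that he can see the page (which works). We then run getPDF.php in a new tab, where we see a blank page because the security check fails. If we add echo "session data:" . $_SESSION["login_string"]; to the security script, we see that the variable in $_SESSION is empty. When we paste the same line into getPDF.php, we see that it is set there. So we know for a fact that it is not being transferred by cURL.

Some brief information.

  • With the code above we get a blank page;
  • The $_SESSION data is not sent;
  • Pretty sure the cookies are not sent either;
  • Tried various cURL settings, none of which worked;
  • If all $_SESSION and $_COOKIE data were passed along, that would be perfect;
  • Tried everything suggested in the comments or answers.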
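As an illustration of what "passing all cookie data along" could look like, here is a minimal sketch that turns an array of name/value pairs into a string suitable for CURLOPT_COOKIE. The helper name buildCookieHeader is hypothetical. The values are percent-encoded because raw spaces or semicolons would otherwise break the Cookie header; PHP URL-decodes incoming cookie values again when it populates $_COOKIE.

```php
<?php
/**
 * Build a Cookie header value from name => value pairs.
 * Hypothetical helper; names are used as given, values are percent-encoded.
 */
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . rawurlencode((string) $value);
    }
    return implode('; ', $pairs);
}

// Example values only; in getPDF.php these would come from $_SESSION or $_COOKIE.
$header = buildCookieHeader([
    'user_id'  => 67,
    'username' => 'Jan de Vries',
]);
echo $header, PHP_EOL;
```

The resulting string could then be passed as the CURLOPT_COOKIE option. Note that this only forwards cookies; whether the target page turns them into session data is up to the receiving script.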

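3 Answers:

Answer 0: (score: 5)

OK, solved

After a lot of research.

The cookie data is passed along, but it does not automatically become session data. This was fixed with the following: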

private function Cookie2Session($name)
{
    if (filter_input(INPUT_COOKIE, $name))
    {
        $_SESSION[$name] = filter_input(INPUT_COOKIE, $name);
    }
}

// following lines put within the BORDER_PATROL Method
if (filter_input(INPUT_COOKIE, 'pdfCurl'))
{
    $this->Cookie2Session('user_id');
    $this->Cookie2Session('username');
    $this->Cookie2Session('login_string');
    $this->Cookie2Session('REMOTE_ADDR');
    $this->Cookie2Session('HTTP_USER_AGENT');
    $_SESSION['new_session'] = "true";
}

A small change to the _login_check() method:

// Login Check if user is logged in correctly
private function _login_check()
{
    // Database variables
    $db_accounts = mysqli_connect($this->mySQL_accounts_host, $this->mySQL_accounts_username, $this->mySQL_accounts_password, $this->mySQL_accounts_database);

    // Check if all session variables are set
    if (isset($_SESSION['user_id'], $_SESSION['username'], $_SESSION['login_string']))
    {
        $user_id = $_SESSION['user_id'];
        $login_string = $_SESSION['login_string'];
        $username = $_SESSION['username'];
        $ip_address = $_SERVER['REMOTE_ADDR']; // Get the IP address of the user.
        $user_browser = $_SERVER['HTTP_USER_AGENT']; // Get the user-agent string of the user.

// =====>> add this code, because cURL req comes from server. <<=====
        if (isset($_SESSION["REMOTE_ADDR"]) && ($_SERVER['REMOTE_ADDR'] == $_SERVER['SERVER_ADDR']))
        {
            $ip_address = $_SESSION["REMOTE_ADDR"];
        }

// {rest of code}
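The reason this change is needed can be sketched in isolation: login_string is a hash over the password hash, the client IP and the user agent, and the cURL request arrives from the server's own address rather than the browser's. The following is only an illustration with made-up values; the helper names loginString and effectiveClientIp are hypothetical and not part of the security class.

```php
<?php
// Same recipe as the security class: hash of password hash, IP and user agent.
function loginString(string $passwordHash, string $ip, string $userAgent): string
{
    return hash('sha512', $passwordHash . $ip . $userAgent);
}

// Pick the IP to verify against: the forwarded browser IP when the request
// comes from the server itself, otherwise the real remote address.
function effectiveClientIp(array $server, array $session): string
{
    $fromSelf = isset($server['SERVER_ADDR'])
        && $server['REMOTE_ADDR'] === $server['SERVER_ADDR'];
    if ($fromSelf && isset($session['REMOTE_ADDR'])) {
        return $session['REMOTE_ADDR'];
    }
    return $server['REMOTE_ADDR'];
}

// Example values only (documentation address ranges, made-up hash and agent).
$stored = loginString('examplehash', '203.0.113.7', 'ExampleBrowser/1.0');
$curlRequest = ['REMOTE_ADDR' => '192.0.2.10', 'SERVER_ADDR' => '192.0.2.10'];
$session = ['REMOTE_ADDR' => '203.0.113.7'];

$withoutFix = loginString('examplehash', $curlRequest['REMOTE_ADDR'], 'ExampleBrowser/1.0');
$withFix = loginString('examplehash', effectiveClientIp($curlRequest, $session), 'ExampleBrowser/1.0');

var_dump(hash_equals($stored, $withoutFix)); // bool(false)
var_dump(hash_equals($stored, $withFix));    // bool(true)
```

The user agent has to match as well, which is why the cURL request needs to send the browser's user agent string.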

A small update to the getPDF.php file:

<?php
// Requires
require_once 'assets/class.FirePDF.php';
require_once 'assets/class.Firewizz.Security.php';
$SECURITY = new \Firewizz\Security();
$SECURITY->Start_Secure_Session();

// HTML file to scrape; if this works, replace it with the referer so that the requesting page gets printed (guarded by the security script so it can only be done from securePlay).
$html_file = 'http://www.secureplay.nl/?p=overzichten&sort=SpeelplaatsInspecties&s=67';

// Output pdf filename
$pdf_fileName = 'Test_Pdf.pdf';

/*
 * cURL part
 */

// create curl resource
$ch = curl_init();

// set source url
curl_setopt($ch, CURLOPT_URL, $html_file);

// set cookies
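// Values are percent-encoded so that spaces or semicolons (common in user agent strings) cannot break the Cookie header.
$cookiesIn = "user_id=" . rawurlencode($_SESSION['user_id'])
    . "; username=" . rawurlencode($_SESSION['username'])
    . "; login_string=" . rawurlencode($_SESSION['login_string'])
    . "; pdfCurl=true; REMOTE_ADDR=" . rawurlencode($_SERVER['REMOTE_ADDR'])
    . "; HTTP_USER_AGENT=" . rawurlencode($_SERVER['HTTP_USER_AGENT']);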
$agent = $_SERVER['HTTP_USER_AGENT'];

// set cURL Options
$tmp = tempnam("/tmp", "CURLCOOKIE");
if ($tmp === FALSE)
{
    die('Could not generate a temporary cookie jar.');
}

$options = array(
    CURLOPT_RETURNTRANSFER => true, // return web page
    //CURLOPT_HEADER => true, //return headers in addition to content
    CURLOPT_ENCODING => "", // handle all encodings
    CURLOPT_AUTOREFERER => true, // set referer on redirect
    CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
    CURLOPT_TIMEOUT => 120, // timeout on response
    CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
    CURLINFO_HEADER_OUT => true,
    CURLOPT_SSL_VERIFYPEER => false, // Disabled SSL Cert checks
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_COOKIEJAR => $tmp,
    //CURLOPT_COOKIEFILE => $tmp,
    CURLOPT_COOKIE => $cookiesIn,
    CURLOPT_USERAGENT => $agent
);

// $output contains the output string
curl_setopt_array($ch, $options);
$output = curl_exec($ch);

// close curl resource to free up system resources
curl_close($ch);

// output the cURL
echo $output;
?>

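With the above, you can use cURL to access a secured page with the current session data, with only minor consequences for your security.

Answer 1: (score: 1)

So your $cookiesIn needs to define the cookies. I will make an example based on your snippet: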

$cookiesIn = "user_id=" . $_SESSION['user_id'] . "; username=" . $_SESSION['username'] . "; login_string=" . $_SESSION['login_string'] . ";";

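Try setting this in the pdfCreator page. Replace $cookiesIn = ""; with the line above and see whether that gives you a different result.

Also, here is a good reference for the cURL cookie option:

https://curl.haxx.se/libcurl/c/CURLOPT_COOKIE.html

If you would rather send all cookies instead of specifying them one by one, use the following code: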

$tmp = tempnam("/tmp", "CURLCOOKIE");
if($tmp === FALSE) die('Could not generate a temporary cookie jar.');

$options = array(
    CURLOPT_RETURNTRANSFER => true, // return web page
    //CURLOPT_HEADER => true, //return headers in addition to content
    CURLOPT_ENCODING => "", // handle all encodings
    CURLOPT_AUTOREFERER => true, // set referer on redirect
    CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
    CURLOPT_TIMEOUT => 120, // timeout on response
    CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
    CURLINFO_HEADER_OUT => true,
    CURLOPT_SSL_VERIFYPEER => false, // Disabled SSL Cert checks
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_COOKIEJAR => $tmp,
    CURLOPT_COOKIEFILE => $tmp,
);

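This code uses the COOKIEJAR option so that cURL dumps all the cookies it currently knows into the given file. Specifying COOKIEFILE then tells cURL where to read the cookies that it should include in the request.

That said, I have removed the $cookiesIn reference, because you do not need it if you use the code above.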
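How the two options fit together can be sketched as follows. This is a minimal illustration rather than a drop-in fix: the helper name cookieJarOptions is hypothetical, and the sketch assumes the curl extension is loaded. Keep in mind that a freshly created jar file is empty, so the jar only contributes cookies that cURL itself received earlier (for example from a previous request made by the same script); it does not contain the browser's cookies.

```php
<?php
/**
 * Options that make cURL persist and reuse its own cookies via one file.
 * Hypothetical helper for illustration.
 */
function cookieJarOptions(string $jarPath): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_COOKIEJAR      => $jarPath, // cookies are written here when the handle is closed
        CURLOPT_COOKIEFILE     => $jarPath, // cookies are read from here before requests
    ];
}

$jar = tempnam(sys_get_temp_dir(), 'CURLCOOKIE');
if ($jar === false) {
    exit('Could not generate a temporary cookie jar.');
}
$options = cookieJarOptions($jar);

// Usage, left commented out because it needs a reachable URL:
// $ch = curl_init('http://www.website.nl/?p=overzichten');
// curl_setopt_array($ch, $options);
// $output = curl_exec($ch);
// curl_close($ch);
```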

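Answer 2: (score: 0)

In this situation, if the session-control algorithm is sound, you only need to change the format in which the page is sent.

Re-fetching the page with cURL is one way to do it, but this looks like an XY problem: you do not actually want to use cURL, you want to control the output format, HTML or PDF.

A viable option is to reload the page with a specific parameter added, which is injected into the page context and modifies the output function. For example, you could wrap the whole page in an output-buffering bubble: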

// Security checks as usual, then:

if (array_key_exists('output', $_GET)) {
    $format = $_GET['output']; // e.g. "pdf"
    // We could check whether the response handler has a printAs<FORMAT> method
    switch ($format) {
        case 'pdf': $outputFn = 'printAsPDF'; break;
        default:
            throw new \Exception("Output in {$format} format not supported");
    }
    ob_start($outputFn); // buffer the page and pass it through the chosen output function
}
// Page is generated normally

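The 'printAsPDF' output function will receive the page content, format it as a PDF file using something like dompdf or wkhtmltopdf, add the appropriate Content-Type header, and return the formatted PDF.

Security stays the same, and the modification can actually be implemented in the request-decoding phase. A state variable holding the output format currently in use can be made accessible to other objects, which lets them behave differently depending on the situation (for example, a generateMenu() function could choose to return immediately rather than display something that makes no sense in a PDF).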
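The buffering idea can be tried out without any PDF library. The sketch below is an assumption-laden stand-in: printAsText is a hypothetical handler that merely strips the markup where a real printAsPDF would call a PDF library, and renderPage is a hypothetical wrapper around the normal page generation. It also shows the page skipping its menu when an alternative format is active.

```php
<?php
// Hypothetical stand-in for printAsPDF: it only strips the markup so the
// sketch stays dependency-free. A real handler would call a PDF library here.
function printAsText(string $html): string
{
    return trim(strip_tags($html));
}

// Render the page, optionally routing its output through a format handler.
function renderPage(?string $format): string
{
    $handlers = ['text' => 'printAsText']; // format name => callback
    if ($format !== null && !isset($handlers[$format])) {
        throw new InvalidArgumentException("Output in {$format} format not supported");
    }

    ob_start(); // outer buffer: collects whatever finally leaves the page
    if ($format !== null) {
        ob_start($handlers[$format]); // inner buffer: filtered by the handler
    }

    // The page is generated normally; code may consult $format to skip parts.
    if ($format === null) {
        echo '<nav>Menu</nav>';
    }
    echo '<h1>Report</h1> <p>42 items</p>';

    if ($format !== null) {
        ob_end_flush(); // runs the handler and passes its result outward
    }
    return ob_get_clean();
}

echo renderPage('text'), PHP_EOL;
```

Because the handler runs inside the already authenticated request, no second HTTP request and no cookie forwarding are involved.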