Using cURL to log in to another website?

Time: 2012-12-15 20:46:02

Tags: php curl

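I am trying to use PHP cURL to log in to https://www.iwgac.com/ and then retrieve the site's product prices. As a first step, I try to log in and then echo the home page to see whether the prices are visible (product prices are only shown after logging in).

The site does not seem to accept the login (although the cookie.txt file does change). Here is the code, based on other answers I found on Stack Overflow: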

class Curl {

    public $cookieJar = "";

    // Make sure the cookie file (cookie.txt by default) is readable and writable
    public function __construct($cookieJarFile = 'cookie.txt') {
        $this->cookieJar = $cookieJarFile;
    }

    function setup() {
        $header = array();
        $header[0]  = "Accept: text/xml,application/xml,application/xhtml+xml,";
        $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
        $header[]   = "Cache-Control: max-age=0";
        $header[]   = "Connection: keep-alive";
        $header[]   = "Keep-Alive: 300";
        $header[]   = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
        $header[]   = "Accept-Language: en-us,en;q=0.5";
        $header[]   = "Pragma: "; // browsers keep this blank.

        curl_setopt($this->curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');
        curl_setopt($this->curl, CURLOPT_HTTPHEADER, $header);
        curl_setopt($this->curl, CURLOPT_COOKIEJAR, $this->cookieJar);
        curl_setopt($this->curl, CURLOPT_COOKIEFILE, $this->cookieJar);
        //curl_setopt($this->curl, CURLOPT_AUTOREFERER, true);
        curl_setopt($this->curl, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($this->curl, CURLOPT_RETURNTRANSFER, true);
    }

    function get($url) {
        $this->curl = curl_init($url);
        $this->setup();

        return $this->request();
    }

    function getAll($reg, $str) {
        preg_match_all($reg, $str, $matches);
        return $matches[1];
    }

    function postForm($url, $fields, $referer = '') {
        $this->curl = curl_init($url);
        $this->setup();
        curl_setopt($this->curl, CURLOPT_URL, $url);
        curl_setopt($this->curl, CURLOPT_POST, 1);
        curl_setopt($this->curl, CURLOPT_REFERER, $referer);
        curl_setopt($this->curl, CURLOPT_POSTFIELDS, $fields);
        return $this->request();
    }

    function getInfo($info) {
        $info = ($info == 'lasturl') ? curl_getinfo($this->curl, CURLINFO_EFFECTIVE_URL) : curl_getinfo($this->curl, $info);
        return $info;
    }

    function request() {
        return curl_exec($this->curl);
    }
}

$curl = new Curl();

$url = "http://www.iwgac.com/index.php";
$fields = "form_name='main_login_form'&return_url='index.php'&user_login='**'&password='**'&remember_me='Y'&dispatch[auth.login]='Sign in'";

$html = $curl->postForm($url, $fields, $referer);
$html = curl_init();
curl_setopt($html, CURLOPT_COOKIE, 'cookie.txt');
curl_setopt($html, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($html, CURLOPT_URL, 'http://www.iwgac.com');
$html = curl_exec($html);
echo $html; 

Any ideas on how to solve this?

1 answer:

Answer 0: (score: 1)

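As always, start with the basics:

  1. Set curl_setopt($this->curl, CURLOPT_AUTOREFERER, true);
  2. Visit the login page before submitting the form.
  3. Advanced: use a browser plugin such as HttpFox to:

    1. See the exact headers and POST data that are submitted. JavaScript often adds hidden values to prevent exactly what you are doing.
    2. See the exact cookies. They may be set by files that load after the page itself has loaded, and I doubt you are requesting all of the included files with curl.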
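
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a verified fix for this particular site: the helper names (newSession, fetch, extractHiddenFields, buildLoginBody, loginAndFetch) are hypothetical, the form field names are copied from the question, and the sketch assumes the login form's hidden inputs are present in the static HTML rather than added by JavaScript. It also sidesteps two likely problems in the question's code: the quotes around the values in $fields are sent to the server literally, and CURLOPT_COOKIE expects a cookie string, not a file name.

```php
<?php
// Sketch only: field names come from the question; everything else is assumed.

/** Create one cURL handle that is reused so cookies persist across requests. */
function newSession(string $cookieJar)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_AUTOREFERER    => true,       // step 1 of the answer
        CURLOPT_COOKIEFILE     => $cookieJar, // read cookies, enable cookie engine
        CURLOPT_COOKIEJAR      => $cookieJar, // write cookies when the handle is freed
    ]);
    return $ch;
}

/** GET $url, or POST $postBody to it when a body is given. */
function fetch($ch, string $url, ?string $postBody = null): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    if ($postBody === null) {
        curl_setopt($ch, CURLOPT_HTTPGET, true);
    } else {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $postBody);
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

/** Collect name => value for every hidden input on the page (simplification:
 *  inputs from all forms are included, not only the login form). */
function extractHiddenFields(string $html): array
{
    if ($html === '') {
        return [];
    }
    $previous = libxml_use_internal_errors(true); // tolerate sloppy real-world HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

/** Merge hidden fields with the credentials and URL-encode the result. */
function buildLoginBody(array $hiddenFields, string $user, string $password): string
{
    $fields = array_merge($hiddenFields, [
        'user_login'           => $user,
        'password'             => $password,
        'remember_me'          => 'Y',
        'dispatch[auth.login]' => 'Sign in',
    ]);
    return http_build_query($fields, '', '&'); // no literal quotes around values
}

/** Visit the login page, submit the form, then fetch the page of interest. */
function loginAndFetch(string $loginPageUrl, string $postUrl, string $targetUrl,
                       string $user, string $password): string
{
    $ch = newSession(__DIR__ . '/cookie.txt');
    $loginPage = fetch($ch, $loginPageUrl);               // step 2 of the answer
    $body = buildLoginBody(extractHiddenFields($loginPage), $user, $password);
    curl_setopt($ch, CURLOPT_REFERER, $loginPageUrl);     // mimic the browser's Referer
    fetch($ch, $postUrl, $body);
    return fetch($ch, $targetUrl);
}
```

Calling loginAndFetch with the site's login page URL, the form's action URL, the page that shows prices, and real credentials would return that page's HTML. Which URLs those are for this site is an assumption to check against what HttpFox shows. Reusing a single handle keeps the session cookies in memory between the three requests, so the final GET is sent with whatever cookies the login response set.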