Force curl to download a page as a browser would

Time: 2014-05-09 09:24:11

Tags: php curl

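How can I force curl (with PHP) to download a page the way a browser does? The page I want to download is a price comparison site, e.g. http://www.ceneo.pl/22416171. It is public, and anyone can access the site.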

To check whether downloading with curl is possible, I ran the following on my local Debian-based server:

curl http://www.ceneo.pl/22416171

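It works flawlessly. But I really need to use it on my virtual PHP-Apache server, so I have to do it in PHP.

When I try to download the page with PHP's curl, it gives me nothing, unlike curl in the shell. Why? How can I get the correct content in PHP?

My attempt: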

<?php
$curl = curl_init('http://www.ceneo.pl/22416171');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HEADER, 1);
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
    'User-Agent: Mozilla/5.0 (Windows NT 6.0; rv:28.0) Gecko/20100101 Firefox/28.0',
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language: pl,en-US;q=0.7,en;q=0.3',
    'Accept-Encoding: gzip, deflate',
    'p3p: CP="NOI CURa ADMa DEVa TAIa OUR BUS IND UNI COM NAV INT"',
    'Vary: Accept-Encoding',
    'Content-Type: text/html; charset=utf-8',
    'Cache-Control: private'
));

$body = curl_exec($curl);
curl_close($curl);

echo $body;
?>

I also tried

<?php exec('curl http://www.ceneo.pl/22416171'); ?>

but it gave

curl: /usr/local/lib/libcurl.so.4: no version information available (required by curl)
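
A note on the `exec()` attempt: the message above is a warning printed on stderr, and it typically appears when the `curl` binary picks up a `libcurl.so.4` (here under `/usr/local/lib`) that was built differently from the one it expects. Separately, `exec()` on its own returns only the last line of output. A minimal sketch for capturing the full output and the exit code follows; `runCommand` is a hypothetical helper name, not something from the original post.

```php
<?php
// Hypothetical helper (an assumption for illustration): run a shell command
// and collect every output line together with the exit code.
function runCommand($command)
{
    $lines = array();
    $exitCode = 0;
    // "2>&1" redirects stderr into stdout so warnings are captured as well.
    exec($command . ' 2>&1', $lines, $exitCode);

    return array(
        'output'   => implode("\n", $lines),
        'exitCode' => $exitCode,
    );
}

// Possible usage with the URL from the question (left commented out so the
// sketch does not touch the network); escapeshellarg() quotes the argument.
// $result = runCommand('curl -s ' . escapeshellarg('http://www.ceneo.pl/22416171'));
// if ($result['exitCode'] !== 0) {
//     echo "curl failed:\n", $result['output'];
// }
```

This only helps with diagnosing the shell call; it assumes `exec()` is not disabled on the server.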

1 answer:

Answer 0 (score: 1)

Take a look at the documentation.

This is how you do it:

test.php

<?php

// create curl resource
$ch = curl_init();

// set url
curl_setopt($ch, CURLOPT_URL, "http://www.ceneo.pl/22416171");

//return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

//set headers
curl_setopt($ch,CURLOPT_HTTPHEADER, array(
    'User-Agent: Mozilla/5.0 (Windows NT 6.0; rv:28.0) Gecko/20100101 Firefox/28.0',
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language: pl,en-US;q=0.7,en;q=0.3',
    //'Accept-Encoding: gzip, deflate', // left out: when set by hand, a gzip reply comes back undecoded
    'p3p: CP="NOI CURa ADMa DEVa TAIa OUR BUS IND UNI COM NAV INT"',
    'Vary: Accept-Encoding',
    'Content-Type: text/html; charset=utf-8',
    'Cache-Control: private'
));

// $output contains the output string
$output = curl_exec($ch);

// close curl resource to free up system resources
curl_close($ch);

// debug
echo $output;
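
The Accept-Encoding line is commented out in the header list above. An alternative is to let cURL negotiate compression itself: setting `CURLOPT_ENCODING` to an empty string makes cURL send an Accept-Encoding header with every encoding it supports and decode the response. The sketch below also checks `curl_exec()` for failure, which helps when a request "returns nothing". The function names `browserLikeOptions` and `fetchPage` are hypothetical; only the `curl_*` functions and `CURLOPT_*` constants come from PHP's cURL extension, and the choice of headers is an assumption, not the answer's exact list.

```php
<?php
// Hypothetical helper (an assumption for illustration): build the option
// array separately so it can be inspected without making a request.
function browserLikeOptions($url)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects, as a browser would
        CURLOPT_ENCODING       => '',    // advertise and decode all supported encodings
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.0; rv:28.0) Gecko/20100101 Firefox/28.0',
        CURLOPT_HTTPHEADER     => array(
            'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language: pl,en-US;q=0.7,en;q=0.3',
        ),
    );
}

// Hypothetical helper: fetch a page and turn cURL failures into exceptions.
function fetchPage($url)
{
    $ch = curl_init();
    curl_setopt_array($ch, browserLikeOptions($url));

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on failure; report why.
        $code = curl_errno($ch);
        $message = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("cURL error $code: $message");
    }

    curl_close($ch);
    return $body;
}

// Possible usage (commented out so the sketch does not touch the network):
// echo fetchPage('http://www.ceneo.pl/22416171');
```

Only request headers are sent here; `Vary`, `p3p`, `Content-Type` and `Cache-Control: private` are normally response headers, so this sketch leaves them out.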

Working demo (retrieves only the site's HTML output):

[screenshot of the demo output]
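
The demo returns only HTML because the answer's code does not set `CURLOPT_HEADER`, whereas the attempt in the question sets it to 1, which prepends the response headers to the string returned by `curl_exec()`. If the headers are wanted as well, `curl_getinfo($ch, CURLINFO_HEADER_SIZE)` reports how many bytes of the result belong to them. A minimal sketch of the split, using a hand-written response instead of a live request; `splitResponse` is a hypothetical helper name.

```php
<?php
// Hypothetical helper (an assumption for illustration): split a raw response,
// as returned by curl_exec() with CURLOPT_HEADER enabled, into header lines
// and body. $headerSize is expected to come from
// curl_getinfo($ch, CURLINFO_HEADER_SIZE).
function splitResponse($raw, $headerSize)
{
    $headerBlock = substr($raw, 0, $headerSize);
    $body = substr($raw, $headerSize);

    // Split into lines and drop the empty ones (the blank separator lines).
    $lines = array_values(array_filter(
        preg_split('/\r\n|\n/', $headerBlock),
        function ($line) {
            return $line !== '';
        }
    ));

    return array('headers' => $lines, 'body' => $body);
}

// Hand-written sample response, so the sketch runs without network access.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\n\r\n<html></html>";
$headerSize = strpos($raw, "\r\n\r\n") + 4; // stands in for CURLINFO_HEADER_SIZE

$parts = splitResponse($raw, $headerSize);
echo $parts['body'], PHP_EOL; // prints "<html></html>"
```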