Unable to download a URL using cURL

Date: 2014-03-15 10:00:17

Tags: php curl

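I am trying to download the URL http://es.extpdf.com/nagore-pdf.html with the code below, but I get a status code of 0 in return. When I access the same URL through http://web-sniffer.net/, it shows a 301 redirect. My code also seems to work for URLs that return a 301 redirect.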

What could be the problem?

<?php


// The function prints its own output and returns nothing, so it is called without print.
disavow_download_url("http://es.extpdf.com/nagore-pdf.html");

function disavow_download_url($url) {

    $custom_headers = array();
    $custom_headers[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    $custom_headers[] = "Pragma: no-cache";
    $custom_headers[] = "Cache-Control: no-cache";
    $custom_headers[] = "Accept-Language: en-us;q=0.7,en;q=0.3";
    $custom_headers[] = "Accept-Charset: utf-8,windows-1251;q=0.7,*;q=0.7";

    $ch = curl_init();
    $useragent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20100101 Firefox/9.0.1";
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent); // set user agent
    curl_setopt($ch, CURLOPT_URL, $url);

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    //curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);

    // these two apply to HTTPS requests (they disable certificate verification)
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);


    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10); // total timeout in seconds; it includes connect time, so it caps the 15 s connect timeout above

    $txResult = curl_exec($ch);

    $statuscode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    print "statuscode=$statuscode\n";

    print "result=$txResult\n";


}

1 answer:

Answer 0 (score: 1)

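The URL is accessible from the USA but not from your region. It works with web-sniffer because their servers are hosted in the USA (or in some other region that extpdf allows).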
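
A status code of 0 from curl_getinfo() means cURL never received an HTTP response at all, so the request failed before any 301 could be seen. One way to find out why is to read cURL's own error number and message. The following is only a minimal sketch: the helper name fetch_with_diagnostics and the loopback test URL are illustrative assumptions, not part of the original code.

```php
<?php
// Hypothetical helper: fetch a URL and report cURL's own error details.
function fetch_with_diagnostics($url, $connectTimeout = 5, $timeout = 15) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connectTimeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // total time, including connect

    $body = curl_exec($ch); // false when the transfer failed

    return array(
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no HTTP response arrived
        'errno'  => curl_errno($ch),                       // 0 means no cURL-level error
        'error'  => curl_error($ch),                       // human-readable message
        'body'   => $body,
    );
}

// Assumed test target: a loopback port where nothing is expected to listen.
$r = fetch_with_diagnostics("http://127.0.0.1:1/");
if ($r['body'] === false) {
    print "cURL failed: [{$r['errno']}] {$r['error']}\n";
} else {
    print "statuscode={$r['status']}\n";
}
```

If the message reports a timeout or a refused connection for the real URL while the same code works for other sites, that is consistent with the host being unreachable from your network rather than with a bug in the redirect handling.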

I used a USA proxy with curl, and it returned the data.

curl_setopt($ch, CURLOPT_PROXY, "100.9.90.1:3128"); // replace with your proxy's IP and port
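
The proxy setting can also be folded into the request setup so that it is applied only when a proxy is supplied. This is a rough sketch under stated assumptions: build_curl_options and download_url are hypothetical helper names, the timeout values are arbitrary, and the proxy address is the placeholder from this answer, which you would replace with a proxy you are allowed to use.

```php
<?php
// Hypothetical helper: build the cURL option array, adding a proxy only when one is given.
function build_curl_options($url, $proxy = null) {
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 15,
        CURLOPT_TIMEOUT        => 30, // total time; kept larger than the connect timeout
    );
    if ($proxy !== null) {
        $options[CURLOPT_PROXY]     = $proxy;         // "host:port" of the proxy
        $options[CURLOPT_PROXYTYPE] = CURLPROXY_HTTP; // assumes a plain HTTP proxy
    }
    return $options;
}

// Hypothetical helper: run the request; returns the body, or false on failure.
function download_url($url, $proxy = null) {
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $proxy));
    return curl_exec($ch);
}

// Usage (performs a real network request, so it is left commented out here):
// print download_url("http://es.extpdf.com/nagore-pdf.html", "100.9.90.1:3128");
```

Keeping the options in a plain array makes it easy to compare the direct and proxied configurations without touching the network.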