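Trying to download a PDF via cURL returns 404, but the browser works (206 Partial Content)

Asked: 2014-08-25 18:51:50

Tags: php pdf curl

I can't configure cURL to download a PDF from a particular remote server. I installed LiveHTTPHeaders to get a better idea of what is happening. The output below is from a successful transfer through the browser. Is there a way to do the same thing with cURL without knowing the byte ranges beforehand, or is that not possible?

LiveHTTPHeaders output from the successful transfer through the browser: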

https://www.unitedhealthcareonline.com/ccmcontent/ProviderII/UHC/en-US/Assets/ProviderStaticFiles/ProviderStaticFilesPdf/Tools%20and%20Resources/Policies%20and%20Protocols/Medical%20Policies/Medical%20Policies/Ablative_Treatment_for_Spinal_Pain.pdf

GET /ccmcontent/ProviderII/UHC/en-US/Assets/ProviderStaticFiles/ProviderStaticFilesPdf/Tools%20and%20Resources/Policies%20and%20Protocols/Medical%20Policies/Medical%20Policies/Ablative_Treatment_for_Spinal_Pain.pdf HTTP/1.1
Host: www.unitedhealthcareonline.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: https://www.unitedhealthcareonline.com/b2c/CmaAction.do?channelId=016228193392b010VgnVCM100000c520720a____
Connection: keep-alive
Range: bytes=112744-
If-Range: "ec675-272e0-4fa9e598c8240"
Cache-Control: max-age=0

HTTP/1.1 200 OK
Date: Mon, 25 Aug 2014 18:11:58 GMT
Last-Modified: Fri, 30 May 2014 13:52:01 GMT
Etag: "cc675-272e0-4fa9e598c8240"
Accept-Ranges: bytes
Content-Length: 160480
Keep-Alive: timeout=10, max=1000
Connection: Keep-Alive
Content-Type: application/pdf
Set-Cookie: BIGipServerwww.unitedhealthcareonline.com_8080=1749579530.36895.0000; expires=Mon, 25-Aug-2014 18:41:58 GMT; path=/
Set-Cookie: TSebd2a0=2004f3a590c6498a85ac153996f42ede70886359fd2aa60453fb7c6ee6ddab836091fb4f; Path=/; Secure; HTTPOnly
----------------------------------------------------------
https://www.unitedhealthcareonline.com/ccmcontent/ProviderII/UHC/en-US/Assets/ProviderStaticFiles/ProviderStaticFilesPdf/Tools%20and%20Resources/Policies%20and%20Protocols/Medical%20Policies/Medical%20Policies/Ablative_Treatment_for_Spinal_Pain.pdf

GET /ccmcontent/ProviderII/UHC/en-US/Assets/ProviderStaticFiles/ProviderStaticFilesPdf/Tools%20and%20Resources/Policies%20and%20Protocols/Medical%20Policies/Medical%20Policies/Ablative_Treatment_for_Spinal_Pain.pdf HTTP/1.1
Host: www.unitedhealthcareonline.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: https://www.unitedhealthcareonline.com/b2c/CmaAction.do?channelId=016228193392b010VgnVCM100000c520720a____
Connection: keep-alive
If-Range: "ec675-272e0-4fa9e598c8240"
Cache-Control: max-age=0
Range: bytes=131072-160479
Cookie: BIGipServerwww.unitedhealthcareonline.com_8080=1749579530.36895.0000; TSebd2a0=2004f3a590c6498a85ac153996f42ede70886359fd2aa60453fb7c6ee6ddab836091fb4f

HTTP/1.1 200 OK
Date: Mon, 25 Aug 2014 18:11:59 GMT
Last-Modified: Fri, 30 May 2014 13:52:01 GMT
Etag: "cc675-272e0-4fa9e598c8240"
Accept-Ranges: bytes
Content-Length: 160480
Keep-Alive: timeout=10, max=1000
Connection: Keep-Alive
Content-Type: application/pdf
Set-Cookie: BIGipServerwww.unitedhealthcareonline.com_8080=1749579530.36895.0000; expires=Mon, 25-Aug-2014 18:41:59 GMT; path=/
Set-Cookie: TSebd2a0=2004f3a590c6498a85ac153996f42ede70886359fd2aa60453fb7c6ee6ddab836091fb4f; Path=/; Secure; HTTPOnly
----------------------------------------------------------

cURL configuration:

$cookie_file = dirname(__FILE__) . DIRECTORY_SEPARATOR . "tmp" . DIRECTORY_SEPARATOR . explode('/', $url )[2] . '.txt';

        // True only when the URL actually starts with the https scheme
        $secure_connection = ( strpos( $url, 'https://' ) === 0 );


        $ch = curl_init();

        // CURLOPT_USERAGENT takes only the header value, without the "User-Agent: " prefix
        curl_setopt( $ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0' );
        curl_setopt( $ch, CURLOPT_URL, $url );
        curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, 1 );
        curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, FALSE );
        curl_setopt( $ch, CURLOPT_VERBOSE, true);
        curl_setopt( $ch, CURLOPT_COOKIESESSION, false );
        curl_setopt( $ch, CURLOPT_COOKIEJAR, $cookie_file );
        curl_setopt( $ch, CURLOPT_HEADER, true );
        curl_setopt( $ch, CURLINFO_HEADER_OUT, true );
        if( $secure_connection == true )
        {
            curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, 1); 
            curl_setopt( $ch, CURLOPT_SSL_VERIFYHOST, 2); 
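            // cacert.pem is a single bundle file, so it belongs in CURLOPT_CAINFO (CURLOPT_CAPATH expects a directory)
            curl_setopt( $ch, CURLOPT_CAINFO, SERVER_ROOT . DIRECTORY_SEPARATOR . 'cacert.pem' );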
            curl_setopt( $ch, CURLOPT_CERTINFO, true );
        }
        if( isset($this->referer) && $this->referer != null ) curl_setopt($ch, CURLOPT_REFERER, $this->referer);

        $content = curl_exec( $ch );
        $errors = curl_error( $ch ); // only meaningful after curl_exec() has run

        $response = curl_getinfo( $ch );

1 answer:

Answer 0 (score: 0)

I think there is a simpler way to do this without cURL. Use PHP's readfile with suitable headers, like this:

<?php
header('Content-Type: application/pdf');

// The browser will save the file as downloaded.pdf
header('Content-Disposition: attachment; filename="downloaded.pdf"');

// Stream the remote PDF to the client (needs allow_url_fopen enabled in php.ini)
readfile("https://www.unitedhealthcareonline.com/ccmcontent/ProviderII/UHC/en-US/Assets/ProviderStaticFiles/ProviderStaticFilesPdf/Tools%20and%20Resources/Policies%20and%20Protocols/Medical%20Policies/Medical%20Policies/Ablative_Treatment_for_Spinal_Pain.pdf");
?>
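
If cURL is still preferred, the byte ranges do not need to be known in advance: a plain GET without a Range header asks the server for the whole file. The following is only a minimal sketch under that assumption. The function names pdf_curl_options and download_pdf and their parameters are hypothetical, and sending the Referer and cookies simply mirrors the browser request shown in the question; whether this particular server requires them is not confirmed.

```php
<?php
// Sketch only: function names and parameters are illustrative assumptions,
// not taken from the question or the answer above.

/**
 * Build cURL options for a plain GET of the whole file (no Range header).
 */
function pdf_curl_options($url, $referer, $cookieFile, $fileHandle)
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_FILE           => $fileHandle,  // write the body straight to this handle
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_FAILONERROR    => true,         // make curl_exec() fail on HTTP >= 400
        CURLOPT_REFERER        => $referer,     // the browser request carried a Referer
        CURLOPT_COOKIEFILE     => $cookieFile,  // send cookies stored by earlier requests
        CURLOPT_COOKIEJAR      => $cookieFile,  // store cookies the server sets
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0',
    ];
}

/**
 * Download $url to $targetPath and return the HTTP status code.
 * Throws RuntimeException when the file cannot be opened or the transfer fails.
 */
function download_pdf($url, $referer, $targetPath, $cookieFile)
{
    $fh = fopen($targetPath, 'wb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $targetPath for writing");
    }

    $ch = curl_init();
    curl_setopt_array($ch, pdf_curl_options($url, $referer, $cookieFile, $fh));

    $ok     = curl_exec($ch);
    $error  = curl_error($ch);                        // read after curl_exec()
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    fclose($fh);

    if ($ok === false) {
        unlink($targetPath);                          // remove the empty or partial file
        throw new RuntimeException("Download failed (HTTP $status): $error");
    }
    return $status;
}
```

A call would look like download_pdf($url, $referer, 'downloaded.pdf', $cookie_file), where $referer is the CmaAction.do page from the browser log. Because CURLOPT_HEADER is left off here, the saved file contains only the PDF bytes and no response headers.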

Hope this helps.