Downloading a PDF file from a URL using PHP and cookies

Asked: 2020-07-12 20:54:59

Tags: php pdf curl cookies session-cookies

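I'm using PHP for this program. I wrote a script that logs in to a website, then goes to another page on that site and collects the PDF links. That part works great. For the requests I use cURL and write the cookies to a cookie.txt file.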

When I pick a PDF link, this is the URL (newest_invoice_url):

https://secure3.billerweb.com/alw/inetSrv?client=701122300&type=ApplicationMenu&authTkn=r7mdbUaDHRqEWm4sbY1q40uvMhNQFjwDVOzhEZaAyFrGMyD0pTgCde/QHTSUf0YrJJw5PThCcAV8O7a2ipV5IZmi7jNGArIZ1ddy-WEJcuNAbMp0VRMnNO9VFCA7s78vwzB9VqCn/Qm5zfyzXH2YoNaqMATofgc3v-SSLYzzV3LGGVlvAF6TPS4JgPbcdxotOwJaGhPsM0qfoBiE74zECNA3o-x9-Amqa-/3Q1cRFUa3LtV77l/mDG1pQP1KbISD27TfSPvG1IgyzOz1BP6n0Ah4WksP95yAMhDUuaViKECBpC4n3n-SdyHyQpXz8UsPS8F3WWwj7aMwFvp2aaHRnP5j3uVl4gRGw1l3ee5BLmhkBZFdB57bcAP2Pinmk4krIuqvzjCZM780j9lMQ7E/lS-KqWQN/zrGF3JOg6CP3HYtna6Ne-XgseIxsf-Ecu7qKfZ-DRCSXtv1ulnn8PW1btvjgUeS-Aia8mo9T3CzUmVnbdkOG0JfrrT9mjHOUevqZb-48776CX9svPxujKVFjHELPX5E8bXzv2yIWyMoHfqpaVm1D2B2BP56GYwD5OQL0a5BNnsEjHBIpMMUCsGAbpXl0bABleBx&unitCode=ALW&keyTkn=W9EqaRvq0FQ_Ij9Nj1zRaPoc88pUKFGSOYlgLKY6qQxuLnHdhQ38Gq8CUviDeObpxvo46fbZ1XRjPTzMc0tTqCphM9Q79QjheglAX2Ay2Gzdo2r0KnRx9gBqoAyKSHk/93tuEmVAbszFdYfP0E-sJRCUWvnNUTDOe6TyEeblIzQ7wEDrNt4nmEI_&slchannel=ALWRSA&enc=web

It redirects to another URL. I wrote this code:

$ch = curl_init();
$source = $newest_invoice_url;
curl_setopt($ch, CURLOPT_URL, $source);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
curl_setopt($ch, CURLOPT_HEADER,1);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE,'cookie.txt');
$data = curl_exec ($ch);
curl_close ($ch);

print_r($data);

preg_match("~Location:\s*(.*)~",$data,$new_link);
if(!empty($new_link[1])){
    echo"\nNew link is \n";
    print_r($new_link[1]);echo"\n\n";
}

It gives me this output:

HTTP/1.1 302 Moved Temporarily
Date: Sun, 12 Jul 2020 20:44:34 GMT
Content-length: 0
Content-type: text/html
Set-cookie: _stkn-hfa3QS7R=Ivqn7zeiy9sFXjnqsO3xxg__; Path=/alw/inetSrv; HttpOnly; Secure; SameSite=None
Strict-transport-security: max-age=31536000
Cache-control: no-store,no-cache,max-age=0,must-revalidate
Pragma: no-cache
Content-security-policy: child-src 'self'; connect-src 'self' https://www.google-analytics.com; default-src 'self'; font-src 'self' https://fonts.gstatic.com; frame-ancestors 'self'; img-src * data: https://www.google-analytics.com; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://www.google-analytics.com; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com;
X-content-type-options: nosniff
X-xss-protection: 1; mode=block
Location: https://secure3.billerweb.com/alw/inetSrv?sessionHandle=hfa3QS7RCA2EYOpRdKYD4VH3wXUsdfCzuCiEP7ugnN6-xxSNY2w9NzAxMTIyMzAwJlJlcXVlc3RUeXBlPVNob3dQZGY_&client=701122300&type=CompatPresentmentService&action=ShowPdf&firstPage=true

The Location header contains the new URL I need. I used the same code on that URL, but I can't save the PDF file locally. Any help?

Update: first request

$ch = curl_init();
$source = $newest_invoice_url;
curl_setopt($ch, CURLOPT_URL, $source);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
curl_setopt($ch, CURLOPT_HEADER,1);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie2.txt');
//curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE,'cookie2.txt');
$data = curl_exec ($ch);
curl_close ($ch);

print_r($data);

I use the result to extract the final redirect link:

preg_match("~Location:\s*(.*)~",$data,$new_link);
if(!empty($new_link[1])){
    echo"\nNew link is \n";
    print_r($new_link[1]);echo"\n\n";
    $newlink = $new_link[1];
}
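One thing that may be worth checking with the pattern above: HTTP header lines end in "\r\n", and "." in a PCRE pattern matches "\r", so the captured link can carry a trailing carriage return into the next request. The following is only a minimal sketch of a stricter extraction, not a confirmed fix for this site. The function name extract_location and the sample URL are hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper: pull the Location header out of a raw HTTP response
// (headers + body, as returned when CURLOPT_HEADER is enabled).
function extract_location(string $rawResponse): ?string
{
    // 'i': header names are case-insensitive; 'm': ^ and $ anchor per line.
    // [ \t]* skips the padding after the colon without crossing a line break.
    if (preg_match('~^Location:[ \t]*(.*)$~mi', $rawResponse, $m)) {
        // trim() drops the "\r" left over from the CRLF line ending.
        $url = trim($m[1]);
        return $url === '' ? null : $url;
    }
    return null;
}

// Sample response fragment; the URL is a made-up placeholder.
$sample = "HTTP/1.1 302 Moved Temporarily\r\n"
        . "Content-length: 0\r\n"
        . "Location: https://example.com/next?action=ShowPdf\r\n"
        . "\r\n";

var_dump(extract_location($sample));
```

If the trimmed and untrimmed values differ when printed with var_dump, the stray "\r" was being sent as part of the second request's URL.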

Second request

$ch = curl_init();
$source = $newlink;
curl_setopt($ch, CURLOPT_URL, $source);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
curl_setopt($ch, CURLOPT_HEADER,1);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie2.txt');
//curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE,'cookie2.txt');
$data = curl_exec ($ch);
curl_close ($ch);
print_r($data);

The second request returns a blank response, but I should be receiving the PDF content.
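An alternative to chaining the two requests by hand is to let cURL follow the 302 within a single handle and stream the body to a file. The sketch below assumes the server only needs the cookies already stored in the cookie file plus those set during the redirect; that is an assumption about this site, not something the output above confirms. The function name download_pdf and the file name invoice.pdf are hypothetical.

```php
<?php
// Hypothetical helper: fetch $url with the saved session cookies, follow any
// redirects, and stream the final response body to $destPath.
function download_pdf(string $url, string $cookieFile, string $destPath): bool
{
    $fp = fopen($destPath, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_FILE           => $fp,         // write the body straight to disk
        CURLOPT_HEADER         => false,       // keep response headers out of the file
        CURLOPT_FOLLOWLOCATION => true,        // let cURL follow the 302 itself
        CURLOPT_MAXREDIRS      => 10,
        CURLOPT_COOKIEFILE     => $cookieFile, // send cookies from the earlier login
        CURLOPT_COOKIEJAR      => $cookieFile, // persist cookies set along the way
        CURLOPT_FAILONERROR    => true,        // treat HTTP status >= 400 as a failure
    ]);
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);

    // A PDF starts with the bytes "%PDF-"; anything else is probably an
    // HTML login or error page, so discard it.
    $isPdf = $ok !== false
        && file_get_contents($destPath, false, null, 0, 5) === '%PDF-';
    if (!$isPdf) {
        unlink($destPath);
    }
    return $isPdf;
}

// Possible usage with the variables from the question:
// download_pdf($newest_invoice_url, 'cookie2.txt', 'invoice.pdf');
```

Leaving CURLOPT_HEADER off matters here: with it on, the response headers would be written into the saved file ahead of the PDF bytes. If the function returns false, inspecting the response with curl_getinfo() and curl_error() before closing the handle would be the next step.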

0 answers:

No answers yet