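What I want to achieve with headless Chrome and Puppeteer: log in to a site and save a PDF file that is only accessible after login.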
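According to this bug, headless Chrome cannot navigate to PDF files: https://bugs.chromium.org/p/chromium/issues/detail?id=761295
So I tried to take the cookies from the current Puppeteer session and pass them along with an https.get request, but unfortunately without success.
My code: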
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.goto('https://login-page', { waitUntil: 'networkidle0' });
await page.type('#email', 'email');
await page.type('#password', 'password');
await page.click('input[type="submit"]');
await page.waitForNavigation({ waitUntil: 'networkidle0' });
// the following line throws an error in headless mode
// await page.goto('https://url-with-pdf-accessible-only-after-login');
// I'm trying to convert cookie object to cookie string to pass it with headers
const cookies = await page.cookies();
let cookieString = '';
for (index in cookies) {
const cookie = cookies[index];
for (key in cookie) {
cookieString += key + '=' + cookie[key] + '; ';
}
}
// the following code saves an empty file (0 bytes)
const file = fs.createWriteStream('file.pdf');
https.get({
hostname: 'host-with-pdf-file',
path: '/path-to-pdf-accessible-only-after-login',
headers: {
'Cookie': cookieString,
}
}, res => {
res.pipe(file);
});
Am I doing something wrong?
Is there another way to save a PDF file from a URL (one that requires authentication) to the server?
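For reference, a Cookie request header carries only name=value pairs separated by "; ", whereas the loop above also appends every attribute of each cookie (domain, path, expires and so on). Below is a minimal sketch of building the header from the array returned by page.cookies(). The helper name toCookieHeader and the sample cookies are hypothetical, and this alone may not explain the empty file.

```javascript
// Hypothetical helper: turn the array returned by page.cookies()
// into the value of a Cookie request header.
function toCookieHeader(cookies) {
  // Only name and value belong in a request header; attributes such as
  // domain, path, or expires are metadata kept by the browser.
  return cookies.map(({ name, value }) => `${name}=${value}`).join('; ');
}

// Made-up cookie objects shaped like Puppeteer's output.
const sample = [
  { name: 'sessionid', value: 'abc123', domain: 'example.com', path: '/', httpOnly: true },
  { name: 'lang', value: 'en', domain: 'example.com', path: '/' },
];

console.log(toCookieHeader(sample)); // sessionid=abc123; lang=en
```

The result would then be passed as the 'Cookie' entry in the headers of the https.get options. The server may also redirect or check other headers, so the response status code is worth inspecting as well.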
Answer 0 (score: 2)
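I ran into almost the same problem.
Info: I am running this on Windows 10 64-bit, Node v8.9.4, Puppeteer 1.12.2.
More important info: it does not work with the bundled "local-chromium" (73.0.3679.0, 64-bit, installed by Puppeteer), but it does work with the installed Chrome (72.0.3626.119). So I set a custom "executablePath" property in the launch options, and it works!
I searched for several hours, so I hope this solution is useful...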
const puppeteer = require('puppeteer');
(async () => {
// Custom browser; headless is omitted, which is equivalent to true
const browser = await puppeteer.launch({executablePath: 'C:/Program Files (x86)/Google/Chrome/Application/chrome.exe'});
const page = await browser.newPage();
//URL
await page.goto('https://www.theUrl', {waitUntil: 'networkidle2'});
await page.waitFor('input[name=NameOfTheLoginHtmlField]');
await page.$eval('input[name=NameOfTheLoginHtmlField]', el => el.value = 'InputValueOfTheLoginHtmlField');
await page.waitFor('input[name=NameOfThePasswordHtmlField]');
await page.$eval('input[name=NameOfThePasswordHtmlField]', el => el.value = 'InputValueOfThePasswordHtmlField');
//The submit button has been replaced by an "a" with js function behind, so ...
await page.click('#login-submit > a');
//Lets you define the download path ('' = current directory: C:\Program Files (x86)\Google\Chrome\Application\72.0.3626.119)
function setDownloadBehavior(downloadPath=''){
return page._client.send('Page.setDownloadBehavior', {
behavior: 'allow',
downloadPath
});
}
await setDownloadBehavior();
await page.waitFor(5000);
await browser.close();
})()
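The snippet above relies on page._client, which is a private Puppeteer property and may change between versions. Below is a minimal sketch of the same step through the public page.target().createCDPSession() call, assuming a Puppeteer version that provides it. The helper name allowDownloads and the download directory are hypothetical.

```javascript
const path = require('path');

// Hypothetical helper: allow downloads for one page and choose where they go.
// `page` is expected to be a Puppeteer Page object.
async function allowDownloads(page, downloadDir) {
  // Open a DevTools protocol session through the public API.
  const client = await page.target().createCDPSession();
  await client.send('Page.setDownloadBehavior', {
    behavior: 'allow',
    // Resolve to an absolute path so the location does not depend on
    // the browser's own working directory.
    downloadPath: path.resolve(downloadDir),
  });
  return client;
}
```

It would be called as await allowDownloads(page, './downloads') before the click or navigation that triggers the download.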
Answer 1 (score: -1)
Could you serve the PDF file as a response using express.js?
res.sendFile(path.join(__dirname, 'example.pdf'));
example.pdf is the file generated on your server.