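I am trying to get all request headers so that I can inspect the request properly, but it only returns headers such as User-Agent and Origin, while the original request contains many more.
Is there a way to actually get all of the headers?
For reference, here is the code: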
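const puppeteer = require('puppeteer');

// Wrapped in an async function because `await` is not allowed at the top level of a CommonJS script.
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  // Log the headers Puppeteer reports for every outgoing request.
  page.on('request', req => {
    console.log(req.headers());
  });
  await page.goto('https://reddit.com');
})();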
Thanks in advance, iLinked
Answer 0 (score: 0)
You can open the URL https://headers.cloxy.net/request.php to view your headers:
await page.goto('https://headers.cloxy.net/request.php');
You can also print them to the log:
console.log((await page.goto('https://example.org/')).request().headers());
Answer 1 (score: 0)
You can switch from puppeteer to playwright. With Firefox (but not with Chromium or WebKit) you will then get more headers:
import playwright from 'playwright';
(async () => {
const browser = await playwright['firefox'].launch();
const page = await browser.newPage();
page.on('request', req => {
console.log(req.headers());
});
await page.goto("https://example.com/");
await browser.close();
})();
Output with playwright['firefox'] (on other sites I have also seen a cookie header):
{
host: 'example.com',
'user-agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:86.0) Gecko/20100101 Firefox/86.0',
accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
'accept-language': 'en-US,en;q=0.5',
'accept-encoding': 'gzip, deflate, br',
connection: 'keep-alive',
'upgrade-insecure-requests': '1'
}
For comparison, the output with playwright['chromium']:
{
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/90.0.4421.0 Safari/537.36'
}
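To make the difference between the two outputs explicit, here is a minimal sketch that lists the header names Firefox reported but Chromium did not. The header objects are copied from the outputs above; `missingHeaders` is a hypothetical helper written for this illustration and is not part of puppeteer or playwright.

```javascript
// Header objects copied from the two outputs above.
const firefoxHeaders = {
  host: 'example.com',
  'user-agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:86.0) Gecko/20100101 Firefox/86.0',
  accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
  'accept-language': 'en-US,en;q=0.5',
  'accept-encoding': 'gzip, deflate, br',
  connection: 'keep-alive',
  'upgrade-insecure-requests': '1'
};

const chromiumHeaders = {
  'upgrade-insecure-requests': '1',
  'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/90.0.4421.0 Safari/537.36'
};

// Hypothetical helper: returns the header names present in `full`
// but absent from `partial`, comparing names case-insensitively.
function missingHeaders(full, partial) {
  const seen = new Set(Object.keys(partial).map(name => name.toLowerCase()));
  return Object.keys(full).filter(name => !seen.has(name.toLowerCase()));
}

console.log(missingHeaders(firefoxHeaders, chromiumHeaders));
```

For the two objects above this lists host, accept, accept-language, accept-encoding and connection, which are the headers that the Chromium run did not report.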