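Hi everyone, I want to log in to a website and, once authenticated, loop through a given set of URLs and scrape data. The example below shows what I am trying to do, but whatever I try, I get an unhandled promise rejection.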
const puppeteer = require("puppeteer");

list = [
  "https://www.facebook.com/",
  "https://www.google.com/",
  "https://www.zocdoc.com/"
];

const getTitle = async (p, url) => {
  try {
    await p.goto(url);
    const title = await p.title();
    console.log(title);
  }
  catch (e) {
    console.log(e)
  }
  return title
};

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  console.log(this)
  for (var url of list) {
    getTitle(page, url)
  }
  await browser.close();
})();
Answer 0 (score: 0)
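There are several problems in this example.
You need to await the call to getTitle. You are awaiting inside the function, but the call itself has to be awaited too. Without that, the loop does not wait for each navigation and browser.close() runs while the pages are still loading. In addition, title is declared with const inside the try block, so the return title after the catch is out of scope and throws a ReferenceError. That rejects the promise returned by getTitle, and since nothing awaits or catches it, you see an unhandled promise rejection.
You should also wrap the call to getTitle in a try/catch block, and check inside the function whether there is a title to return (a title can be empty, as it was for Google here).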
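const puppeteer = require("puppeteer");

const list = [
  "https://www.facebook.com/",
  "https://www.google.com/",
  "https://www.zocdoc.com/"
];

// Navigate to the URL and return the page title, or null if the title is empty.
const getTitle = async (p, url) => {
  try {
    await p.goto(url);
    const title = await p.title();
    return title || null;
  } catch (e) {
    // Log first, then rethrow so the caller can decide how to handle the failure.
    console.log(e);
    throw e;
  }
};

(async () => {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    for (const url of list) {
      try {
        // Await each call so the shared page finishes one navigation before the next.
        const title = await getTitle(page, url);
        console.log(title || "No title");
      } catch (e) {
        console.log(`Failed to load ${url}`);
      }
    }
  } finally {
    // Close the browser even if something above throws.
    await browser.close();
  }
})();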
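If you want to check the control flow without launching a browser, here is a minimal sketch of the same pattern. It assumes a hypothetical makeFakePage helper that only mimics goto and title (it is not part of Puppeteer), plus made-up example.com-style URLs, and a collectTitles helper that I introduce purely for illustration. It shows the two points above: every call is awaited inside the loop, and a failure on one URL is caught so the remaining URLs are still visited.

```javascript
// Hypothetical stand-in for a Puppeteer page so the sketch runs without a browser.
// It only mimics the two methods used here: goto(url) and title().
const makeFakePage = (titles) => {
  let current = null;
  return {
    goto: async (url) => {
      // Simulate a navigation error for URLs the fake page does not know.
      if (!(url in titles)) throw new Error(`navigation failed: ${url}`);
      current = url;
    },
    title: async () => titles[current],
  };
};

// Same shape as getTitle in the answer: resolve to the title, or null when it is empty.
const getTitle = async (page, url) => {
  await page.goto(url);
  const title = await page.title();
  return title || null;
};

// Visit the URLs one at a time; an error on one URL is recorded and the loop continues.
const collectTitles = async (page, urls) => {
  const results = [];
  for (const url of urls) {
    try {
      results.push({ url, title: await getTitle(page, url) });
    } catch (e) {
      results.push({ url, error: e.message });
    }
  }
  return results;
};

// Demo with made-up URLs: one normal title, one empty title, one failing navigation.
const fakePage = makeFakePage({
  "https://a.example/": "Site A",
  "https://b.example/": "",
});
collectTitles(fakePage, [
  "https://a.example/",
  "https://b.example/",
  "https://c.example/",
]).then((results) => console.log(results));
```

With a real Puppeteer page in place of fakePage, collectTitles should behave the same way, as long as you only close the browser after the returned promise has settled.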