Node.js Puppeteer: go to each page in an array

Time: 2018-03-20 10:03:21

Tags: javascript node.js puppeteer

I am trying to visit the pages in my array one by one, but I get this:

(node:4196) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 request listeners added. Use emitter.setMaxListeners() to increase limit
(node:4196) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 framedetached listeners added. Use emitter.setMaxListeners() to increase limit
(node:4196) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 lifecycleevent listeners added. Use emitter.setMaxListeners() to increase limit
(node:4196) UnhandledPromiseRejectionWarning: Error: Protocol error (Page.navigate): Target closed.
    at Promise (D:\Kutz\irrParse\node_modules\puppeteer\lib\Connection.js:198:56)
    at new Promise (<anonymous>)
    at CDPSession.send (D:\Kutz\irrParse\node_modules\puppeteer\lib\Connection.js:197:12)
    at navigate (D:\Kutz\irrParse\node_modules\puppeteer\lib\Page.js:520:39)
    at Page.goto (D:\Kutz\irrParse\node_modules\puppeteer\lib\Page.js:500:7)
    at uniqueLinks.forEach (D:\Kutz\irrParse\scrape.js:26:16)
    at Array.forEach (<anonymous>)
    at D:\Kutz\irrParse\scrape.js:25:15
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:118:7)
(node:4196) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:4196) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
(node:4196) UnhandledPromiseRejectionWarning: Error: Navigation Timeout Exceeded: 30000ms exceeded
    at Promise.then (D:\Kutz\irrParse\node_modules\puppeteer\lib\NavigatorWatcher.js:71:21)
    at <anonymous>



const puppeteer = require("puppeteer");
var forEach = require('async-foreach').forEach;


const url = "https://reddit.com/r/programming";
const linkSelector = ".content a.title";

(async () => {
  // Launch chrome process
  const browser = await puppeteer.launch({headless: true});
  const page = await browser.newPage();

  await page.goto(url, { waitUntil: "load" });

  // This runs the `document.querySelectorAll` within the page and passes
  // the result to function
  const links = await page.$$eval(linkSelector, links => {
    return links.map((link) => link.href);
  });

  // Make sure we get the unique set of links only
  const uniqueLinks = [...links];
  //console.log(uniqueLinks[0]);

  uniqueLinks.forEach(async (link) => {
    await page.goto(link, { waitUntil: "load" });
  });

  // Kill the browser process
  await browser.close();
})();




The error is thrown inside forEach().

1 answer:

Answer 0 (score: 2)

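Unfortunately, Array.prototype.forEach does not wait for its callback, even when you declare the callback async. It calls the function once per element and ignores the promise each call returns. In your code every page.goto() therefore starts at once on the same page, and browser.close() runs while they are still pending. That is why you see "Target closed", the navigation timeout, and the listener warnings. A plain for loop inside the async function awaits each navigation before starting the next, and should work for what you are trying to do.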

// Inside the async function, in place of uniqueLinks.forEach(...):
for (let i = 0; i < uniqueLinks.length; i++) {
  await page.goto(uniqueLinks[i], { waitUntil: "load" });
}
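
To see the difference without launching a browser, here is a minimal sketch. It assumes a hypothetical `visit` helper that stands in for `page.goto` and resolves after a short timer. The names `visit`, `withForEach`, and `withForOf` are illustrative only and are not part of Puppeteer.

```javascript
// Hypothetical stand-in for page.goto(): records the link after a short delay.
function visit(link, log) {
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push(`loaded ${link}`);
      resolve();
    }, 10);
  });
}

// forEach discards the promise returned by the async callback,
// so execution reaches "done" before any visit has finished.
async function withForEach(links) {
  const log = [];
  links.forEach(async (link) => {
    await visit(link, log);
  });
  log.push("done");
  return log.slice(); // snapshot of the log at the moment we return
}

// for...of inside an async function awaits each visit in turn,
// so "done" is only recorded after every link has loaded.
async function withForOf(links) {
  const log = [];
  for (const link of links) {
    await visit(link, log);
  }
  log.push("done");
  return log.slice();
}
```

Awaiting `withForEach(["a", "b"])` yields `["done"]`, while awaiting `withForOf(["a", "b"])` yields `["loaded a", "loaded b", "done"]`. In the question's code, `browser.close()` plays the role of "done".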