How to scroll through multiple iframes in Puppeteer

Time: 2019-02-22 21:27:46

Tags: javascript node.js iframe web-scraping puppeteer

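I'm trying to use Puppeteer to generate a PDF from a page that contains multiple iframes. One problem I've run into is that if I embed something like Google Maps, the map is lazy-loaded (it only loads once the element is inside the browser's viewport). One solution would be to scroll through the different iframes on the page and wait a set amount of time for each one to load.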

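Here is what I have so far (it can be tested at https://try-puppeteer.appspot.com/), using Puppeteer version 1.9.0; I also tried 1.12.0. I can't get either the scrolling or the timeout to work.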

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setViewport({
  width: 1280,
  height: 750
});
await page.emulateMedia('screen');
const html = '<iframe src="https://www.google.com/maps/embed?pb=!1m14!1m8!1m3!1d12077.188806999058!2d-73.2243774!3d40.8214352!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x0%3A0x9e562057f79c0860!2sH+Lee+Dennison+Building!5e0!3m2!1sen!2sus!4v1547750310674" height="250" width="600" allowfullscreen=""></iframe><div><iframe src="https://www.google.com/maps/embed?pb=!1m14!1m8!1m3!1d12077.188806999058!2d-73.2243774!3d40.8214352!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x0%3A0x9e562057f79c0860!2sH+Lee+Dennison+Building!5e0!3m2!1sen!2sus!4v1547750310674" height="250" width="600" allowfullscreen=""></iframe></div><div><iframe src="https://www.google.com/maps/embed?pb=!1m14!1m8!1m3!1d12077.188806999058!2d-73.2243774!3d40.8214352!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x0%3A0x9e562057f79c0860!2sH+Lee+Dennison+Building!5e0!3m2!1sen!2sus!4v1547750310674" height="250" width="600" allowfullscreen=""></iframe></div><div><iframe src="https://www.google.com/maps/embed?pb=!1m14!1m8!1m3!1d12077.188806999058!2d-73.2243774!3d40.8214352!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x0%3A0x9e562057f79c0860!2sH+Lee+Dennison+Building!5e0!3m2!1sen!2sus!4v1547750310674" height="250" width="600" allowfullscreen=""></iframe></div>'

await page.setContent(html, { waitUntil: 'networkidle0' });
const frames = await page.mainFrame().childFrames(); // get all the iframes on that page. 
await page.evaluate((frames) => {
     // this part does not work
     for (let i = 0; i < frames.length; i++) {
        setTimeout(() => {
         document.querySelectorAll('iframe')[i].scrollIntoView();
        }, 2000)
     }
  }, frames)
const pdf = await page.pdf({
  scale: 1,
  printBackground: true,
  margin: { bottom: 0 },
  path: 'screenshot.pdf'
});

await browser.close();

Thanks for your help!

1 Answer:

Answer 0 (score: 1)

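There are a few problems with this code:

  1. frames is a non-serializable object in the Node.js context, so it cannot be transferred to the browser context as is.
  2. All the setTimeout() callbacks are scheduled in the same tick, so they all fire together about 2 seconds later, and no frame gets its own time to load.
  3. The setTimeout() callbacks are not awaited: page.evaluate() returns before the 2 seconds have elapsed, and the PDF is created before the iframes have loaded.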
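Problems 2 and 3 can be reproduced in plain Node.js without a browser. The following is only a minimal sketch: the helper names sleep, scheduleAllAtOnce and runSequentially are hypothetical and are not part of Puppeteer or of the question's code.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Pattern from the question: every timer is registered in the same tick,
// so all callbacks fire together after a single delay, and the function
// returns before any of them has run.
function scheduleAllAtOnce(count, delayMs, log) {
  const start = Date.now();
  for (let i = 0; i < count; i++) {
    setTimeout(() => log.push(Date.now() - start), delayMs);
  }
}

// Sequential pattern: each iteration waits for its own delay before the
// next one starts, and the caller can await the whole loop.
async function runSequentially(count, delayMs, log) {
  const start = Date.now();
  for (let i = 0; i < count; i++) {
    await sleep(delayMs);
    log.push(Date.now() - start);
  }
}
```

With three steps and a 2000 ms delay, the first function logs three timestamps that are all close to 2000 ms, while the second logs timestamps spaced roughly 2000 ms apart.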

You can try this approach:

// page loaded

await page.evaluate(async () => {
  for (const iframe of Array.from(document.querySelectorAll('iframe'))) {
    iframe.scrollIntoView();
    await new Promise((resolve) => { setTimeout(resolve, 2000); });
  }
});

// pdf creation
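If the delay needs to be adjustable, the same loop could be wrapped in a helper that passes the delay into the browser context as an argument. This is only a sketch: scrollThroughIframes and delayMs are hypothetical names, and the 2000 ms default is simply carried over from the code above. A plain number is serializable, so it can be handed to page.evaluate(), unlike the frames array.

```javascript
// Scrolls each <iframe> into view in turn and pauses after each one so that
// lazily loaded content has a chance to load.
// `page` is a Puppeteer Page; `delayMs` is a hypothetical tuning parameter.
async function scrollThroughIframes(page, delayMs = 2000) {
  // The callback runs in the browser context; only `delayMs` crosses over.
  return page.evaluate(async (delay) => {
    const iframes = Array.from(document.querySelectorAll('iframe'));
    for (const iframe of iframes) {
      iframe.scrollIntoView();
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
    return iframes.length; // number of iframes visited
  }, delayMs);
}
```

It would be called after page.setContent() and before page.pdf(), for example as await scrollThroughIframes(page, 3000). A fixed delay is still a guess about how long each map takes to load, so the value may need tuning.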