Puppeteer - links inside an iframe

Date: 2019-02-25 00:56:08

Tags: javascript web-scraping puppeteer

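I need to get the ad link below the bullet points on this page.

I'm trying to do this with Puppeteer, but I'm running into trouble because the ad is inside an iframe!

I can successfully get what I need from the Chrome console: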

document.querySelector('#adContainer a').href

Puppeteer:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setViewport({width: 1440, height: 1000}); // setViewport returns a promise, so await it
  await page.goto('https://www.amazon.co.uk/dp/B07DDDB34D', {waitUntil: 'networkidle2'})

  await page.waitFor(2500);

  const elementHandle = await page.$eval('#adContainer a', el => el.href);

  console.log(elementHandle);
  await page.screenshot({path: 'example.png', fullPage: false});

  await browser.close();
})();

Error: Error: failed to find element matching selector "#adContainer a"


Edit:

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setViewport({width: 1440, height: 1000});
await page.goto('https://www.amazon.co.uk/dp/B07DDDB34D', {waitUntil: 'networkidle2'});

// Pick the ad frame by a fragment of its name, then query inside that frame
const adFrame = page.frames().find(frame => frame.name().includes('"adServer":"cs'));
const urlSelector = '#sp_hqp_shared_inner > div > a';
const url = await adFrame.$eval(urlSelector, element => element.textContent);
console.log(url);

await browser.close();

Run on https://try-puppeteer.appspot.com/

2 answers:

Answer 0 (score: 3)

You need to query inside the frame, which you can access through page.frames():

const adFrame = page.frames().find(frame => frame.name().includes('<some text only appearing in name of this iFrame>'));
const urlSelector = '#sp_hqp_shared_inner > div > a';
const url = await adFrame.$eval(urlSelector, element => element.textContent);
console.log(url);

How I got the selector for that URL: [screenshot]

Caveat: I haven't tried this myself. Also, I think the proper way to get that URL inside the iframe is more like this:

const url = await adFrame.evaluate((sel) => {
  return document.querySelectorAll(sel)[0].href;
}, urlSelector);
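
Both snippets above throw if the frame has not been attached yet or the link has not rendered. As a rough sketch of how the lookup could be made more defensive, the helpers below list the frames for inspection, pick one by a fragment of its name, and wait for the selector before reading `href`. The function names, the `nameFragment` and `timeoutMs` parameters, and the error message are all made up for illustration; only `page.frames()`, `frame.name()`, `frame.url()`, `frame.waitForSelector()` and `frame.$eval()` are Puppeteer calls. I have not run this against the page in the question.

```javascript
// Hypothetical helpers: none of these names come from Puppeteer or from the question.

/** Summarise every frame so the right one can be identified by name or URL. */
function describeFrames(page) {
  return page.frames().map(frame => ({ name: frame.name(), url: frame.url() }));
}

/** Return the first frame whose name contains `nameFragment`, or null if none matches. */
function findFrameByName(page, nameFragment) {
  return page.frames().find(frame => frame.name().includes(nameFragment)) || null;
}

/** Wait for `selector` inside `frame`, then return the matched element's href. */
async function getHrefInFrame(frame, selector, timeoutMs = 5000) {
  if (!frame) {
    // Fail with a readable message instead of "Cannot read property '$eval' of undefined"
    throw new Error('Frame not found; log describeFrames(page) to choose a name fragment');
  }
  // Rejects with a timeout error if the element never appears in this frame
  await frame.waitForSelector(selector, { timeout: timeoutMs });
  return frame.$eval(selector, el => el.href);
}
```

With the values from the question this would be called as `await getHrefInFrame(findFrameByName(page, '"adServer":"cs'), '#sp_hqp_shared_inner > div > a')`, and `console.log(describeFrames(page))` shows which names are available if the lookup returns null.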

Answer 1 (score: 0)

Each time the page loads, you have to switch to the frame you want to work with.

async getRequiredLink() {
    return await this.page.evaluate(() => {
        let iframe = document.getElementById('frame_id'); // pass the id of your frame
        let doc = iframe.contentDocument; // switch context to the frame's document (null for cross-origin frames)
        let ele = doc.querySelector('your-selector'); // select the required element
        return ele.href;
    });
}
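
Note that `contentDocument` is only readable when the iframe is same-origin with the page. If the ad frame is cross-origin (an assumption, I have not checked this page), a possible alternative is to let Puppeteer resolve the frame with `elementHandle.contentFrame()` and query there. The function name and both selector parameters below are hypothetical; this is a sketch, not tested against the page in the question.

```javascript
// Hypothetical helper: resolve the Frame behind an <iframe> element, then read a link's href.
// Returns null when the iframe, its frame, or the link cannot be found.
async function getHrefViaContentFrame(page, iframeSelector, linkSelector) {
  const iframeHandle = await page.$(iframeSelector);   // ElementHandle for the <iframe>, or null
  if (!iframeHandle) return null;

  const frame = await iframeHandle.contentFrame();     // Puppeteer Frame for that element, or null
  if (!frame) return null;

  const link = await frame.$(linkSelector);            // query inside the frame, not the main page
  if (!link) return null;

  // Pass the handle as an argument so the callback receives the DOM element
  return frame.evaluate(el => el.href, link);
}
```

For the snippet above this would be something like `await getHrefViaContentFrame(this.page, '#frame_id', 'your-selector')`.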