Using Puppeteer to click main links and then click sub links?

Date: 2018-11-13 18:48:24

Tags: javascript google-chrome automation puppeteer

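Simplified:

I have a website with links.
Clicking each link leads to a new page, on which I need to visit further links (by clicking them, not by navigating to their URLs).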

I have already got 99% of it working:

import * as puppeteer from 'puppeteer';
import {ElementHandle} from 'puppeteer';

(async () =>
{
    const browser = await puppeteer.launch({headless: false});
    const page = await browser.newPage();
    let url = "https://www.mutualart.com/Artists";
    console.log(`Fetching page data for : ${url}...`);
    await page.goto(url);
    await page.waitForSelector(".item.col-xs-3");
    let arrMainLinks: ElementHandle[] = await page.$$('.item.col-xs-3 > a');   //get the main links

    console.log(arrMainLinks.length); // 16


    for (let mainLink of arrMainLinks) //foreach main link let's click it
    {
        let hrefValue =await (await mainLink.getProperty('href')).jsonValue();
        console.log("Clicking on " + hrefValue);
        await Promise.all([
                              page.waitForNavigation(),
                              mainLink.click({delay: 100})
                          ]);

        // let's get the sub links
        let arrSubLinks: ElementHandle[] = await page.$$('.slide >a');

        //let's click on each sub link
        for (let sublink of arrSubLinks)
        {
            console.log('██AAA');

            await Promise.all([
                                  page.waitForNavigation(),
                                  sublink.click({delay: 100})
                              ]);
            console.log('██BBB');

            // await page.goBack() 
            break; // for now ...
        }
        break;

    }

    await browser.close();
})();
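
A side note on the loop above, separate from the timeout itself: once the `break` statements are removed and `page.goBack()` is enabled, the handles in `arrMainLinks` and `arrSubLinks` were obtained before a navigation and no longer refer to elements of the current document. A minimal sketch of one way around that is to query the handles again on every pass and pick the next one by index. The helper name `clickEachInTurn` and the two interfaces are assumptions made for illustration; the interfaces only describe the calls the sketch makes, and with Puppeteer installed one would use its own `Page` type instead.

```typescript
// Stand-in types (assumptions): only the calls this sketch makes.
interface ClickableHandle {
    click(options?: { delay?: number }): Promise<void>;
}

interface PageLike {
    $$(selector: string): Promise<ClickableHandle[]>;
    waitForSelector(selector: string): Promise<unknown>;
    waitForNavigation(): Promise<unknown>;
    goBack(): Promise<unknown>;
}

// Hypothetical helper: clicks each element matching `selector` in turn,
// calls `onArrival` on the page the click led to, then goes back.
// Returns how many elements were clicked.
async function clickEachInTurn(
    page: PageLike,
    selector: string,
    onArrival: (index: number) => Promise<unknown>
): Promise<number> {
    await page.waitForSelector(selector);
    const total = (await page.$$(selector)).length;
    let clicked = 0;

    for (let i = 0; i < total; i++) {
        // Look the handles up again: the ones from the previous pass
        // belong to a document that has since been navigated away from.
        await page.waitForSelector(selector);
        const handle = (await page.$$(selector))[i];
        if (!handle) break; // the list changed; stop rather than guess

        await Promise.all([
            page.waitForNavigation(),
            handle.click({ delay: 100 })
        ]);
        clicked++;

        await onArrival(i);
        await page.goBack(); // back to the list before the next click
    }
    return clicked;
}
```

With the selectors from the question, nesting would look like `clickEachInTurn(page, '.item.col-xs-3 > a', () => clickEachInTurn(page, '.slide > a', async () => {}))`. This only keeps the handles fresh; it does not by itself explain why the sub link click fails to trigger a navigation.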

So where is the problem?

It reaches ██AAA,
but it never reaches ██BBB.

I get an error:

 C:\temp\puppeterr1\app>node server2.js
Fetching page data for : https://www.mutualart.com/Artists...
16
Clicking on https://www.mutualart.com/Artist/Mr--Brainwash/9B3FED6BB81E6B8E
██AAA
(node:17200) UnhandledPromiseRejectionWarning: TimeoutError: Navigation Timeout Exceeded: 30000ms exceeded
    at Promise.then (C:\temp\puppeterr1\node_modules\puppeteer\lib\FrameManager.js:1230:21)
    at <anonymous>
  -- ASYNC --
    at Frame.<anonymous> (C:\temp\puppeterr1\node_modules\puppeteer\lib\helper.js:144:27)
    at Page.waitForNavigation (C:\temp\puppeterr1\node_modules\puppeteer\lib\Page.js:599:49)
    at Page.<anonymous> (C:\temp\puppeterr1\node_modules\puppeteer\lib\helper.js:145:23)
    at Object.<anonymous> (C:\temp\puppeterr1\app\server2.js:127:30)
    at step (C:\temp\puppeterr1\app\server2.js:32:23)
    at Object.next (C:\temp\puppeterr1\app\server2.js:13:53)
    at fulfilled (C:\temp\puppeterr1\app\server2.js:4:58)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
(node:17200) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:17200) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
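
The last two warnings in the log appear because the `TimeoutError` from `page.waitForNavigation()` rejects inside the async function and nothing catches it. As a rough sketch (it handles the symptom, not the cause), the click can be wrapped so that a missing navigation is reported instead of crashing the run. The helper name `clickAndWait`, the `ClickResult` shape, and the 10-second default are assumptions for illustration; `timeout` is the existing option of `waitForNavigation`. The interfaces are stand-ins describing only the calls used here.

```typescript
// Stand-in types (assumptions): only the calls this sketch makes.
interface ClickableHandle {
    click(options?: { delay?: number }): Promise<void>;
}

interface NavigablePage {
    waitForNavigation(options?: { timeout?: number }): Promise<unknown>;
}

interface ClickResult {
    navigated: boolean;
    error?: string;
}

// Hypothetical helper: click and wait for a navigation, but turn a
// rejection (such as the navigation timeout) into a return value.
async function clickAndWait(
    page: NavigablePage,
    handle: ClickableHandle,
    timeoutMs: number = 10000
): Promise<ClickResult> {
    try {
        await Promise.all([
            page.waitForNavigation({ timeout: timeoutMs }),
            handle.click({ delay: 100 })
        ]);
        return { navigated: true };
    } catch (err) {
        // Note: this also catches errors thrown by the click itself.
        const message = err instanceof Error ? err.message : String(err);
        return { navigated: false, error: message };
    }
}
```

In the loop from the question, `const result = await clickAndWait(page, sublink)` would then let the code log `result.error` and continue, which makes it easier to see whether the click ever caused a navigation at all.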

Question:

What am I missing here?
Why does it never reach ██BBB?

Complete code

1 answer:

Answer 0 (score: 0)

Update

https://github.com/GoogleChrome/puppeteer/issues/3535

Original answer:

Update: I managed to solve it, but not in the regular way I wanted.

There seems to be a problem with ElementHandle, which is why I moved to pure DOM objects.

I am still interested in a more intuitive solution, rather than dealing with ElementHandle.

Anyway, here is my solution:

ElementHandle
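
The code of that solution is not shown here, so the following is not the author's code. It is only a minimal sketch of what clicking through "pure DOM objects" could look like: instead of `ElementHandle.click()`, which scrolls the element into view and simulates a mouse click at its center, the DOM `click()` method is called inside the page via `page.$$eval`. The helper name `domClickNth` and the two interfaces are assumptions for illustration; the interfaces describe only the calls the sketch makes, and with Puppeteer installed one would use its own `Page` type instead.

```typescript
// Stand-in types (assumptions): only the calls this sketch makes.
interface DomClickable {
    click(): void;
}

interface EvalPage {
    $$eval<R>(
        selector: string,
        pageFunction: (elements: DomClickable[], ...args: number[]) => R,
        ...args: number[]
    ): Promise<R>;
    waitForNavigation(): Promise<unknown>;
}

// Hypothetical helper: click the index-th element matching `selector`
// with the DOM click() method and wait for the resulting navigation.
// Returns false when there is no such element.
async function domClickNth(
    page: EvalPage,
    selector: string,
    index: number
): Promise<boolean> {
    // The callbacks run inside the page, so they cannot close over
    // variables from this scope; `index` is passed as an argument.
    const count = await page.$$eval(selector, elements => elements.length);
    if (index < 0 || index >= count) return false;

    await Promise.all([
        page.waitForNavigation(),
        page.$$eval(selector, (elements, i) => { elements[i]?.click(); }, index)
    ]);
    return true;
}
```

For the sub links in the question this would be called as `await domClickNth(page, '.slide > a', 0)`. Because the DOM `click()` does not depend on the element being visible or unobstructed, it can behave differently from a simulated mouse click; whether that is what happens on this particular site is not established here.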