Puppeteer 1.16.0 cannot navigate to https://google.ca?

Asked: 2019-05-20 02:39:22

Tags: node.js npm puppeteer

After I ran npm audit fix to fix vulnerabilities, Puppeteer can no longer navigate to anything, not even Google.

npm list shows that I am using 1.16.0.

In a script named invoice_to_pdf.js:

  2 const puppeteer = require('puppeteer');
...
 18 const headers = new Map();
...
 31 console.log("Starting " + new Date());
 32 
 33 
 34 (async () => {
 35   const browser = await puppeteer.launch();
 36   const page = await browser.newPage();
 37   await page.setExtraHTTPHeaders(headers);
 38   page.setDefaultNavigationTimeout(50000)
 39 
 40   process.on("unhandledRejection", (reason, p) => {
 41             console.error("Unhandled Rejection at: Promise", p, "reason:", reason);
 42             browser.close();
 43             process.exit(1)
 44   });
 45 
 46   url = 'https://google.ca'
 47   console.log(`For testing, navigating to ${url}`);
 48   await page.goto(url);
 49   console.log(`Waiting for naviation to ${url}`);
 50   await page.waitForNavigation({waitUntil: 'load'});
 51   console.log(`Arrived at ${url}`);
 52 

The output is:

Starting Sun May 19 2019 22:23:18 GMT-0400 (EDT)
For testing, navigating to https://google.ca
Waiting for naviation to https://google.ca
Unhandled Rejection at: Promise Promise {
  <rejected> { TimeoutError: Navigation Timeout Exceeded: 50000ms exceeded
    at Promise.then (/home/jlam/code/sge/node_modules/puppeteer/lib/LifecycleWatcher.js:142:21)
    at <anonymous>
  -- ASYNC --
    at Frame.<anonymous> (/home/jlam/code/sge/node_modules/puppeteer/lib/helper.js:110:27)
    at Page.waitForNavigation (/home/jlam/code/sge/node_modules/puppeteer/lib/Page.js:649:49)
    at Page.<anonymous> (/home/jlam/code/sge/node_modules/puppeteer/lib/helper.js:111:23)
    at /home/jlam/code/sge/scripts/invoice_to_pdf.js:50:14
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:189:7) name: 'TimeoutError' } } reason: { TimeoutError: Navigation Timeout Exceeded: 50000ms exceeded
    at Promise.then (/home/jlam/code/sge/node_modules/puppeteer/lib/LifecycleWatcher.js:142:21)
    at <anonymous>
  -- ASYNC --
    at Frame.<anonymous> (/home/jlam/code/sge/node_modules/puppeteer/lib/helper.js:110:27)
    at Page.waitForNavigation (/home/jlam/code/sge/node_modules/puppeteer/lib/Page.js:649:49)
    at Page.<anonymous> (/home/jlam/code/sge/node_modules/puppeteer/lib/helper.js:111:23)
    at /home/jlam/code/sge/scripts/invoice_to_pdf.js:50:14
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:189:7) name: 'TimeoutError' }

Running npm update or npm upgrade gives the same result. Using await page.waitForNavigation({waitUntil: 'domcontentloaded'}); or await page.waitForNavigation({waitUntil: 'networkidle2'}) also gives the same result.

npm list reports the following:

lusk 22:27:59 $ npm list
...@1.0.0 /home/jlam/code/...
+-+ cli-debugger@0.0.2
| +-- v8-debugger@0.0.3
+-- json-query@2.2.2
+-+ mkdirp@0.5.1
| +-- minimist@0.0.8
+-+ puppeteer@1.16.0
  +-+ debug@4.1.1
  | +-- ms@2.1.1
  +-+ extract-zip@1.6.7
  | +-+ concat-stream@1.6.2
  | | +-- buffer-from@1.1.1
  | | +-- inherits@2.0.3
  | | +-+ readable-stream@2.3.6
  | | | +-- core-util-is@1.0.2
  | | | +-- inherits@2.0.3 deduped
  | | | +-- isarray@1.0.0
  | | | +-- process-nextick-args@2.0.0
  | | | +-- safe-buffer@5.1.2
  | | | +-+ string_decoder@1.1.1
  | | | | +-- safe-buffer@5.1.2 deduped
  | | | +-- util-deprecate@1.0.2
  | | +-- typedarray@0.0.6
  | +-+ debug@2.6.9
  | | +-- ms@2.0.0
  | +-- mkdirp@0.5.1 deduped
  | +-+ yauzl@2.4.1
  |   +-+ fd-slicer@1.0.1
  |     +-- pend@1.2.0
  +-+ https-proxy-agent@2.2.1
  | +-+ agent-base@4.2.1
  | | +-+ es6-promisify@5.0.0
  | |   +-- es6-promise@4.2.6
  | +-+ debug@3.2.6
  |   +-- ms@2.1.1 deduped
  +-- mime@2.4.3
  +-- progress@2.0.3
  +-- proxy-from-env@1.0.0
  +-+ rimraf@2.6.3
  | +-+ glob@7.1.4
  |   +-- fs.realpath@1.0.0
  |   +-+ inflight@1.0.6
  |   | +-- once@1.4.0 deduped
  |   | +-- wrappy@1.0.2
  |   +-- inherits@2.0.3 deduped
  |   +-+ minimatch@3.0.4
  |   | +-+ brace-expansion@1.1.11
  |   |   +-- balanced-match@1.0.0
  |   |   +-- concat-map@0.0.1
  |   +-+ once@1.4.0
  |   | +-- wrappy@1.0.2 deduped
  |   +-- path-is-absolute@1.0.1
  +-+ ws@6.2.1
    +-- async-limiter@1.0.0

Google loads fine from the same machine:

lusk 22:30:12 $ lynx --dump google.ca

   Search [1]Images [2]Maps [3]Play [4]YouTube [5]News [6]Gmail [7]Drive
   [8]More »
   [9]Web History | [10]Settings | [11]Sign in

   Google

     _______________________________________________________
   Google Search  I'm Feeling Lucky    [12]Advanced search
      [13]Language tools

   Google offered in: [14]Français

   [15]Advertising Programs     [16]Business Solutions     [17]About
   Google     [18]Google.com

                      © 2019 - [19]Privacy - [20]Terms

2 answers:

Answer 0 (score: 0)

You can try:

await page.waitForNavigation({waitUntil: 'networkidle2'})

instead of

await page.waitForNavigation({waitUntil: 'load'})

Answer 1 (score: 0)

In this code:

 48   await page.goto(url);
 49   console.log(`Waiting for naviation to ${url}`);
 50   await page.waitForNavigation({waitUntil: 'load'});

...the waitForNavigation call is not needed:

  url = 'https://google.ca'
  console.log(`For testing, navigating to ${url}`);
  await page.goto(url);
  console.log(`Arrived at ${url}`);

Output:

For testing, navigating to https://google.ca
Arrived at https://google.ca

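The await on page.goto already does the work of waiting for the navigation to finish. By the time page.goto resolves, the page has loaded, so the waitForNavigation that follows waits for a second navigation that never starts and times out after 50000 ms.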
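As a minimal sketch, assuming the script only needs the page to reach a given lifecycle event before continuing: the helper name gotoAndReport is hypothetical and not part of the original script, while page.goto and its waitUntil and timeout options are part of the Puppeteer API.

```javascript
// Hypothetical helper (not from the original script): navigate and report
// the HTTP status. `page` is a Puppeteer Page obtained from browser.newPage().
async function gotoAndReport(page, url, waitUntil = 'load') {
  // page.goto resolves only after the `waitUntil` event has fired,
  // so no separate page.waitForNavigation call is needed afterwards.
  const response = await page.goto(url, { waitUntil, timeout: 50000 });
  // The response may be null, e.g. when navigating to about:blank.
  const status = response ? response.status() : null;
  console.log(`Arrived at ${url} (status: ${status})`);
  return status;
}
```

Inside the async block of the script this would be called as await gotoAndReport(page, 'https://google.ca'); passing 'networkidle2' as the third argument waits for network activity to settle instead of the load event.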

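With page.click, on the other hand, waitForNavigation is required. I have also seen references (1, 2, 3) stating that the waitForNavigation promise must be created before page.click is called: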

  // Start waiting before the click so the navigation it triggers is not missed
  const logInAwait = page.waitForNavigation({waitUntil: ['networkidle0', 'load', 'domcontentloaded']});
  await page.click('[name="commit"]')
  console.log("logging in....");

...
  await logInAwait
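The same ordering can also be written with Promise.all, which is the pattern the Puppeteer documentation for page.click recommends when a click triggers a navigation. The sketch below is only an illustration: the helper name clickAndWaitForNavigation is hypothetical, and it assumes the click always causes a navigation (otherwise the wait would time out).

```javascript
// Hypothetical helper: click an element that triggers a navigation and
// wait for that navigation to finish. `page` is a Puppeteer Page.
async function clickAndWaitForNavigation(page, selector, waitUntil = 'load') {
  // Array elements are evaluated left to right, so the waitForNavigation
  // promise is created before the click is issued and cannot miss the
  // navigation that the click starts.
  const [response] = await Promise.all([
    page.waitForNavigation({ waitUntil }),
    page.click(selector),
  ]);
  return response;
}
```

For the login form above, the call would be await clickAndWaitForNavigation(page, '[name="commit"]');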

Feedback is very welcome, as I do not consider myself a JS expert.