Pyppeteer / Puppeteer NetworkError: Execution context was destroyed, most likely because of a navigation

Date: 2019-09-05 22:32:00

Tags: python puppeteer pyppeteer

I'm using puppeteer for some light crawling, about 2K pages, but I keep seeing this error come back:

  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 106, in evaluateHandle
    'userGesture': True,
pyppeteer.errors.NetworkError: Protocol error (Runtime.callFunctionOn): Cannot find context with specified id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...
  File "/user_code/main.py", line 434, in main_program
    crawl_data = asyncio.get_event_loop().run_until_complete(crawl(browser, url))
  File "/opt/python3.7/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
    return future.result()
  File "/user_code/main.py", line 394, in crawl
    title = await page.title()
  File "/env/local/lib/python3.7/site-packages/pyppeteer/page.py", line 1437, in title
    return await frame.title()
  File "/env/local/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 752, in title
    return await self.evaluate('() => document.title')
  File "/env/local/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 295, in evaluate
    pageFunction, *args, force_expr=force_expr)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 55, in evaluate
    pageFunction, *args, force_expr=force_expr)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 109, in evaluateHandle
    _rewriteError(e)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 238, in _rewriteError
    raise type(error)(msg)
pyppeteer.errors.NetworkError: Execution context was destroyed, most likely because of a navigation.

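I don't understand how this triggers an error related to frame.title(), because my code only looks up the title of the actual page, not anything inside its frames.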

Also, the page title is requested before the code goes through the content of any of the frames:

    try:
        # max timeout of 12 seconds (the value is in milliseconds)
        response = await page.goto(
            url,
            {'timeout': 12000}
        )
        if response.status != 200:
            await page.close()
            return(False)
    except TimeoutError:
        return(False)
    except Exception as e:
        print(e)
        return(False)

    # had this in before, but it was causing too many timeouts.  Error still persists
    #await page.waitForNavigation();

    try:
        source_code = await page.content()
    except:
        return(False)

    # title
    title = await page.title()
    title = title[:1000]

    # get all the frames    
    frames = page.frames
    content = ""
    for frame in frames:
        content_new = await frame.content()
        content += content_new

    await page.close()

What could be causing this recurring error?
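
The message itself points at a navigation, so one guess is that some pages redirect or reload after goto() has already returned, which would destroy the context that page.title() evaluates in. That is only a guess, not a confirmed cause. One way to check it would be to retry the failing call after a short pause. Below is a minimal sketch of that idea; retry_async is a hypothetical helper written for illustration (it is not part of pyppeteer), and the attempt count and delay are arbitrary assumptions.

```python
import asyncio


async def retry_async(make_call, retry_on, attempts=3, delay=0.5):
    """Await make_call() and retry when it raises one of `retry_on`.

    make_call: zero-argument callable returning a fresh awaitable each time.
    retry_on:  exception class (or tuple of classes) considered retryable.
    This is a hypothetical helper, not a pyppeteer API.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # Give a possibly pending navigation a moment to finish.
            await asyncio.sleep(delay)


# Hypothetical usage inside crawl(), assuming pyppeteer is installed:
#
#   from pyppeteer.errors import NetworkError
#   title = await retry_async(page.title, NetworkError)
```

If the retries succeed where the single call failed, that would support the late-navigation guess; if they fail every time, the cause is probably something else.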

0 answers:

No answers yet