How do I open multiple pages asynchronously with Playwright Python?

Asked: 2020-11-03 14:08:29

Tags: python web-scraping webautomation playwright playwright-python

I want to open multiple URLs at once using Playwright for Python, but I'm struggling to find a way to do it. This is from the async documentation:

async def main():
    async with async_playwright() as p:
        for browser_type in [p.chromium, p.firefox, p.webkit]:
            browser = await browser_type.launch()
            page = await browser.newPage()
            await page.goto("https://scrapingant.com/")
            await page.screenshot(path=f"scrapingant-{browser_type.name}.png")
            await browser.close()

asyncio.get_event_loop().run_until_complete(main())

This opens each browser_type one after another. What if I want to run them in parallel? And what if I want to do something similar for a list of URLs?

I tried this:

urls = [
    "https://scrapethissite.com/pages/ajax-javascript/#2015",
    "https://scrapethissite.com/pages/ajax-javascript/#2014",
]
async def main(url):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.newPage()
        await page.goto(url)
        await browser.close()

async def go_to_url():
    tasks = [main(url) for url in urls]
    await asyncio.wait(tasks)

go_to_url()

But this gives me the following error:

92: RuntimeWarning: coroutine 'go_to_url' was never awaited
  go_to_url()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
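
For context, this warning usually means that a coroutine function was called like a regular function: the call only creates a coroutine object, and nothing runs until an event loop executes it. The sketch below reproduces that behaviour with the standard library alone. The visit() helper and the short URL strings are hypothetical stand-ins for the Playwright work, not part of the original code.

```python
import asyncio


async def visit(url):
    # Hypothetical stand-in for the Playwright work; it only simulates a short wait.
    await asyncio.sleep(0.01)
    return f"visited {url}"


async def go_to_url(urls):
    # gather() accepts coroutines directly and returns results in input order.
    return await asyncio.gather(*(visit(url) for url in urls))


# Calling a coroutine function only creates a coroutine object; nothing runs yet.
pending = go_to_url(["a", "b"])
print(type(pending).__name__)  # coroutine

# Handing the object to an event loop is what actually executes it.
print(asyncio.run(pending))  # ['visited a', 'visited b']
```

If the pending object were simply dropped instead of being passed to asyncio.run(), Python would report that the coroutine was never awaited, as in the output above.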

1 answer:

Answer 0 (score: 1)

I believe you need to call your go_to_url function using the same recipe:

asyncio.get_event_loop().run_until_complete(go_to_url())
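
As a possible follow-up, here is a minimal sketch of one way to open several URLs concurrently, assuming a recent Playwright release where the page factory is spelled new_page() (the 2020-era snippets above use newPage()). The names fetch_title and open_all are hypothetical, and sharing a single Chromium instance across all pages is a design choice of this sketch rather than something the original answer prescribes. It uses asyncio.gather() instead of asyncio.wait(), because newer Python versions no longer accept bare coroutines in asyncio.wait().

```python
import asyncio

URLS = [
    "https://scrapethissite.com/pages/ajax-javascript/#2015",
    "https://scrapethissite.com/pages/ajax-javascript/#2014",
]


async def fetch_title(browser, url):
    """Open one page in the shared browser, load the URL, and return its title."""
    page = await browser.new_page()  # spelled newPage() in the 2020-era releases
    try:
        await page.goto(url)
        return await page.title()
    finally:
        # Close the page even if navigation fails.
        await page.close()


async def open_all(urls):
    # Imported lazily so this module can be loaded without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            # gather() schedules every coroutine at once and keeps input order.
            return await asyncio.gather(*(fetch_title(browser, url) for url in urls))
        finally:
            await browser.close()


# To run against the real sites: titles = asyncio.run(open_all(URLS))
```

On Python 3.7 and later, asyncio.run(open_all(URLS)) plays the same role as the get_event_loop().run_until_complete(...) recipe above. Launching one browser and opening a page per URL is generally cheaper than starting a separate browser for every URL, though a separate browser per task also works.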