Why is using asyncio.ensure_future for long-running work, instead of await, so much faster?

Asked: 2016-12-23 15:02:45

Tags: python-asyncio

I am downloading JSON documents from an API using the asyncio module. The crux of my question concerns the following event loop setup:

loop = asyncio.get_event_loop()
main_task = asyncio.ensure_future( klass.download_all() )
loop.run_until_complete( main_task )

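download_all() is an instance method of a class that already has a list of downloader objects created and available to it, and it calls each one's download method: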

async def download_all(self):
    """ Builds the coroutines, uses asyncio.wait, then sifts for those still pending, loops """
    ret = []
    async with aiohttp.ClientSession() as session:
        pending = []

        for downloader in self._downloaders:
            pending.append( asyncio.ensure_future( downloader.download(session) ) )

        while pending:
            dne, pnding = await asyncio.wait(pending)
            ret.extend( [d.result() for d in dne] )

            # Get all the tasks, cannot use "pnding"
            tasks = asyncio.Task.all_tasks()
            pending = [tks for tks in tasks if not tks.done()]
            # Exclude the one that we know hasn't ended yet (UGLY)
            pending = [t for t in pending if not t._coro.__name__ == self.download_all.__name__]

    return ret

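Why is it that, in the downloaders' download method, things run much faster when I use asyncio.ensure_future instead of the await syntax, and look more "asynchronous" as far as I can tell from the logs?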

This works because I have set things up to detect every task that is still pending, so download_all does not finish and keeps calling asyncio.wait.

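I thought the await keyword was what let the event loop do its job and share resources efficiently. How come doing it this way is faster? Is there anything wrong with it? For example: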

async def download(self, session):
    async with session.request(self.method, self.url, params=self.params) as response:
        response_json = await response.json()

    # Not using await here, as I am "supposed" to
    asyncio.ensure_future( self.write(response_json, self.path) ) 
    return response_json

async def write(self, res_json, path):
    # using aiofiles to write, but it doesn't (seem to?) support direct json
    # so converting to raw text first
    txt_contents = json.dumps(res_json, **self.json_dumps_kwargs)
    async with aiofiles.open(path, 'w') as f:
        await f.write(txt_contents)

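With the full implementation and a real API, I was able to download 44 resources in 34 seconds, but with await it took more than three minutes (I actually gave up because it was taking so long).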

1 Answer:

Answer 0 (score: 0)

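When you await inside each iteration of the for loop, every iteration waits for its download to finish before the next one starts.
When you use ensure_future, on the other hand, it does not wait: it creates tasks to download all the files, and then all of them are awaited together in the second loop.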
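
The difference described above can be seen in a minimal sketch that makes no real network calls. The names fake_download, sequential, concurrent and timed are hypothetical and do not come from the question's code; asyncio.sleep stands in for the HTTP request, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio
import time


async def fake_download(i, delay=0.1):
    # Hypothetical stand-in for a network request; asyncio.sleep
    # yields control to the event loop, as real I/O would.
    await asyncio.sleep(delay)
    return i


async def sequential(n):
    # Awaiting inside the loop: each download must finish
    # before the next one is even started.
    results = []
    for i in range(n):
        results.append(await fake_download(i))
    return results


async def concurrent(n):
    # The first step only schedules the tasks; nothing is awaited yet.
    tasks = [asyncio.ensure_future(fake_download(i)) for i in range(n)]
    # A single await then waits for all of them together.
    return await asyncio.gather(*tasks)


def timed(coro_func, n):
    # Run a coroutine function to completion and measure wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_func(n))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    seq_results, seq_time = timed(sequential, 5)
    con_results, con_time = timed(concurrent, 5)
    print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

In this sketch the sequential version takes roughly n times the delay, while the concurrent version takes roughly one delay, because all the sleeps overlap. Keeping references to the scheduled tasks and awaiting them with asyncio.gather also avoids having to scan every task in the loop to find out what is still pending.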