Question

I am downloading JSON documents from an API using the asyncio module. The crux of my question concerns the following event loop setup:

loop = asyncio.get_event_loop()
main_task = asyncio.ensure_future( klass.download_all() )
loop.run_until_complete( main_task )

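and download_all(), implemented as the instance method below. The class already has its downloader objects created and available, so the method calls each one's download method: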

async def download_all(self):
    """ Builds the coroutines, uses asyncio.wait, then sifts for those still pending, loops """
    ret = []
    async with aiohttp.ClientSession() as session:
        pending = []

        for downloader in self._downloaders:
            pending.append( asyncio.ensure_future( downloader.download(session) ) )

        while pending:
            dne, pnding = await asyncio.wait(pending)
            ret.extend( [d.result() for d in dne] )

            # Get all the tasks, cannot use "pnding"
            # (asyncio.Task.all_tasks() was removed in Python 3.9; use asyncio.all_tasks())
            tasks = asyncio.all_tasks()
            pending = [tks for tks in tasks if not tks.done()]
            # Exclude the task running download_all itself, which hasn't ended yet
            pending = [t for t in pending if t is not asyncio.current_task()]

    return ret

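Why is it that, in the downloaders' download methods, choosing asyncio.ensure_future instead of the await syntax makes everything run much faster, and, judging from my logs, more "asynchronously"?

This works because I detect all the tasks that are still pending and do not let download_all finish; it keeps calling asyncio.wait until nothing is left.

I thought the await keyword let the event loop do its job and share resources efficiently. Why is doing it this way faster? Is there anything wrong with it? For example: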

async def download(self, session):
    async with session.request(self.method, self.url, params=self.params) as response:
        response_json = await response.json()

    # Not using await here, as I am "supposed" to
    asyncio.ensure_future( self.write(response_json, self.path) ) 
    return response_json

async def write(self, res_json, path):
    # using aiofiles to write, but it doesn't (seem to?) support direct json
    # so converting to raw text first
    txt_contents = json.dumps(res_json, **self.json_dumps_kwargs)
    async with aiofiles.open(path, 'w') as f:
        await f.write(txt_contents)

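With the full implementation and the real API, I was able to download 44 resources in 34 seconds using ensure_future, but with await it took more than three minutes (I actually gave up because it was taking so long).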

Answer 1

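When you use await inside each iteration of a for loop, the loop waits for that iteration's download to finish before moving on to the next one.
When you use ensure_future, on the other hand, it does not wait: it creates tasks to download all the files and then awaits all of them in a second loop.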
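
To make the difference concrete, here is a minimal sketch rather than the asker's actual code. It assumes asyncio.sleep as a stand-in for a real download, and the names fake_download, sequential, concurrent and timed are hypothetical, introduced only for illustration.

```python
import asyncio
import time


async def fake_download(delay):
    # Hypothetical stand-in for downloader.download(session):
    # it sleeps instead of performing real network I/O.
    await asyncio.sleep(delay)
    return delay


async def sequential(delays):
    # Awaiting inside the loop: each download must finish
    # before the next one is even started.
    return [await fake_download(d) for d in delays]


async def concurrent(delays):
    # Schedule every download first, then wait for all of them together.
    tasks = [asyncio.ensure_future(fake_download(d)) for d in delays]
    return await asyncio.gather(*tasks)


def timed(coro_func, delays):
    # Run one strategy in a fresh event loop and measure wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_func(delays))
    return results, time.perf_counter() - start


delays = [0.1, 0.1, 0.1]
seq_results, seq_elapsed = timed(sequential, delays)
con_results, con_elapsed = timed(concurrent, delays)
print(f"sequential: {seq_elapsed:.2f}s, concurrent: {con_elapsed:.2f}s")
```

In this sketch the sequential version takes roughly the sum of the delays, while the concurrent version takes roughly the longest single delay. Keeping the task objects in a list and passing them to asyncio.gather also avoids having to scan every task in the loop to find the ones still running.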
