Multiple asynchronous requests at the same time

Date: 2018-10-27 11:29:20

Tags: python python-3.x python-requests python-asyncio

I am trying to make 300 API calls at the same time, so that I get all the results back within a couple of seconds at most.

My pseudocode looks like this:

def function_1():
    colors = ['yellow', 'green', 'blue', + ~300 other ones]
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    res = loop.run_until_complete(get_color_info(colors))

async def get_color_info(colors):
    loop = asyncio.get_event_loop()
    responses = []
    for color in colors:
        print("getting color")
        url = "https://api.com/{}/".format(color)
        data = loop.run_in_executor(None, requests.get, url)
        r = await data
        responses.append(r.json())
    return responses

With this, "getting color" gets printed once per second, and the whole thing takes a long time, so I'm fairly sure the calls are not running simultaneously. What am I doing wrong?

1 Answer:

Answer 0 (score: 8)

aiohttp with native coroutines (async/await)

Here is a typical pattern that accomplishes what you're trying to do. (Python 3.7+.)

One major change is to move away from requests, which is built for synchronous IO, to a package such as aiohttp that is built specifically for async/await (native coroutines):

import asyncio
import aiohttp  # pip install aiohttp aiodns


async def get(
    session: aiohttp.ClientSession,
    color: str,
    **kwargs
) -> dict:
    url = f"https://api.com/{color}/"
    print(f"Requesting {url}")
    resp = await session.request('GET', url=url, **kwargs)
    # Note that this may raise an exception for non-2xx responses
    # You can either handle that here, or pass the exception through
    data = await resp.json()
    print(f"Received data for {url}")
    return data


async def main(colors, **kwargs):
    # Asynchronous context manager.  Prefer this rather
    # than using a different session for each GET request
    async with aiohttp.ClientSession() as session:
        tasks = []
        for c in colors:
            tasks.append(get(session=session, color=c, **kwargs))
        # asyncio.gather() will wait on the entire task set to be
        # completed.  If you want to process results greedily as they come in,
        # loop over asyncio.as_completed()
        htmls = await asyncio.gather(*tasks, return_exceptions=True)
        return htmls


if __name__ == '__main__':
    colors = ['red', 'blue', 'green']  # ...
    # Either take colors from stdin or make some default here
    asyncio.run(main(colors))  # Python 3.7+

There are two distinct elements here: one is the asynchronous aspect of the coroutines, and the other is the concurrency introduced when you specify a container of tasks (futures):

  • You create one coroutine, get, which uses await with two awaitables: the first being .request and the second being .json. This is the asynchronous aspect. The purpose of awaiting these IO-bound responses is to tell the event loop that other get() calls can take turns running through that same routine.
  • The concurrency aspect is encapsulated in await asyncio.gather(*tasks). This maps the awaitable get() call onto each of your colors. The result is an aggregated list of returned values. Note that this wrapper will wait until all of your responses come in and .json() is called on each. If, alternatively, you want to process results greedily as they become ready, you can loop over asyncio.as_completed: each Future object returned represents the earliest result from the set of remaining awaitables.
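To illustrate the greedy alternative, here is a minimal sketch of looping over asyncio.as_completed. The fetch coroutine and its return values are made up for illustration; asyncio.sleep stands in for the real aiohttp request:

```python
import asyncio


async def fetch(color: str) -> str:
    # Simulated request: asyncio.sleep stands in for an aiohttp call.
    await asyncio.sleep(0.01)
    return f"data for {color}"


async def main(colors):
    tasks = [asyncio.create_task(fetch(c)) for c in colors]
    results = []
    for fut in asyncio.as_completed(tasks):
        # as_completed yields awaitables in completion order,
        # so each result can be processed as soon as it is ready.
        results.append(await fut)
    return results


if __name__ == '__main__':
    print(asyncio.run(main(['red', 'blue', 'green'])))
```

Note that, unlike asyncio.gather, the results here arrive in completion order, not in the order of the input list.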

Lastly, note that asyncio.run() is a high-level "porcelain" function introduced in Python 3.7. In earlier versions, you can mimic it roughly like this:

# The "full" version makes a new event loop and calls
# loop.shutdown_asyncgens(); see link above
loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(main(colors))
finally:
    loop.close()

Throttling requests

There are multiple ways to limit the rate of concurrency. See, for example, asyncio.semaphore in async-await function or large numbers of tasks with limited concurrency.
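As a rough sketch of the semaphore approach: acquiring an asyncio.Semaphore around the request caps how many coroutines are inside the critical section at once. The cap of 10 and the simulated fetch below are assumptions; a real version would await an aiohttp request inside the async with block:

```python
import asyncio

MAX_CONCURRENT = 10  # assumed cap; tune to what the API tolerates


async def fetch(color: str, sem: asyncio.Semaphore) -> str:
    async with sem:
        # At most MAX_CONCURRENT bodies of this block run at once;
        # asyncio.sleep stands in for the real aiohttp request.
        await asyncio.sleep(0.01)
        return color


async def main(colors):
    # Create the semaphore inside the running loop.
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(*(fetch(c, sem) for c in colors))


if __name__ == '__main__':
    print(asyncio.run(main([f"color-{i}" for i in range(30)])))
```

Because asyncio.gather preserves input order, the results still line up with the colors you passed in, even though the underlying requests complete in an arbitrary order.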