I need to make API requests for several pieces of data and then process each result. The requests are paginated, so at the moment I am doing
def get_results():
    while True:
        response = api(num_results=5)
        if response is None:  # No more results
            break
        yield response

def process_data():
    for page in get_results():
        for result in page:
            do_stuff(result)

process_data()
Instead of waiting for the results, processing them, and then waiting again, I would like to use asyncio to retrieve the next page of results from the API while the current page is being processed. I have modified the code to
import asyncio

async def get_results():
    while True:
        response = api(num_results=5)
        if response is None:  # No more results
            break
        yield response

async def process_data():
    async for page in get_results():
        for result in page:
            do_stuff(result)

asyncio.run(process_data())
I am not sure whether this is doing what I expect. Is it the correct way to process the current page of API results while fetching the next page of results asynchronously?
Answer 0 (score: 0)
Maybe you could refactor your code into a producer/consumer pattern with an asyncio.Queue:
import asyncio
import random

q = asyncio.Queue()

async def api(num_results):
    # You could use aiohttp to fetch the API here.
    # Fake content: simulate network latency and a random "no more results" reply.
    await asyncio.sleep(1)
    fake_response = random.random()
    if fake_response < 0.1:
        return None
    return fake_response

async def get_results(q):
    while True:
        response = await api(num_results=5)
        if response is None:
            # Indicate that the producer is done.
            print('Producer Done')
            await q.put(None)
            break
        print('Producer: ', response)
        await q.put(response)

async def process_data(q):
    while True:
        data = await q.get()
        if data is None:
            print('Consumer Done')
            break
        # Process data however you want; if it is CPU intensive,
        # you can call loop.run_in_executor.
        # Fake the processing taking a little time.
        await asyncio.sleep(3)
        print('Consume', data)

loop = asyncio.get_event_loop()
loop.create_task(get_results(q))
loop.run_until_complete(process_data(q))
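The api() above only simulates latency with asyncio.sleep. As the comment says, a real implementation could use aiohttp; here is a minimal sketch, where the URL, the query parameters, and the JSON shape are all hypothetical and need to be adapted to your real API:

import aiohttp

async def api(session, num_results, page):
    # Hypothetical endpoint and parameters; adjust to your real API.
    async with session.get(
        'https://example.com/api/results',
        params={'per_page': num_results, 'page': page},
    ) as resp:
        resp.raise_for_status()
        payload = await resp.json()
        return payload or None  # treat an empty page as "no more results"

# Usage inside the producer (sketch):
#     async with aiohttp.ClientSession() as session:
#         response = await api(session, num_results=5, page=page)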
Back to the question:

    Is this the correct way to process the current page of API results and fetch the next page of results asynchronously?

It is not the correct way, because do_stuff(result) is only run each time get_results() has finished producing a page; fetching a page and processing it still alternate instead of overlapping.
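One more note: the producer/consumer version only overlaps fetching and processing if do_stuff never blocks the event loop. If do_stuff is a CPU-intensive, plain synchronous function (as the comment in process_data hints), a hedged sketch of offloading it with run_in_executor could look like this; do_stuff and its cost are assumptions here:

import asyncio
from concurrent.futures import ProcessPoolExecutor

def do_stuff(result):
    # Placeholder for CPU-heavy, synchronous work on one result.
    return result

async def process_data(q):
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        while True:
            data = await q.get()
            if data is None:
                break
            # Run the blocking work in a separate process so the
            # producer can keep fetching pages in the meantime.
            await loop.run_in_executor(pool, do_stuff, data)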