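I have a function download_all that iterates through a hardcoded list of pages and downloads them all in sequence. But what should I do if I want to add pages to the list dynamically, based on the results of a page? For example: download the first page, parse it, and based on the results add other pages to the event loop.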
import asyncio

@asyncio.coroutine
def download_all():
    first_page = 1
    last_page = 100
    # download() is a coroutine defined elsewhere (not shown here)
    download_list = [download(page_number) for page_number in range(first_page, last_page)]
    gen = asyncio.wait(download_list)
    return gen

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    futures = loop.run_until_complete(download_all())
Answer 0 (score: 8)
One way to accomplish this is to use a Queue.
#!/usr/bin/python3
import asyncio

try:
    # python 3.4
    from asyncio import JoinableQueue as Queue
except ImportError:
    # python 3.5
    from asyncio import Queue

@asyncio.coroutine
def do_work(task_name, work_queue):
    while not work_queue.empty():
        queue_item = work_queue.get_nowait()
        # simulate condition where task is added dynamically
        if queue_item % 2 != 0:
            work_queue.put_nowait(2)
            print('Added additional item to queue')
        print('{0} got item: {1}'.format(task_name, queue_item))
        yield from asyncio.sleep(queue_item)
        print('{0} finished processing item: {1}'.format(task_name, queue_item))

if __name__ == '__main__':
    queue = Queue()
    # Load initial jobs into queue
    for x in range(1, 6):
        queue.put_nowait(x)
    # use 3 workers to consume tasks
    taskers = [
        do_work('task1', queue),
        do_work('task2', queue),
        do_work('task3', queue)
    ]
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(taskers))
    loop.close()
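By using a Queue from asyncio, you keep the "units" of work separate from the tasks/futures that are initially given to asyncio's event loop. Basically, this allows extra "units" of work to be added when some condition is met.
Note that in the example above even-numbered items are terminal, so no additional item is added for them. This eventually results in all items being processed, but in your case you could easily use another condition to decide whether another item is added to the queue.
Output: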
Added additional item to queue
task2 got item: 1
task1 got item: 2
Added additional item to queue
task3 got item: 3
task2 finished processing item: 1
task2 got item: 4
task1 finished processing item: 2
Added additional item to queue
task1 got item: 5
task3 finished processing item: 3
task3 got item: 2
task3 finished processing item: 2
task3 got item: 2
task2 finished processing item: 4
task2 got item: 2
task1 finished processing item: 5
task3 finished processing item: 2
task2 finished processing item: 2
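For newer Python versions, where @asyncio.coroutine is no longer available, the same idea can be sketched with async/await. This is only a minimal sketch under a few assumptions: Python 3.7+, sleep times scaled down so it runs quickly, and the names worker, run_all and log are made up for illustration. It also waits on queue.join() instead of checking queue.empty(), so a worker does not exit while another worker might still add items.

```python
import asyncio


async def worker(name, queue, log):
    """Process items forever; the caller cancels us once the queue is drained."""
    while True:
        item = await queue.get()
        try:
            # Same illustrative rule as above: odd items enqueue one extra item.
            if item % 2 != 0:
                queue.put_nowait(2)
            log.append((name, item))
            await asyncio.sleep(item * 0.01)  # stand-in for real work
        finally:
            # Mark the item as handled so queue.join() can return.
            queue.task_done()


async def run_all(initial_items, worker_count=3):
    queue = asyncio.Queue()
    for item in initial_items:
        queue.put_nowait(item)
    log = []
    workers = [
        asyncio.create_task(worker('task{0}'.format(i + 1), queue, log))
        for i in range(worker_count)
    ]
    # Returns once every item, including dynamically added ones, is done.
    await queue.join()
    for task in workers:
        task.cancel()
    # Collect the cancelled workers so no exceptions are left unretrieved.
    await asyncio.gather(*workers, return_exceptions=True)
    return log


if __name__ == '__main__':
    processed = asyncio.run(run_all(range(1, 6)))
    print(sorted(item for _, item in processed))
```

Here the workers never decide on their own that the work is finished. Only queue.join() does, which is why each worker has to call task_done() for every item it takes.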
Answer 1 (score: 0)
It uses an asyncio.JoinableQueue queue to store the URLs for fetch tasks, but it also demonstrates many useful techniques.
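To give a rough idea of what storing pending pages in a queue can look like for the original question, here is a small sketch. It is not the example referred to above. It assumes Python 3.7+, where join() and task_done() are available on asyncio.Queue itself, and FAKE_LINKS and this download() are hypothetical stand-ins for real HTTP fetching and parsing.

```python
import asyncio

# Hypothetical stand-in for a site: page number -> page numbers it links to.
FAKE_LINKS = {1: [2, 3], 2: [3, 4], 3: [], 4: [1]}


async def download(page_number):
    """Hypothetical downloader; a real one would fetch and parse the page."""
    await asyncio.sleep(0.01)
    return FAKE_LINKS.get(page_number, [])


async def worker(queue, seen, done):
    while True:
        page = await queue.get()
        try:
            for linked in await download(page):
                # Only enqueue pages that were never queued before.
                if linked not in seen:
                    seen.add(linked)
                    queue.put_nowait(linked)
            done.append(page)
        finally:
            queue.task_done()


async def download_all(first_page=1, worker_count=3):
    queue = asyncio.Queue()
    seen = {first_page}
    done = []
    queue.put_nowait(first_page)
    workers = [
        asyncio.create_task(worker(queue, seen, done))
        for _ in range(worker_count)
    ]
    # Wait until every discovered page has been processed.
    await queue.join()
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return sorted(done)


if __name__ == '__main__':
    print(asyncio.run(download_all()))
```

The seen set is what keeps pages that link to each other from being queued forever. Without it, page 4 linking back to page 1 would cause an endless loop.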