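Dynamically adding to the list of things a Python asyncio event loop should execute

Date: 2015-01-23 17:18:19

Tags: python asynchronous coroutine python-asyncio

I have a function, download_all, that iterates over a hard-coded list of pages and downloads them all in sequence. But what should I do if I want to add to the list dynamically based on the results of a page? For example: download the first page, parse it, and add other pages to the event loop based on the result.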

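import asyncio

@asyncio.coroutine
def download_all():
    first_page = 1
    last_page = 100
    # download() is the asker's own coroutine; its definition is not shown.
    # Note that range() stops before last_page, so page 100 is not included.
    download_list = [download(page_number) for page_number in range(first_page, last_page)]
    gen = asyncio.wait(download_list)
    return gen

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    futures = loop.run_until_complete(download_all())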

2 Answers:

Answer 0 (score: 8)

One way to achieve this is to use a queue.

#!/usr/bin/python3

import asyncio

try:
    # Python 3.4
    from asyncio import JoinableQueue as Queue
except ImportError:
    # Python 3.5 and later
    from asyncio import Queue

@asyncio.coroutine
def do_work(task_name, work_queue):
    while not work_queue.empty():
        queue_item = work_queue.get_nowait()

        # simulate condition where task is added dynamically
        if queue_item % 2 != 0:
            work_queue.put_nowait(2)
            print('Added additional item to queue')

        print('{0} got item: {1}'.format(task_name, queue_item))
        yield from asyncio.sleep(queue_item)
        print('{0} finished processing item: {1}'.format(task_name, queue_item))

if __name__ == '__main__':

    queue = Queue()

    # Load initial jobs into queue
    for x in range(1, 6):
        queue.put_nowait(x)

    # use 3 workers to consume tasks
    taskers = [ 
        do_work('task1', queue),
        do_work('task2', queue),
        do_work('task3', queue)
    ]   

    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(taskers))
    loop.close()

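Using a queue from asyncio keeps the "units" of work separate from the tasks/futures that are initially handed to the asyncio event loop. This is what allows extra "units" of work to be added when certain conditions are met.

Note that in the example above the even-numbered items are terminal: processing one of them never adds another item. This guarantees that all the tasks eventually finish. In your case you can easily use a different condition to decide whether further items are added to the queue.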
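The same idea can be sketched with async/await syntax (Python 3.7+), since generator-based coroutines via @asyncio.coroutine were removed in Python 3.11. This is only a sketch under stated assumptions: the names worker and run_all, the results list, and the shortened sleep are illustrative and do not come from the example above. It also waits on queue.join() instead of polling empty(), because in a real downloader new items are usually discovered only after an awaited download, and a worker that sees an empty queue at that moment could otherwise exit too early.

```python
import asyncio


async def worker(name, queue, results):
    """Process items until cancelled; odd items enqueue one follow-up item."""
    while True:
        item = await queue.get()
        try:
            await asyncio.sleep(item * 0.01)  # stand-in for a real download
            if item % 2 != 0:
                # Hypothetical condition: odd items "discover" extra work.
                queue.put_nowait(2)
            results.append((name, item))
        finally:
            # Mark the item done only after any follow-up item is queued,
            # so queue.join() cannot return while work is still pending.
            queue.task_done()


async def run_all(initial_items, worker_count=3):
    queue = asyncio.Queue()
    for item in initial_items:
        queue.put_nowait(item)

    results = []
    workers = [
        asyncio.create_task(worker('task{}'.format(i + 1), queue, results))
        for i in range(worker_count)
    ]

    await queue.join()      # returns once every queued item is marked done
    for task in workers:
        task.cancel()       # workers are idle in queue.get(); stop them
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == '__main__':
    processed = asyncio.run(run_all(range(1, 6)))
    print(sorted(item for _, item in processed))
```

With the initial items 1 to 5, each of the three odd items adds one extra 2, so eight items are processed in total. Which worker handles which item depends on scheduling.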

Output:

Added additional item to queue
task2 got item: 1
task1 got item: 2
Added additional item to queue
task3 got item: 3
task2 finished processing item: 1
task2 got item: 4
task1 finished processing item: 2
Added additional item to queue
task1 got item: 5
task3 finished processing item: 3
task3 got item: 2
task3 finished processing item: 2
task3 got item: 2
task2 finished processing item: 4
task2 got item: 2
task1 finished processing item: 5
task3 finished processing item: 2
task2 finished processing item: 2

Answer 1 (score: 0)

Take a look at the Web Crawler example.

It uses an asyncio.JoinableQueue to store the URLs for the fetch tasks, and it also demonstrates many other useful techniques.
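As a rough illustration of that pattern (a queue of URLs where each fetched page may add more URLs), here is a minimal sketch. It is not the code of the linked example: FAKE_LINKS, fetch_links, crawl_worker and crawl are hypothetical names, no network access takes place, and it uses asyncio.Queue with async/await (Python 3.7+), since Queue provides join() and task_done() in current Python versions.

```python
import asyncio

# Hypothetical site map standing in for real HTTP responses:
# each URL maps to the links "found" on that page.
FAKE_LINKS = {
    '/': ['/a', '/b'],
    '/a': ['/b', '/c'],
    '/b': ['/'],
    '/c': [],
}


async def fetch_links(url):
    """Pretend to download and parse a page (no network access)."""
    await asyncio.sleep(0.01)
    return FAKE_LINKS.get(url, [])


async def crawl_worker(queue, seen):
    """Take URLs from the queue and enqueue links not seen before."""
    while True:
        url = await queue.get()
        try:
            for link in await fetch_links(url):
                if link not in seen:    # avoid re-queueing known URLs
                    seen.add(link)
                    queue.put_nowait(link)
        finally:
            queue.task_done()


async def crawl(root, worker_count=2):
    queue = asyncio.Queue()
    seen = {root}
    queue.put_nowait(root)
    workers = [asyncio.create_task(crawl_worker(queue, seen))
               for _ in range(worker_count)]
    await queue.join()      # every discovered URL has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return seen


if __name__ == '__main__':
    print(sorted(asyncio.run(crawl('/'))))
```

The seen set is what keeps the crawl finite: pages that link back to each other are queued only once, which plays the same role as the terminal even-numbered items in the first answer.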