Trouble optimizing asyncio with concurrent.futures

Asked: 2019-07-14 04:13:00

Tags: python-3.x python-asyncio concurrent.futures

I am trying to run asyncio tasks concurrently on each worker thread of a concurrent.futures ThreadPoolExecutor. However, I cannot get the result I expect.

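import asyncio
import logging

async def say_after(delay, message):
    logging.info(f"{message} received")
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking the thread
    logging.info(f"Printing {message}")

async def main():
    logging.info("Main started")
    # gather schedules both coroutines as tasks on the same event loop
    await asyncio.gather(say_after(2, "TWO"), say_after(3, "THREE"))
    logging.info("Main Ended")

# Top-level await works in Jupyter; in a plain script use asyncio.run(main())
await main()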

Output:

20:12:26:MainThread:Main started
20:12:26:MainThread:TWO received
20:12:26:MainThread:THREE received
20:12:28:MainThread:Printing TWO
20:12:29:MainThread:Printing THREE
20:12:29:MainThread:Main Ended

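To summarize my understanding of the code above: asyncio.gather creates tasks and registers them on the event loop running on the MainThread. Unsurprisingly, this saves time compared to synchronous code, because the total runtime is about 3 seconds (the longest delay) rather than 5 (the sum).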

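import concurrent.futures as cf
import logging
import time

def say_after(delay, message):
    logging.info(f"{message} received")
    time.sleep(delay)  # blocks the worker thread for the whole delay
    logging.info(f"Printing {message}")

# num_word_mapping is not shown in the question; judging by the output it maps 1..10 to "ONE".."TEN"
with cf.ThreadPoolExecutor(max_workers=3) as executor:
    results = [executor.submit(say_after, i+1, num_word_mapping[i+1]) for i in range(10)]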

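To summarize my understanding: the cf thread pool creates three threads, which the operating system switches between preemptively to achieve concurrency.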

Output:

19:38:43:ThreadPoolExecutor-9_0:ONE received
19:38:43:ThreadPoolExecutor-9_1:TWO received
19:38:43:ThreadPoolExecutor-9_2:THREE received
19:38:44:ThreadPoolExecutor-9_0:Printing ONE
19:38:44:ThreadPoolExecutor-9_0:FOUR received
19:38:45:ThreadPoolExecutor-9_1:Printing TWO
19:38:45:ThreadPoolExecutor-9_1:FIVE received
19:38:46:ThreadPoolExecutor-9_2:Printing THREE
19:38:46:ThreadPoolExecutor-9_2:SIX received
19:38:48:ThreadPoolExecutor-9_0:Printing FOUR
19:38:48:ThreadPoolExecutor-9_0:SEVEN received
19:38:50:ThreadPoolExecutor-9_1:Printing FIVE
19:38:50:ThreadPoolExecutor-9_1:EIGHT received
19:38:52:ThreadPoolExecutor-9_2:Printing SIX
19:38:52:ThreadPoolExecutor-9_2:NINE received
19:38:55:ThreadPoolExecutor-9_0:Printing SEVEN
19:38:55:ThreadPoolExecutor-9_0:TEN received
19:38:58:ThreadPoolExecutor-9_1:Printing EIGHT
19:39:01:ThreadPoolExecutor-9_2:Printing NINE
19:39:05:ThreadPoolExecutor-9_0:Printing TEN

Now, I want to run an event loop with multiple tasks on each worker thread. I tried the code below, but it did not reduce the execution time.

def say_after(delay, message):
    logging.info(f"{message} received")
    time.sleep(delay)
    logging.info(f"Printing {message}")

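async def parallel(executor, delay, message):
    loop = asyncio.get_running_loop()
    # Await the returned future; without await, main() would return before the jobs finish
    await loop.run_in_executor(executor, say_after, delay, message)

async def main():
    executor = cf.ThreadPoolExecutor(max_workers=3)
    await asyncio.gather(*[parallel(executor, i+1, num_word_mapping[i+1]) for i in range(10)])

# Top-level await works in Jupyter; in a plain script use asyncio.run(main())
await main()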

Output:

20:57:04:ThreadPoolExecutor-19_0:ONE received
20:57:04:ThreadPoolExecutor-19_1:TWO received
20:57:04:ThreadPoolExecutor-19_2:THREE received
20:57:05:ThreadPoolExecutor-19_0:Printing ONE
20:57:05:ThreadPoolExecutor-19_0:FOUR received
20:57:06:ThreadPoolExecutor-19_1:Printing TWO
20:57:06:ThreadPoolExecutor-19_1:FIVE received
20:57:07:ThreadPoolExecutor-19_2:Printing THREE
20:57:07:ThreadPoolExecutor-19_2:SIX received
20:57:09:ThreadPoolExecutor-19_0:Printing FOUR
20:57:09:ThreadPoolExecutor-19_0:SEVEN received
20:57:11:ThreadPoolExecutor-19_1:Printing FIVE
20:57:11:ThreadPoolExecutor-19_1:EIGHT received
20:57:13:ThreadPoolExecutor-19_2:Printing SIX
20:57:13:ThreadPoolExecutor-19_2:NINE received
20:57:16:ThreadPoolExecutor-19_0:Printing SEVEN
20:57:16:ThreadPoolExecutor-19_0:TEN received
20:57:19:ThreadPoolExecutor-19_1:Printing EIGHT
20:57:22:ThreadPoolExecutor-19_2:Printing NINE
20:57:26:ThreadPoolExecutor-19_0:Printing TEN

I was hoping to see a faster execution time with the last snippet. However, I am not sure whether I am going about this the right way.

Environment: Python 3.7(Jupyter Notebook)

1 Answer:

Answer 0: (score: 0)

  

"Now, I want to run an event loop with multiple tasks on each worker thread."

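Concurrent workers are completely separate from event loops. Each pool consists of a number of workers, and each worker can execute exactly one job at any given time. This functionality is provided by the concurrent.futures module and is completely orthogonal to asyncio.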
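To make the "one job per worker at a time" point concrete, here is a minimal sketch. The names `job` and `run_jobs` are made up for illustration and the delays are shortened so it runs quickly; it only assumes the standard concurrent.futures API.

```python
import concurrent.futures as cf
import threading
import time

def job(delay):
    """Blocking job: occupies its worker thread for `delay` seconds."""
    time.sleep(delay)
    return threading.current_thread().name

def run_jobs(n_jobs, n_workers, delay=0.2):
    """Run n_jobs blocking jobs on n_workers threads.

    Returns (elapsed seconds, set of thread names that did the work).
    """
    start = time.perf_counter()
    with cf.ThreadPoolExecutor(max_workers=n_workers) as executor:
        names = set(executor.map(job, [delay] * n_jobs))
    return time.perf_counter() - start, names

elapsed, names = run_jobs(n_jobs=6, n_workers=3)
print(f"{elapsed:.2f}s using {len(names)} threads")
```

With 6 jobs and 3 workers the jobs have to run in two rounds, so the elapsed time should come out at roughly twice the per-job delay. No event loop is involved anywhere.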

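So when you access the thread pool through run_in_executor, there is no reason for the code to magically become faster. After all, you are still executing 10 jobs on 3 worker threads, exactly as before. The only thing run_in_executor adds is that you can now await those workers from within the asyncio event loop.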
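As a rough illustration of that point, the sketch below submits blocking jobs through run_in_executor and awaits them. The helper names (`blocking_job`, `run_via_executor`, `timed_run`) are hypothetical, and the delays are shortened. It uses asyncio.run, which suits a plain script; in Jupyter you would `await run_via_executor(...)` directly instead.

```python
import asyncio
import concurrent.futures as cf
import time

def blocking_job(delay):
    """Blocking job: still ties up one worker thread while it sleeps."""
    time.sleep(delay)
    return delay

async def run_via_executor(delays, n_workers):
    """Submit every blocking job to the pool and await all results."""
    loop = asyncio.get_running_loop()
    with cf.ThreadPoolExecutor(max_workers=n_workers) as executor:
        futures = [loop.run_in_executor(executor, blocking_job, d) for d in delays]
        return await asyncio.gather(*futures)

def timed_run(delays, n_workers):
    """Return (elapsed seconds, results) for one run."""
    start = time.perf_counter()
    results = asyncio.run(run_via_executor(delays, n_workers))
    return time.perf_counter() - start, results

elapsed, results = timed_run([0.1] * 6, n_workers=3)
print(f"{elapsed:.2f}s for {len(results)} jobs")
```

Six 0.1-second jobs on three workers still need two rounds, so the run takes about as long as submitting the same jobs to the pool without asyncio. The event loop only waits for the threads; it does not make them finish sooner.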

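To speed up the code, you need to either increase the number of workers, or stop using run_in_executor altogether and switch to async functions, as in the first example.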
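The second option might look something like the sketch below, which assumes the blocking time.sleep can be replaced by an awaitable equivalent (here asyncio.sleep). The names `say_after_async`, `run_all` and `timed` are illustrative, and the delays are scaled down from the question's 1 to 10 seconds.

```python
import asyncio
import time

async def say_after_async(delay, message):
    """Non-blocking wait: the event loop runs other tasks in the meantime."""
    await asyncio.sleep(delay)
    return message

async def run_all(delays):
    """Start one coroutine per delay on a single event loop and wait for all."""
    coros = (say_after_async(d, f"task-{i}") for i, d in enumerate(delays))
    return await asyncio.gather(*coros)

def timed(delays):
    """Return (elapsed seconds, results); use `await run_all(...)` in Jupyter."""
    start = time.perf_counter()
    results = asyncio.run(run_all(delays))
    return time.perf_counter() - start, results

delays = [0.1 * (i + 1) for i in range(5)]  # 0.1 .. 0.5 seconds
elapsed, results = timed(delays)
print(f"{elapsed:.2f}s for {len(results)} tasks")
```

Because no task blocks a thread, all of them wait at the same time and the total should be close to the longest single delay rather than the sum. This only helps when the work itself is awaitable; CPU-bound or genuinely blocking calls still need more workers.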