Async coroutine never seems to finish

Time: 2017-04-14 00:00:12

Tags: python asynchronous python-3.6

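I'm dipping my toes into asynchronous programming in Python and have run into an interesting application: I need to collect the sizes of about 10 files on each of about 100 machines, to see which machines are not clearing out their log files properly.

My synchronous approach was: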

import glob
import os
from collections import namedtuple

File_info = namedtuple("File_info", "machineinfo size")

machines = utils.list_machines()  # the computers being queried
# each machine object has attributes like "name", "IP", and "site_id" among others

file_sizes = {}
# {filename: [File_info, ...], ...}

for m in machines:
    print(f"Processing {m}...")  # this is "Processing {m}...".format(m=m)
                                 # isn't Python 3.6 awesome?!
    for path in glob.glob(f"//{m.IP}/somedir/*.dbf"):
        fname = os.path.split(path)[-1].lower()
        machineinfo = (m.site_id, m.name)
        size = os.stat(path).st_size
        file_sizes.setdefault(fname, []).append(File_info(machineinfo, size))

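This works fine, but all of those glob and stat calls go over the network, so it takes a very long time to finish. I'd like to use Python 3.5's async/await syntax and asyncio to make these calls asynchronously. This is what I came up with: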

File_info = namedtuple("File_info", "machineinfo size")

machines = utils.list_machines()

file_sizes = {}
# {filename: [File_info, ...], ...}

async def getfilesizes(machine, loop):
    machineinfo = machine.site_id, machine.name
    paths = glob.glob(f"//{machine.IP}/somedir/*.dbf")
    coros = [getsize(path) for path in paths]
    results = loop.run_until_complete(asyncio.gather(*coros))
    sizes = {fname: File_info(machineinfo, size) for (fname, size) in results}
    return sizes

async def getsize(path):
    return os.path.split(path)[-1], os.stat(path).st_size

loop = asyncio.get_event_loop()
results = loop.run_until_complete(asyncio.gather(*(getfilesizes(m, loop) for m in machines)))
for result in results:
    file_sizes.update(result)
    # I have a problem here since my dict values are lists that need to extend
    # not overwrite, but that's not relevant for the error I'm getting

However, the script hangs inside the outer loop.run_until_complete call. What am I doing wrong?

1 Answer:

Answer 0: (score: 0)

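A coroutine that wants to run another coroutine (here, getfilesizes running getsize) should await it rather than scheduling it on the event loop with loop.run_until_complete.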

...

async def getfilesizes(machine):  # changed func sig
    machineinfo = machine.site_id, machine.name
    paths = glob.glob(f"//{machine.IP}/somedir/*.dbf")
    coros = [getsize(path) for path in paths]
    results = await asyncio.gather(*coros)  # await the results instead!
    sizes = {fname: File_info(machineinfo, size) for (fname, size) in results}
    return sizes

...

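Because asyncio.gather wraps any number of coroutines in a single Future, awaiting that call waits on the whole group of coroutines and collects all of their results at once.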
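For anyone who wants to try the pattern end to end, here is a minimal, self-contained sketch. It is not the asker's actual script: the `machines` mapping, the `merge` helper, and the temporary files created in `demo` are hypothetical stand-ins for `utils.list_machines()` and the network shares. It shows an inner `await asyncio.gather(...)` inside a coroutine, a single `run_until_complete` at the top level, and one possible way to append to the per-filename lists instead of overwriting them with `dict.update`.

```python
import asyncio
import os
import tempfile
from collections import namedtuple

File_info = namedtuple("File_info", "machineinfo size")


async def getsize(path):
    # os.stat is still a blocking call; this only demonstrates the await pattern.
    return os.path.split(path)[-1].lower(), os.stat(path).st_size


async def getfilesizes(machineinfo, paths):
    # Await the gathered coroutines instead of calling loop.run_until_complete.
    results = await asyncio.gather(*(getsize(p) for p in paths))
    return {fname: File_info(machineinfo, size) for fname, size in results}


def merge(per_machine):
    # Hypothetical helper: append to each filename's list rather than overwrite it.
    file_sizes = {}
    for sizes in per_machine:
        for fname, info in sizes.items():
            file_sizes.setdefault(fname, []).append(info)
    return file_sizes


async def main(machines):
    # machines maps machineinfo -> list of paths (assumed shape, for illustration).
    per_machine = await asyncio.gather(
        *(getfilesizes(info, paths) for info, paths in machines.items())
    )
    return merge(per_machine)


def demo():
    # Build two fake "machines" as temporary directories with one file each.
    with tempfile.TemporaryDirectory() as tmp:
        machines = {}
        for site_id, name, nbytes in [(1, "alpha", 3), (2, "beta", 5)]:
            directory = os.path.join(tmp, name)
            os.mkdir(directory)
            path = os.path.join(directory, "log.dbf")
            with open(path, "wb") as fh:
                fh.write(b"x" * nbytes)
            machines[(site_id, name)] = [path]
        # Only the outermost call touches the event loop directly.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(main(machines))
        finally:
            loop.close()


if __name__ == "__main__":
    print(demo())
```

Note that glob.glob and os.stat remain ordinary blocking calls, so this sketch fixes the structure of the coroutines but does not by itself overlap the network waits. Handing the blocking calls to a thread pool via loop.run_in_executor is one option worth considering for that.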