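I'm dipping my toes into asynchronous programming in Python and ran into an interesting application: I need to collect the sizes of about 10 files on each of roughly 100 machines, to see which machines aren't properly clearing out their log files.
My synchronous approach was: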
    import glob
    import os
    from collections import namedtuple

    import utils  # project-specific helper module (not shown)

    File_info = namedtuple("File_info", "machineinfo size")
    machines = utils.list_machines()  # the computers being queried
    # each machine object has attributes like "name", "IP", and "site_id" among others
    file_sizes = {}
    # {filename: [File_info, ...], ...}
    for m in machines:
        print(f"Processing {m}...")  # this is "Processing {m}...".format(m=m)
        # isn't Python 3.6 awesome?!
        for path in glob.glob(f"//{m.IP}/somedir/*.dbf"):
            fname = os.path.split(path)[-1].lower()
            machineinfo = (m.site_id, m.name)
            size = os.stat(path).st_size
            file_sizes.setdefault(fname, []).append(File_info(machineinfo, size))
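This works fine, but it takes a very long time to finish because every one of those glob and stat calls is a network operation. I'd like to use Python 3.5's async/await syntax and asyncio to make these calls asynchronously. This is what I came up with: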
    import asyncio
    import glob
    import os
    from collections import namedtuple

    import utils  # project-specific helper module (not shown)

    File_info = namedtuple("File_info", "machineinfo size")
    machines = utils.list_machines()
    file_sizes = {}
    # {filename: [File_info, ...], ...}

    async def getfilesizes(machine, loop):
        machineinfo = machine.site_id, machine.name
        paths = glob.glob(f"//{machine.IP}/somedir/*.dbf")
        coros = [getsize(path) for path in paths]
        results = loop.run_until_complete(asyncio.gather(*coros))
        sizes = {fname: File_info(machineinfo, size) for (fname, size) in results}
        return sizes

    async def getsize(path):
        return os.path.split(path)[-1], os.stat(path).st_size

    loop = asyncio.get_event_loop()
    results = loop.run_until_complete(asyncio.gather(*(getfilesizes(m, loop) for m in machines)))
    for result in results:
        file_sizes.update(result)
        # I have a problem here since my dict values are lists that need to extend
        # not overwrite, but that's not relevant for the error I'm getting
However, the script hangs inside the outer loop.run_until_complete call. What am I doing wrong?
Answer 0 (score: 0)
A coroutine that wants to run another coroutine (here, getfilesizes running getsize) should await it rather than scheduling it on the event loop.
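As a minimal illustration of that rule, separate from the file-size code, here is a sketch in which asyncio.sleep stands in for real I/O. The names inner, outer, and main are hypothetical, and asyncio.run assumes Python 3.7 or later (on 3.5/3.6, loop.run_until_complete(main()) at the top level plays the same role).

```python
import asyncio


async def inner(n):
    # Stand-in for one unit of asynchronous work (hypothetical).
    await asyncio.sleep(0.01)
    return n * 2


async def outer(values):
    # Await the gathered coroutines here; do not call
    # loop.run_until_complete() from inside a running coroutine.
    return await asyncio.gather(*(inner(v) for v in values))


async def main():
    # Nesting works the same way one level up.
    return await asyncio.gather(outer([1, 2]), outer([3]))


# The event loop is started exactly once, at the top level.
results = asyncio.run(main())
print(results)
```

Applied to the code in the question, the change looks like this: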
    ...
    async def getfilesizes(machine):  # changed func sig
        machineinfo = machine.site_id, machine.name
        paths = glob.glob(f"//{machine.IP}/somedir/*.dbf")
        coros = [getsize(path) for path in paths]
        results = await asyncio.gather(*coros)  # await the results instead!
        sizes = {fname: File_info(machineinfo, size) for (fname, size) in results}
        return sizes
    ...
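Because asyncio.gather wraps any number of coroutines in a single Future, awaiting it inside the function acts on the whole group of coroutines and collects all of their results at once.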
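One caveat worth keeping in mind: glob.glob and os.stat are ordinary blocking calls, so putting them inside an async def does not by itself let them overlap. Each call still blocks the event loop while it runs. One possible way around this, sketched here under assumptions, is to hand the blocking work to a thread pool with loop.run_in_executor. The Machine tuple, its root field (a local directory standing in for the //IP/somedir share), and the functions scan_machine and collect are all hypothetical names introduced for illustration. asyncio.get_running_loop assumes Python 3.7 or later. The sketch also appends to lists with setdefault, which avoids the overwrite problem noted in the question's comments.

```python
import asyncio
import glob
import os
from collections import namedtuple

File_info = namedtuple("File_info", "machineinfo size")
# Hypothetical stand-in for the objects returned by utils.list_machines();
# "root" replaces the //IP/somedir share so the sketch can run locally.
Machine = namedtuple("Machine", "site_id name root")


def scan_machine(machine):
    # Plain blocking function; it runs in a worker thread, not on the loop.
    machineinfo = (machine.site_id, machine.name)
    found = []
    for path in glob.glob(os.path.join(machine.root, "*.dbf")):
        fname = os.path.basename(path).lower()
        found.append((fname, File_info(machineinfo, os.stat(path).st_size)))
    return found


async def collect(machines):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    futures = [loop.run_in_executor(None, scan_machine, m) for m in machines]
    per_machine = await asyncio.gather(*futures)
    file_sizes = {}  # {filename: [File_info, ...], ...}
    for found in per_machine:
        for fname, info in found:
            # Append rather than overwrite, so every machine is kept.
            file_sizes.setdefault(fname, []).append(info)
    return file_sizes
```

With a list of such machine objects, file_sizes = asyncio.run(collect(machines)) would produce the same dictionary shape as the synchronous version. How many machines are scanned at once is bounded by the size of the default thread pool.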