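I am using IPython's parallel processing tools for a large map operation. While waiting for the map to finish, I would like to show the user how many jobs have completed, how many are running, and how many remain. How can I find this information?
Here is what I do. I create a profile that uses a local engine and start two workers. In the shell: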
$ ipython profile create --parallel --profile=local
$ ipcluster start --n=2 --profile=local
Here is the client Python script:
#!/usr/bin/env python
def meat(i):
    # Imports are inside the function because it runs on the remote engines.
    import numpy as np
    import time
    import sys
    seconds = np.random.randint(2, 15)
    time.sleep(seconds)
    return seconds

import time
from IPython.parallel import Client

c = Client(profile='local')
dview = c[:]
ar = dview.map_async(meat, range(4))
elapsed = 0
while True:
    # c.outstanding holds the ids of messages that have not finished yet.
    print('After %d s: %d running' % (elapsed, len(c.outstanding)))
    if ar.ready():
        break
    time.sleep(1)
    elapsed += 1
print(ar.get())
Sample output from the script:
After 0 s: 2 running
After 1 s: 2 running
After 2 s: 2 running
After 3 s: 2 running
After 4 s: 2 running
After 5 s: 2 running
After 6 s: 2 running
After 7 s: 2 running
After 8 s: 2 running
After 9 s: 2 running
After 10 s: 2 running
After 11 s: 2 running
After 12 s: 2 running
After 13 s: 2 running
After 14 s: 1 running
After 15 s: 1 running
After 16 s: 1 running
After 17 s: 1 running
After 18 s: 1 running
After 19 s: 1 running
After 20 s: 1 running
After 21 s: 1 running
After 22 s: 1 running
After 23 s: 1 running
[9, 14, 10, 3]
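As you can see, I can get the number of jobs that are currently running, but not the number that have completed (or remain). How can I tell how much of the map_async work is done?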
Answer 0 (score: 3)
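An AsyncResult has a msg_ids attribute. The outstanding jobs are the intersection of those ids with rc.outstanding, and the completed jobs are the difference (rc here is the Client, c in the question's script):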
msgset = set(ar.msg_ids)
completed = msgset.difference(rc.outstanding)
pending = msgset.intersection(rc.outstanding)
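As a minimal sketch of how those two sets could feed the progress display, the set arithmetic can be wrapped in a small helper. The name progress_counts and the message ids in the demo are made up for illustration; in the real script you would pass ar.msg_ids and c.outstanding instead.

```python
def progress_counts(msg_ids, outstanding):
    """Return (completed, pending) counts for one map_async call.

    msg_ids     -- the ids belonging to this AsyncResult (ar.msg_ids)
    outstanding -- the client's unfinished ids (rc.outstanding)
    """
    msgset = set(msg_ids)
    completed = msgset.difference(outstanding)    # ours, and no longer outstanding
    pending = msgset.intersection(outstanding)    # ours, and still outstanding
    return len(completed), len(pending)


# Stand-in data: hypothetical message ids, not real IPython output.
# 'x' belongs to some other request and must not be counted.
done, pending = progress_counts(['a', 'b', 'c', 'd'], {'c', 'd', 'x'})
print('%d done, %d pending' % (done, pending))
```

Inside the polling loop this would be called as progress_counts(ar.msg_ids, c.outstanding). One caveat, judging from the sample output in the question: with a DirectView the input appears to be split into one chunk per engine, so the counts are per chunk rather than per item. A load-balanced view (c.load_balanced_view()) should give finer-grained counts, though I have not verified that against this exact setup.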