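I have a program that creates multiple instances of a class, Test, then does some work on each instance and tracks how long that work took. I recently decided to parallelize this code with the joblib library and ran into a bug: the final total_time variable is now 0.0.
The Python environment on my machine is: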
$ python3
Python 3.7.0 (default, Sep 18 2018, 18:47:08)
[Clang 10.0.0 (clang-1000.10.43.1)] on darwin
Here is an MCVE of the problem:
import time
import random
import multiprocessing
import joblib


class Test:
    def __init__(self):
        self.name = ""
        self.duration = 0.0


def add_test(a):
    temp = Test()
    temp.name = str(a)
    return temp


def run_test(test):
    test_start = time.time()
    rand = random.randint(1, 3)
    time.sleep(rand)
    test_end = time.time()
    test.duration = round(test_end - test_start, 3)
    print(f"Test {test.name} ran in {test.duration}")


def main():
    tests = []
    for a in range(1, 10):
        tests.append(add_test(a))
    num_cores = multiprocessing.cpu_count()
    joblib.Parallel(n_jobs=num_cores)(joblib.delayed(run_test)(test) for test in tests)
    total_time = round(sum(test.duration for test in tests), 3)
    print(f"This run took {total_time} seconds.")


if __name__ == '__main__':
    main()
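If I add print(list(test.duration for test in tests)) to main(), I see that test.duration is 0.0 for every test after run_test() has been called. As the output of the code above shows, test.duration is set to the expected non-zero value inside run_test().
I'm not very familiar with Python classes or the joblib library, so I'm not sure whether the problem comes from misusing classes or from something else that is beyond me.
Thanks!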
Answer 0 (score: 0)
Thanks to num8lock on Reddit, this is the correct way to solve the problem:
import time
import random
import multiprocessing
import joblib


class Test:
    def __init__(self, name):
        self.name = name
        self.duration = 0.0

    def run(self):
        # Start the clock inside the worker so that time spent waiting in the
        # queue is excluded and both readings come from the same process.
        start = time.perf_counter()
        rand = random.randint(1, 3)
        time.sleep(rand)
        self.duration = time.perf_counter() - start
        print(f"Test {self.name} ran in {self.duration}")
        # Return the value: attribute changes made in a worker process
        # do not reach the objects held by the parent process.
        return self.duration


def add(a):
    return Test(str(a))


def make_test(test):
    return test.run()


def main():
    num_cores = multiprocessing.cpu_count()
    tests = []
    for a in range(1, 10):
        tests.append(add(a))
    # Parallel returns the workers' return values, in input order.
    jobs = joblib.Parallel(n_jobs=num_cores)(joblib.delayed(make_test)(t) for t in tests)
    total_time = sum(jobs)
    print(f"This run took {total_time} seconds.")


if __name__ == '__main__':
    main()
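
Why returning the value works while setting the attribute does not: with joblib's process-based backends, each argument is serialized and sent to a worker process, so the worker operates on a copy of the Test object, and only the function's return value is sent back. The sketch below imitates that behaviour with the standard library only. It is an illustration under assumptions, not joblib's actual mechanism: fake_worker and set_duration are hypothetical names, and copy.deepcopy stands in for the serialization round trip between processes.

```python
import copy


class Test:
    def __init__(self, name):
        self.name = name
        self.duration = 0.0


def set_duration(test):
    # Mutates the object it is handed and also returns the new value.
    test.duration = 1.5
    return test.duration


def fake_worker(func, arg):
    # Hypothetical stand-in for a worker process: the function sees a copy
    # of the argument, and only its return value travels back to the caller.
    arg_copy = copy.deepcopy(arg)
    return copy.deepcopy(func(arg_copy))


original = Test("1")
returned = fake_worker(set_duration, original)

print(original.duration)  # 0.0, the caller's object was never touched
print(returned)           # 1.5, the return value made it back
```

If the durations are needed on the original objects as well, the parent process can assign the returned values back, for example by looping over zip(tests, jobs) and setting test.duration for each pair.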