I've run into a problem using multiprocessing with qsub in a NumPy environment.
Specifically, I have the following Python code:
#full_comparisons.py
import numpy as np
import multiprocessing
output = np.ndarray(
    shape=(x, y, z, a),
    dtype=[('site', '>i4'), ('html', '>f4'), ('js', '>f4'), ('png', '>f4')])
##NOTE: output size is only .002 GB, so RAM shouldn't be an issue.
print("Before pool")
pool = multiprocessing.Pool()
print("After pool")
I have submitted the job with qsub in each of the following ways (I tried every one of them), where ./comparisons simply calls python3 full_comparisons.py:
qsub -V comparisons # -V keep environment variables
qsub -l vlong -V comparisons #-l vlong lets it run infinitely
qsub -V -pe smp 32 comparisons #parallelizes with more processors
qsub -l vlong -V -pe smp 32
qsub -V -pe smp 16 comparisons
qsub -V -pe smp 8 comparisons
among others.
In every case, the job prints Before pool and then hangs.
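I believe this is related to the cluster, because running ./comparisons locally does the multiprocessing just fine. The problem only appears when I use qsub. Perhaps there is a bug affecting the combined use of NumPy and multiprocessing that I'm not aware of.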
All the relevant code:
import subprocess
import os
import csv
import itertools
import multiprocessing
import numpy as np
import jaccard
import file_names
def compare_lambda(x, y, dict_1, dict_2):
    ...

def compare_all():
    pairs = itertools.combinations(range(GLOBAL_VAR1), 2)
    ids_to_sites, sites_to_ids = init_sites()
    output = np.ndarray(
        shape=(GLOBAL_VAR1, GLOBAL_VAR1, GLOBAL_VAR2, GLOBAL_VAR3),
        dtype=[('x', '>i4'), ('y', '>f4'), ('z', '>f4'), ('a', '>f4')])
    print("Before pool")
    pool = multiprocessing.Pool()
    print("After pool")
    compared_vals = pool.starmap(compare_lambda, list(map(lambda x: (x[0], x[1], dict_1, dict_2), pairs)))
    for (a, b, compared) in compared_vals:
        ...

print(multiprocessing.cpu_count()) #works fine
compare_all()
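For anyone who wants to reproduce the pattern without the rest of the project, here is a minimal self-contained sketch of the same Pool/starmap structure. The compare_pair function, the weights dictionary, and the processes parameter are hypothetical stand-ins for illustration only; they are not the real compare_lambda or its inputs.

```python
import itertools
import multiprocessing


def compare_pair(a, b, weights):
    # Hypothetical stand-in for compare_lambda: returns the pair and a score.
    return a, b, weights[a] * weights[b]


def compare_all(n, weights, processes=2):
    # Build one argument tuple per unordered pair of indices.
    pairs = itertools.combinations(range(n), 2)
    args = [(a, b, weights) for a, b in pairs]
    print("Before pool", flush=True)
    # The context manager tears the pool down once the block exits.
    with multiprocessing.Pool(processes) as pool:
        print("After pool", flush=True)
        # starmap unpacks each tuple into compare_pair's arguments
        # and returns results in the same order as args.
        return pool.starmap(compare_pair, args)


if __name__ == "__main__":
    print(compare_all(3, {0: 1.0, 1: 2.0, 2: 3.0}))
```

The list comprehension does the same job as list(map(lambda ...)) in the code above, and the flush=True arguments make the two progress lines appear immediately even when stdout is redirected to a file.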
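Edit: At @sehafoc's suggestion, I enabled logging for multiprocessing. Interestingly, when I run the multiprocessing program on the compute cluster, I get the following: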
Before pool
[DEBUG/MainProcess] created semlock with handle 47690244722688
[DEBUG/MainProcess] created semlock with handle 47690244726784
[DEBUG/MainProcess] created semlock with handle 47690244730880
[DEBUG/MainProcess] created semlock with handle 47690244734976
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-1] child process calling self.run()
[INFO/ForkPoolWorker-2] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-4] child process calling self.run()
[INFO/ForkPoolWorker-3] child process calling self.run()
When I run it locally, the output is as follows:
Before pool
[DEBUG/MainProcess] created semlock with handle 140313792987136
[DEBUG/MainProcess] created semlock with handle 140313792983040
[DEBUG/MainProcess] created semlock with handle 140313792978944
[DEBUG/MainProcess] created semlock with handle 140313792974848
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-1] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-2] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-3] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-4] child process calling self.run()
After pool
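For reference, debug output in the [LEVEL/ProcessName] format shown above can be obtained roughly as follows. This is a sketch using the standard library's multiprocessing.log_to_stderr; the helper name enable_mp_debug_logging and the pool size of 2 are assumptions for illustration, not part of the original script.

```python
import logging
import multiprocessing


def enable_mp_debug_logging():
    # log_to_stderr() attaches a stderr handler to multiprocessing's
    # own logger and returns that logger.
    logger = multiprocessing.log_to_stderr()
    # DEBUG shows messages such as "created semlock" and "added worker".
    logger.setLevel(logging.DEBUG)
    return logger


if __name__ == "__main__":
    enable_mp_debug_logging()
    print("Before pool", flush=True)
    pool = multiprocessing.Pool(2)
    print("After pool", flush=True)
    pool.close()
    pool.join()
```

Note that the log lines go to stderr while print writes to stdout, so the two streams may be buffered differently when the job's output is redirected to files.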
Answer 0 (score: 0)
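Update: Forcing a flush with sys.stdout.flush() made the line print. It seems that stdout is flushed only rarely under qsub, so the pool was most likely being created all along and After pool was simply sitting in the output buffer.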
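A minimal sketch of that fix, assuming a small hypothetical helper named report that wraps print: passing flush=True has the same effect as calling sys.stdout.flush() right after each print.

```python
import sys


def report(message, stream=None):
    # Write one progress line and flush right away, so it is not left
    # in the buffer when stdout is redirected to a file by the scheduler.
    stream = sys.stdout if stream is None else stream
    print(message, file=stream, flush=True)


if __name__ == "__main__":
    report("Before pool")
    report("After pool")
```

Alternatively, running the script as python3 -u full_comparisons.py, or setting PYTHONUNBUFFERED=1 in the environment, should make Python's stdout unbuffered without touching the code. Whether the environment variable reaches the job depends on how the job is submitted; with qsub -V the submitting shell's environment is exported to the job.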