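I would like to use numpy.fft.fft and multiprocessing to compute a set of FFTs in parallel. Unfortunately, running the FFTs in parallel results in a large kernel load.
Here is a minimal example that reproduces the problem: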
# fft_test.py
import numpy as np
import multiprocessing
from argparse import ArgumentParser


def f(i):
    x = np.empty(1000000)
    np.fft.fft(x)
    return i


def __main__():
    ap = ArgumentParser('fft_test')
    ap.add_argument('--single_core', '-s', action='store_true', help='use only a single core')
    args = ap.parse_args()

    # Show the configuration
    print("number of cores: %d" % multiprocessing.cpu_count())
    np.__config__.show()

    # Execute using a single core
    if args.single_core:
        for i in range(multiprocessing.cpu_count()):
            f(i)
            print(i, end=' ')
    # Execute using all cores
    else:
        pool = multiprocessing.Pool()
        for i in pool.map(f, range(multiprocessing.cpu_count())):
            print(i, end=' ')


if __name__ == '__main__':
    __main__()
Running time python fft_test.py gives me the following results:
number of cores: 48
openblas_info:
    library_dirs = ['/home/till/anaconda2/envs/sonalytic/lib']
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    language = c
openblas_lapack_info:
    library_dirs = ['/home/till/anaconda2/envs/sonalytic/lib']
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    language = c
blas_opt_info:
    library_dirs = ['/home/till/anaconda2/envs/sonalytic/lib']
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    language = c
blas_mkl_info:
  NOT AVAILABLE
lapack_opt_info:
    library_dirs = ['/home/till/anaconda2/envs/sonalytic/lib']
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    language = c
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
real 0m7.422s
user 0m9.830s
sys 1m26.603s
Running on a single core, i.e. python fft_test.py -s, gives
real 1m0.345s
user 0m56.558s
sys 0m2.959s
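To put the two runs side by side, here is a small sketch that computes, for each run, the total CPU time (user + sys) and the fraction of it spent in the kernel. The numbers are copied from the timings above; the names runs and sys_share are only illustrative.

```python
# Timings copied from the two runs above, converted to seconds.
runs = {
    "parallel": {"real": 7.422, "user": 9.830, "sys": 86.603},
    "single": {"real": 60.345, "user": 56.558, "sys": 2.959},
}


def sys_share(t):
    """Fraction of the total CPU time (user + sys) spent in the kernel."""
    return t["sys"] / (t["user"] + t["sys"])


for name, t in runs.items():
    cpu = t["user"] + t["sys"]
    print("%-8s cpu=%.3fs  sys share=%.1f%%" % (name, cpu, 100 * sys_share(t)))
```

With these figures the parallel run spends roughly 90% of its CPU time in the kernel, compared with roughly 5% for the single-core run, and its total CPU time (about 96 s) is also well above the single-core total (about 60 s).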
Any idea what might be causing the large amount of kernel (sys) time?
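One way to narrow this down might be to measure the cost inside each worker rather than for the whole script. The following is a minimal sketch, assuming a Unix system (the standard-library resource module is not available on Windows). f_profiled and run_pool are hypothetical helpers, not part of the script above, and reporting minor page faults is only a guess at what could be relevant; it is not a confirmed cause.

```python
import multiprocessing
import resource

import numpy as np


def f_profiled(i, n=1000000):
    """Same work as f(i), but also report CPU time and page faults for this call."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    x = np.empty(n)
    np.fft.fft(x)
    after = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "i": i,
        # CPU seconds spent in user mode and in the kernel during this call
        "user": after.ru_utime - before.ru_utime,
        "sys": after.ru_stime - before.ru_stime,
        # Page faults serviced without disk I/O (assumption: possibly relevant)
        "minor_faults": after.ru_minflt - before.ru_minflt,
    }


def run_pool():
    """Run f_profiled once per core and print the per-call statistics."""
    pool = multiprocessing.Pool()
    try:
        for stats in pool.map(f_profiled, range(multiprocessing.cpu_count())):
            print(stats)
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    run_pool()
```

Comparing the per-call "sys" values from the pool with those from a plain loop over f_profiled would show whether the kernel time is incurred inside the FFT calls themselves or elsewhere, for example in process start-up or result transfer.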