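I'm running Mac OS X 10.6.8 with the Enthought Python Distribution, and I want numpy functions to take advantage of my cores. My problem is similar to the one in this post: multithreaded blas in python/numpy. However, after following that poster's steps, I still have the same problem. Here is my numpy.show_config():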
lapack_opt_info:
libraries = ['mkl_lapack95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
blas_opt_info:
libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
lapack_mkl_info:
libraries = ['mkl_lapack95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
blas_mkl_info:
libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
mkl_info:
libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
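As suggested in the comments on the original post, I removed the line that set the variable MKL_NUM_THREADS=1. Even so, the numpy and scipy functions that should take advantage of multithreading use only one of my cores at a time. Is there anything else I should change?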
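Removing the line from one startup file does not guarantee that the variable is unset in the process that actually runs numpy, so it may be worth checking from inside Python. The following is only a minimal sketch: the helper names thread_settings and single_thread_pins are made up for illustration, and the list of variables (MKL_NUM_THREADS, OMP_NUM_THREADS, MKL_DYNAMIC) is an assumption about which settings matter for this MKL build.

```python
import os

# Environment variables commonly read by threaded MKL/OpenMP builds.
# Which of them matter for a given build is an assumption here.
THREAD_VARS = ("MKL_NUM_THREADS", "OMP_NUM_THREADS", "MKL_DYNAMIC")


def thread_settings(environ=None):
    """Return the thread-related variables as seen by this process."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in THREAD_VARS}


def single_thread_pins(environ=None):
    """Return the thread-count variables that are set to exactly 1."""
    settings = thread_settings(environ)
    return [
        name
        for name in ("MKL_NUM_THREADS", "OMP_NUM_THREADS")
        if settings[name] is not None and settings[name].strip() == "1"
    ]


if __name__ == "__main__":
    # None means the variable is not set in this process at all.
    for name, value in sorted(thread_settings().items()):
        print(name, "=", value)
    print("pinned to one thread by:", single_thread_pins())
```

If either thread-count variable still shows up as 1, something other than the edited line (another shell profile, an IDE launcher, a wrapper script) is probably still setting it.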
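EDIT: To clarify, I am trying to get a single calculation such as numpy.dot() to use multiple threads on its own, through the MKL implementation. I am not trying to exploit the fact that numpy calculations release the GIL, which would make it easier to multithread other functions.
Here is a small script that should use multithreading, but does not on my machine: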
import numpy as np
a = np.random.randn(1000, 10000)
b = np.random.randn(10000, 1000)
np.dot(a, b)  # this line should be multi-threaded
Answer 0 (score: 7)
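This article seems to imply that numpy intelligently parallelizes certain operations, depending on the speedup it expects from doing so.
Given that numpy uses heuristics to decide when to parallelize a particular dot() call, perhaps your small(-ish) test case simply doesn't show a significant speedup? Maybe try a ridiculously large operation and see whether both cores get used?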
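One rough way to see whether more than one core is being used, without watching a system monitor, is to compare the CPU time the process consumes with the wall-clock time that elapses. This is a sketch under assumptions rather than a definitive benchmark: it needs Python 3.3 or later for time.process_time(), the helper names cpu_to_wall_ratio and report_dot_scaling are invented here, and the matrix sizes are arbitrary.

```python
import time


def cpu_to_wall_ratio(func, repeats=3):
    """Call func() several times; return the highest CPU-time / wall-time ratio.

    process_time() counts CPU time across all threads of the process, so a
    ratio clearly above 1 suggests that more than one core was busy.
    """
    best = 0.0
    for _ in range(repeats):
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        func()
        cpu = time.process_time() - cpu_start
        wall = time.perf_counter() - wall_start
        if wall > 0:
            best = max(best, cpu / wall)
    return best


def report_dot_scaling(sizes=(500, 1000, 2000, 4000)):
    """Print the ratio for square np.dot() calls of increasing size."""
    import numpy as np  # imported here so the helper above stays stdlib-only

    for n in sizes:
        a = np.random.randn(n, n)
        b = np.random.randn(n, n)
        ratio = cpu_to_wall_ratio(lambda: np.dot(a, b))
        print("n=%d  cpu/wall=%.2f" % (n, ratio))
```

Calling report_dot_scaling() prints one line per size. A ratio that stays near 1 at every size would suggest that the BLAS calls are running on a single thread regardless of problem size.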
As a side note, does your processor/machine configuration actually support BLAS?