Supposedly automatically threaded scipy and numpy functions aren't making use of multiple cores

Time: 2012-08-03 00:23:22

Tags: python multithreading numpy scipy intel-mkl

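I am running Mac OS X 10.6.8 and using the Enthought Python Distribution. I want numpy functions to take advantage of my cores. I am having a problem similar to the one in this post: multithreaded blas in python/numpy, but after following that poster's steps I still have the same problem. Here is my numpy.show_config():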

lapack_opt_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
blas_opt_info:
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
lapack_mkl_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
blas_mkl_info:
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
mkl_info:
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']

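As suggested in the comments on the original post, I deleted the line that set the variable MKL_NUM_THREADS=1. Even so, the numpy and scipy functions that should take advantage of multithreading use only one of my cores at a time. Is there something else I should change?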
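
One quick check is whether a thread-limiting variable is still set in the environment the interpreter actually sees. The following is only a minimal sketch: the helper name report_thread_env is hypothetical, and the list of variables (MKL_NUM_THREADS, OMP_NUM_THREADS, MKL_DYNAMIC) is an assumption about which settings are relevant to an MKL-linked build such as the one shown above.

```python
import os

# Environment variables that MKL/OpenMP builds commonly consult for threading.
# Which ones matter depends on the build, so treat this list as an assumption.
THREAD_VARS = ("MKL_NUM_THREADS", "OMP_NUM_THREADS", "MKL_DYNAMIC")


def report_thread_env(env=None):
    """Return {variable: value or None} for each thread-related variable."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in THREAD_VARS}


if __name__ == "__main__":
    for name, value in report_thread_env().items():
        print(name, "=", value if value is not None else "(not set)")
```

If MKL_NUM_THREADS still shows 1 here, it is probably being set somewhere other than the line that was deleted, for example in a shell profile or a launcher script.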

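Edit: To clarify, I am trying to get a single calculation such as numpy.dot() to use multiple threads on its own through the MKL implementation. I am not trying to exploit the fact that numpy calculations release the GIL in order to make multithreading with other functions easier.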

Here is a small script that should use multithreading but does not on my machine:

import numpy as np

a = np.random.randn(1000, 10000)
b = np.random.randn(10000, 1000)

np.dot(a, b) #this line should be multi-threaded

1 answer:

Answer 0 (score: 7)

This article seems to imply that numpy intelligently parallelizes certain operations, depending on the predicted speedup of the operation:

  • "If your numpy/scipy is compiled using one of these, then dot() will be computed in parallel (if this is faster) without you doing anything."

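Depending on the heuristic numpy uses to decide when to parallelize a particular dot() call, perhaps your small(-ish) test case would not show a significant speedup? Maybe try a ridiculously large operation and see whether both cores are used?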
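
One way to try this without watching a system monitor is to compare CPU time with wall-clock time for dot() at increasing sizes. This is only a rough sketch: it assumes Python 3.3 or later (for time.perf_counter and time.process_time), the function name cpu_to_wall_ratio and the chosen sizes are made up for illustration, and the ratio is a crude indicator rather than a precise measurement.

```python
import time

import numpy as np


def cpu_to_wall_ratio(n, repeats=3):
    """Time np.dot on two n-by-n matrices.

    Returns (cpu_seconds / wall_seconds, wall_seconds). process_time()
    sums CPU time over all threads in the process, so a ratio clearly
    above 1 suggests more than one core was busy.
    """
    a = np.random.randn(n, n)
    b = np.random.randn(n, n)
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(repeats):
        np.dot(a, b)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall, wall


if __name__ == "__main__":
    # Hypothetical sizes; increase them if every run finishes too quickly.
    for n in (500, 1000, 1500):
        ratio, wall = cpu_to_wall_ratio(n)
        print("n=%d  wall=%.3fs  cpu/wall=%.2f" % (n, wall, ratio))
```

A ratio that stays near 1 even for the largest size would be consistent with dot() running on a single core.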

As a side note, does your processor/machine configuration actually support BLAS?