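What is the state of the art for getting numpy to use multiple cores (on Intel hardware) for things like inner and outer vector products, vector-matrix multiplications, etc.?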
I am happy to rebuild numpy if necessary, but at this point I am looking for ways to speed things up without changing my code.
For reference, my show_config() output is below, and I have never observed numpy using more than one core:
atlas_threads_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/local/atlas-3.9.16/lib']
    language = f77
    include_dirs = ['/usr/local/atlas-3.9.16/include']
blas_opt_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/local/atlas-3.9.16/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.9.16\\""')]
    language = c
    include_dirs = ['/usr/local/atlas-3.9.16/include']
atlas_blas_threads_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/local/atlas-3.9.16/lib']
    language = c
    include_dirs = ['/usr/local/atlas-3.9.16/include']
lapack_opt_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/local/atlas-3.9.16/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.9.16\\""')]
    language = f77
    include_dirs = ['/usr/local/atlas-3.9.16/include']
lapack_mkl_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE
mkl_info:
    NOT AVAILABLE
Answer 0 (score: 7)
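You should start by checking whether the Atlas build that numpy is using was itself built with multithreading. You can compile and run the following program to print the Atlas build configuration (taken from the Atlas FAQ):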
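#include <stdlib.h>   /* for exit() */

int main(void)
/*
 * Compile, link and run with something like:
 *    gcc -o xprint_buildinfo xprint_buildinfo.c -L[ATLAS lib dir] -latlas ; ./xprint_buildinfo
 * (this assumes the file is saved as xprint_buildinfo.c)
 * if link fails, you are using ATLAS version older than 3.3.6.
 */
{
   void ATL_buildinfo(void);   /* provided by libatlas */
   ATL_buildinfo();            /* prints how this ATLAS library was built */
   exit(0);
}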
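If you don't have a multithreaded build of Atlas, then "there's your problem". If it is multithreaded, you need to exercise one of the multithreaded BLAS 3 routines (probably dgemm) with a suitably large matrix-matrix product and see whether threads are actually used. I believe I am right in saying that neither the BLAS 2 nor the BLAS 1 routines in Atlas support multithreading, and with good reason: there is no performance benefit except at truly enormous problem sizes.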
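One rough way to run that dgemm test from Python is to time a large matrix-matrix product and compare the CPU time consumed by the process with the elapsed wall-clock time. This is only a sketch: the helper name cpu_to_wall_ratio, the matrix size and the repeat count are arbitrary choices made for illustration, and it assumes that np.dot on two float64 matrices is handed to the BLAS dgemm routine, which depends on how numpy was built.

```python
import time

import numpy as np


def cpu_to_wall_ratio(n=2000, repeats=3):
    """Time an n x n float64 matrix product; return CPU time / wall time.

    time.process_time() sums CPU time over all threads of the process,
    so a ratio well above 1 suggests several threads were busy at once.
    """
    rng = np.random.default_rng(0)
    a = rng.random((n, n))
    b = rng.random((n, n))
    np.dot(a, b)  # warm-up, so one-time setup costs are not measured

    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(repeats):
        # Assumption: this float64 product is dispatched to BLAS dgemm.
        np.dot(a, b)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall


if __name__ == "__main__":
    print(f"CPU time / wall time: {cpu_to_wall_ratio():.2f}")
```

A ratio close to 1 is consistent with a single-threaded dgemm, while a ratio approaching the number of cores points to a threaded one. Watching the process in a tool such as top while the product runs gives the same information less formally.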