Question

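I am doing some performance analysis, and I would like to know whether numpy vectorizes its standard array operations when the data type is known (double).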

import numpy as np
a, b = np.ones(1000), np.ones(1000)  # any two float64 arrays
c = a + b  # Is this vectorized?

Edit: Is this operation vectorized, i.e. does the computation consist of SIMD operations?

Answer 1

Yes, they are.

/*
 * This file is for the definitions of simd vectorized operations.
 *
 * Currently contains sse2 functions that are built on amd64, x32 or
 * non-generic builds (CFLAGS=-march=...)
 * In future it may contain other instruction sets like AVX or NEON detected
 * at runtime in which case it needs to be included indirectly via a file
 * compiled with special options (or use gcc target attributes) so the binary
 * stays portable.
 */

Link: Numpy simd.inc.src on GitHub.
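To see which SIMD instruction sets your own numpy build detected at runtime, something like the following rough sketch may help. Note the assumptions: `__cpu_features__` is a private, undocumented attribute whose module path differs between numpy versions, so the helper `detected_cpu_features` (a name made up for this example) tries both known locations and returns an empty list if neither works.

```python
import importlib


def detected_cpu_features():
    """Return the SIMD feature names numpy reports as available, or [] if unknown.

    Assumption: relies on the private `__cpu_features__` dict, which is not a
    public API and may move or disappear in future numpy releases.
    """
    candidates = (
        "numpy._core._multiarray_umath",  # numpy 2.x layout
        "numpy.core._multiarray_umath",   # numpy 1.x layout
    )
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        features = getattr(module, "__cpu_features__", None)
        if isinstance(features, dict):
            # Keep only the features flagged as supported on this machine.
            return sorted(name for name, supported in features.items() if supported)
    return []


feats = detected_cpu_features()
print(feats)
```

On an x86-64 machine the list would typically contain entries such as SSE2, but the exact contents depend on the CPU and on how numpy was built.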

Answer 2

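I noticed a comment from Quazi Irfan on henrikstroem's answer saying that numpy does not exploit vectorization, citing a blog post in which the author "proves" this by experiment.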

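So I went through the blog and found a gap that may lead to a different conclusion: for numpy arrays a and b, the arithmetic a*b is not the same as np.dot(a,b). The a*b that the blog author tested is just element-wise multiplication, not a matrix multiplication (np.dot(a,b)), and not even a vector inner product. Yet the author still compared a*b against the original experiment, which ran np.dot(a,b). The complexity of these two operations is very different!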
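A small illustration of that difference, with made-up values chosen only for this example:

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

# Element-wise product: one multiplication per element, n*n in total.
elementwise = a * b

# Matrix product: each output element is a sum of n products, about n**3 in total.
matrix_product = np.dot(a, b)

# For 1-D arrays, np.dot is the inner product, again not the same as v * w.
v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])
inner = np.dot(v, w)

print(elementwise)     # [[ 5. 12.] [21. 32.]]
print(matrix_product)  # [[19. 22.] [43. 50.]]
print(inner)           # 32.0
```

Timing one of these against the other therefore compares different amounts of work, not vectorized against non-vectorized code.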

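Numpy certainly takes advantage of vectorization through SIMD and BLAS, as can be seen in its source code. The official numpy distribution supports parallel execution for a set of operations (such as np.dot), but not for every function (np.where and np.mean, for example). The blog author may have chosen an inappropriate (non-vectorized) function for the comparison.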

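We can also see this in multi-core CPU usage. While numpy.dot() executes, all cores run at high utilization. Hence numpy must be running vectorized code (via BLAS); otherwise CPython's GIL would restrict it to a single core.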

Answer 3

Take a look at a basic example

import numpy as np

x = np.array([1, 2, 3], np.int32)
print(type(x))
y = np.array([6, 7, 8], np.int32)
print(type(y))

Now we add these two arrays

z = x + y
print(z)
print(type(z))

So we get

<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
[ 7  9 11]
<class 'numpy.ndarray'>

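Vectorized, yes. But the term vector has different meanings in mathematics and physics, and here we use arrays as a mathematical abstraction.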
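To make the array-level sense of "vectorized" concrete, here is a minimal sketch comparing an explicit Python loop with the whole-array expression. It only shows that the two give the same result; by itself it says nothing about whether SIMD instructions are used underneath.

```python
import numpy as np

x = np.array([1, 2, 3], np.int32)
y = np.array([6, 7, 8], np.int32)

# Explicit element-by-element loop in Python.
z_loop = np.empty_like(x)
for i in range(len(x)):
    z_loop[i] = x[i] + y[i]

# "Vectorized" in the numpy sense: one expression over whole arrays,
# with the loop running inside numpy's compiled code.
z_vec = x + y

print(np.array_equal(z_loop, z_vec))  # True
```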
