Question

Is it possible to multiply two ndarrays A and B and add the result to C without creating a large intermediate array for A times B?

For the case C = A times B, NumPy has the out keyword argument:

numpy.multiply(A, B, out=C)

What about the case C += A times B?

Answer 1

NumPy only supports one operation at a time. That said, there are several workarounds.

In-place operations

The simplest solution is to use in-place operations via += and *=:

import numpy as np

n = 100
b = 5.0

x = np.random.rand(n)
y = np.random.rand(n)

z = b * x
z += y
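The same idea can be written with explicit out arguments. The following is a minimal sketch, not part of the original answer: the names tmp and C_before are illustrative, and the arrays are assumed to share one shape and dtype. Note that the C += A * B variant still needs one scratch buffer the size of the product; the gain is only that the buffer can be allocated once and reused across calls.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
b = 5.0
x = rng.random(n)
y = rng.random(n)

# Preallocate the result once; both ufunc calls write into it.
z = np.empty_like(x)
np.multiply(x, b, out=z)   # z = b * x, written straight into z
np.add(z, y, out=z)        # z = z + y, in place

# For C += A * B, a reusable scratch buffer (illustrative name) holds the product.
A = rng.random(n)
B = rng.random(n)
C = rng.random(n)
C_before = C.copy()        # kept only so the result can be checked
tmp = np.empty_like(C)
np.multiply(A, B, out=tmp)
C += tmp
```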

BLAS

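You can access the underlying BLAS routines and apply them manually. Unfortunately, there is no multiply-add routine, but there is the "AXPY" routine, which performs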

y <- a * x + y

It can be called as follows:

import scipy.linalg

axpy = scipy.linalg.blas.get_blas_funcs('axpy', arrays=(x, y))
axpy(x, y, n, b)
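For float64 data the routine can also be called directly as daxpy. The sketch below assumes contiguous float64 inputs and only checks the returned value. Because a is a scalar in AXPY, this covers b * x + y but not an elementwise product of two arrays.

```python
import numpy as np
from scipy.linalg import blas

rng = np.random.default_rng(0)
n = 100
b = 5.0
x = rng.random(n)
y = rng.random(n)
expected = b * x + y   # reference value, computed before y is passed to BLAS

# daxpy is the double-precision AXPY routine; it returns a * x + y.
z = blas.daxpy(x, y, a=b)
```

With contiguous float64 arrays SciPy is expected to update y in place and return it, so code that still needs the old y should copy it first.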

Numexpr

Another option is to use a package such as numexpr, which lets you compile expressions:

import numexpr

z = numexpr.evaluate('b * x + y')
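numexpr.evaluate also accepts an out argument, which can be pointed at C to get the effect of C += A * B. This is a minimal sketch under the assumption that A, B and C have the same shape and dtype; it passes the operands through local_dict explicitly rather than relying on frame lookup. It does not verify how numexpr buffers internally when out aliases an input.

```python
import numexpr
import numpy as np

rng = np.random.default_rng(0)
A = rng.random(1000)
B = rng.random(1000)
C = rng.random(1000)
expected = C + A * B   # reference value for checking the result

# Evaluate C + A * B and store the result back into C.
numexpr.evaluate(
    "C + A * B",
    local_dict={"A": A, "B": B, "C": C},
    out=C,
)
```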

Theano

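Recently, several machine learning packages have started to support compiled expressions; one such package is theano. You could do something like this: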

import theano

x_sym = theano.tensor.vector()             # declare symbolic variable
y_sym = theano.tensor.vector()             # declare symbolic variable

out = b * x_sym + y_sym                    # build symbolic expression
f = theano.function([x_sym, y_sym], out)   # compile function

z = f(x, y)                                # evaluate on the NumPy arrays x and y

Answer 2

I compared the different variants and found that you can't go wrong with SciPy's BLAS interface:

scipy.linalg.blas.daxpy(x, y, len(x), a)

Code to reproduce the plot:

import numexpr
import numpy as np
import perfplot
import scipy.linalg
import theano

a = 1.36

# theano preps
x = theano.tensor.vector()
y = theano.tensor.vector()
out = a * x + y
f = theano.function([x, y], out)


def setup(n):
    x = np.random.rand(n)
    y = np.random.rand(n)
    return x, y


def manual_axpy(data):
    x, y = data
    return a * x + y


def manual_axpy_inplace(data):
    x, y = data
    out = a * x
    out += y
    return out


def scipy_axpy(data):
    x, y = data
    n = len(x)
    axpy = scipy.linalg.blas.get_blas_funcs("axpy", arrays=(x, y))
    axpy(x, y, n, a)
    return y


def scipy_daxpy(data):
    x, y = data
    return scipy.linalg.blas.daxpy(x, y, len(x), a)


def numpexpr_evaluate(data):
    x, y = data
    return numexpr.evaluate("a * x + y")


def theano_function(data):
    x, y = data
    return f(x, y)


b = perfplot.bench(
    setup=setup,
    kernels=[
        manual_axpy,
        manual_axpy_inplace,
        scipy_axpy,
        scipy_daxpy,
        numpexpr_evaluate,
        theano_function,
    ],
    n_range=[2 ** k for k in range(24)],
    equality_check=None,
    xlabel="len(x), len(y)",
)
# b.save("out.png")
b.show()

Answer 3

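As far as I understand, NumPy array operations can only be performed one at a time, but by putting the computation inside a function you can make sure the intermediate array does not stay in memory, as a commenter suggested.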
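One way to combine the function idea with a bounded temporary is to process the arrays in blocks. The helper below is a hypothetical sketch, not something from the answers above: the name multiply_add_chunked and the block_rows parameter are made up for illustration, and A, B and C are assumed to share the same shape and dtype.

```python
import numpy as np


def multiply_add_chunked(A, B, C, block_rows=1024):
    """Compute C += A * B in place, one block of rows at a time.

    Hypothetical helper: assumes A, B and C share the same shape and dtype.
    """
    rows = A.shape[0]
    # The scratch buffer holds at most `block_rows` rows, not the full product.
    tmp = np.empty((min(block_rows, rows),) + A.shape[1:], dtype=C.dtype)
    for start in range(0, rows, block_rows):
        stop = min(start + block_rows, rows)
        scratch = tmp[: stop - start]
        np.multiply(A[start:stop], B[start:stop], out=scratch)
        C[start:stop] += scratch
    return C


rng = np.random.default_rng(0)
A = rng.random((2500, 3))
B = rng.random((2500, 3))
C = rng.random((2500, 3))
expected = C + A * B   # reference value, computed before C is modified

result = multiply_add_chunked(A, B, C, block_rows=1000)
```

The scratch buffer is local to the function, so it is released when the function returns. A smaller block_rows lowers peak memory at the cost of more Python-level loop iterations.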

NumPy fused multiply and add to avoid wasting memory
