I have a batch of b 2D m x n grayscale images that I'm convolving with a single p x q filter and then mean-pooling. In pure NumPy I want to compute the derivatives of the loss with respect to both the input images and the filter, but I'm having trouble with the derivative with respect to the input.
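For reference, here is a minimal sketch of the forward pass I mean — the function name `conv2d_meanpool` is just for illustration, and the `1 / (p * q)` factor is the mean over each `p x q` window:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def conv2d_meanpool(x, f):
    """Correlate each p x q window of x with f, then divide by p * q (the mean)."""
    b, m, n = x.shape
    p, q = f.shape
    r, s = m - p + 1, n - q + 1
    # wx[b, r, s, i, j] = x[b, r + i, s + j]; strides assume x is C-contiguous
    wx = as_strided(x, (b, r, s, p, q),
                    np.array([m * n, n, 1, n, 1]) * x.itemsize)
    return np.einsum('brspq,pq->brs', wx, f) / (p * q)
```

My current code for the derivatives: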
```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def conv2d_derivatives(x, f, dy):
    """
    dimensions:
        b = batch size
        m = input image height
        n = input image width
        p = filter height
        q = filter width
        r = output height
        s = output width
    input:
        x  = input image (b x m x n)
        f  = filter (p x q)
        dy = derivative of some loss w.r.t. y (b x r x s)
    output:
        df = derivative of loss w.r.t. f (p x q)
        dx = derivative of loss w.r.t. x (b x m x n)
    notes:
        wx  = windowed view of x s.t. wx[b, r, s] = the window of x used to compute y[b, r, s]
        vdx = a windowed view of dx, strided the same way as wx
    """
    b, m, n = x.shape
    p, q = f.shape
    r = m - p + 1
    s = n - q + 1
    # wx[b, r, s, i, j] = x[b, r + i, s + j]; strides assume x (and dx) are C-contiguous
    wx = as_strided(x, (b, r, s, p, q), np.array([m * n, n, 1, n, 1]) * x.itemsize)

    # This derivative is correct
    df = 1 / (p * q) * np.einsum('brspq,brs->pq', wx, dy)

    # Method 1: this derivative is incorrect
    dx = np.zeros_like(x)
    vdx = as_strided(dx, (b, r, s, p, q), np.array([m * n, n, 1, n, 1]) * dx.itemsize)
    np.einsum('pq,brs->brspq', f, dy, out=vdx)
    dx /= (p * q)

    # Method 2: this derivative is correct, but it's slow and memory-intensive
    dx = np.zeros_like(x)
    vdx = as_strided(dx, (b, r, s, p, q), np.array([m * n, n, 1, n, 1]) * dx.itemsize)
    prod = f[None, None, None, :, :] * dy[:, :, :, None, None]
    for index in np.ndindex(*vdx.shape):
        vdx[index] += prod[index]
    dx /= (p * q)

    return df, dx
```
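A finite-difference check along these lines separates Method 1 from Method 2 (a rough sketch only, reusing `conv2d_meanpool` from above and `y.sum()` as a stand-in loss):

```python
def numeric_dx(x, f, eps=1e-6):
    """Brute-force estimate of d(sum(y))/dx via central differences (checking only)."""
    dx_num = np.zeros_like(x)
    for index in np.ndindex(*x.shape):
        xp = x.copy(); xp[index] += eps
        xm = x.copy(); xm[index] -= eps
        dx_num[index] = (conv2d_meanpool(xp, f).sum()
                         - conv2d_meanpool(xm, f).sum()) / (2 * eps)
    return dx_num

x  = np.random.rand(2, 6, 7)
f  = np.random.rand(3, 3)
dy = np.ones((2, 4, 5))                   # d(sum(y))/dy is all ones
_, dx = conv2d_derivatives(x, f, dy)
print(np.allclose(dx, numeric_dx(x, f)))  # True -- the dx returned above is Method 2's
```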
I know that the derivative of the loss w.r.t. the windowed input wx[b, r, s, p, q] is just 1/(p*q) * f[p, q] * dy[b, r, s]. However, I don't want to compute the derivatives w.r.t. wx explicitly and store them in memory, because that array would be huge.
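Written out, scattering those per-window derivatives back into dx (which is exactly what Method 2 accumulates element by element) amounts to, for each input pixel (i, j):

$$
\frac{\partial L}{\partial x[b, i, j]} = \frac{1}{pq} \sum_{\substack{0 \le i - r < p \\ 0 \le j - s < q}} f[i - r,\, j - s] \cdot dy[b, r, s]
$$

where the sum runs over the valid output positions (r, s).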
I thought I could build a windowed view vdx of dx, analogous to wx, and run an einsum into it, hoping that einsum would accumulate, i.e. vdx[b, r, s, p, q] += f[p, q] * dy[b, r, s], but it actually assigns vdx[b, r, s, p, q] = f[p, q] * dy[b, r, s]. If there were a way to specify something like out_add_to in einsum, my problem would be solved.
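A tiny example of the overwrite behaviour I mean (nothing to do with my strided views, just np.einsum with out=):

```python
import numpy as np

out = np.ones(3)
np.einsum('i->i', np.full(3, 2.0), out=out)
print(out)  # [2. 2. 2.] -- the old contents of `out` are replaced, not added to
```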
How can I compute dx in pure NumPy without storing a large b x r x s x p x q array? I can't use scipy or any other dependency to solve this.