I'd like to know how to convert all of these lists to NumPy arrays and then rewrite the loop below in vectorized form with NumPy.
# dcdw1 = m x m array
# a1 = len(x) x m array
# a2 = len(x) x 1 array
# w2 = m x 1 array
# x = len(x) x m array
# y = len(x) x 1 array
for i in range(len(x)):
    for j in range(m):
        for k in range(m):
            dcdw1[k, j] = (a2[i] - y[i]) * a2[i] * (1 - a2[i]) * w2[j] * a1[i, j] * (1 - a1[i, j]) * x[i, k]
# other stuff that uses dcdw1
Answer 0 (score: 1)
# dcdw1 = m x m array
# a1 = len(x) x m array
# a2 = len(x) x 1 array
# w2 = m x 1 array
# x = len(x) x m array
# y = len(x) x 1 array
import numpy as np
m = 10
lx = 4 # len(x)
dcdw1 = np.zeros([lx, m, m])
dcdw2 = np.zeros_like(dcdw1)
a1 = np.ones([lx, m]) * 0.5
a2 = np.ones([lx, 1]) * 2
w2 = np.ones([m, 1]) * 3
x = np.ones([lx, m]) * 4
y = np.ones([lx, 1]) * 5
for i in range(lx):
    for j in range(m):
        for k in range(m):
            # index [i, 0] and [j, 0] to pull scalars out of the column vectors
            dcdw1[i, k, j] = (a2[i, 0] - y[i, 0]) * a2[i, 0] * (1 - a2[i, 0]) * w2[j, 0] * a1[i, j] * (1 - a1[i, j]) * x[i][k]
# Why are you using j on rows and k on columns? anyways
print(dcdw1[-1])
first_term = np.reshape((a2 - y) * a2 * (1 - a2), [lx, 1, 1])
# one scalar per sample i, shaped so that it broadcasts over that sample's matrix
# corresponds to (a2[i] - y[i]) * a2[i] * (1 - a2[i])
print(first_term.shape)  # (lx, 1, 1)
a1_term = (a1 * (1 - a1))[:, :, np.newaxis]
# element-wise product of shape [lx, m], extended to [lx, m, 1]
print(a1_term.shape)
row_level_term = a1_term * w2  # element-wise multiplication yet again
# w2 is [m, 1], so it is broadcast across every sample
row_level_tensor = first_term * row_level_term
# this applies the first term to every sample -> [lx, m, 1], indexed [i, j, 0]
print(row_level_tensor.shape)
x = np.reshape(x, [lx, 1, m])
# x is weird: within each sample i, x[i][k] only scales the vector above,
# so every m x m matrix is that vector stacked m times with different
# coefficients, i.e. a linearly dependent (rank-1) matrix
print(x.shape)
dcdw2 = np.matmul(row_level_tensor, x)  # [m, 1] x [1, m] product, done lx times
# matmul indexes the result as [i, j, k]; swap the last two axes so the
# layout matches dcdw1[i, k, j] from the loop (uniform test data hides this)
dcdw2 = np.swapaxes(dcdw2, 1, 2)
print(dcdw2[-1])
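It's ugly, but it gets the job done (two reshapes and a new axis, hmm; I guess people don't usually do element-wise matrix operations on tensors, at least I don't). I didn't like overwriting dcdw1, so the code above builds a tensor whose last element is your current dcdw1. I compared it against your serial loop code and the results are the same. You will need to adjust your current code a little, though.
Here is a Colab link to the code.
Improvements and suggestions are most welcome.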
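As one possible alternative to the reshape-and-matmul route, the same per-sample tensor can be sketched with np.einsum, whose index string mirrors the loop indices directly. This is only a sketch under assumptions: the sizes, the seed, the random inputs, and the names rng, err, temp, dcdw_loop, and dcdw_einsum are illustrative and not from the question. Random inputs are used because uniform inputs cannot reveal a swapped axis.

```python
import numpy as np

# Hypothetical sizes and random test data (not from the question).
rng = np.random.default_rng(0)
m, lx = 5, 4
a1 = rng.random((lx, m))
a2 = rng.random((lx, 1))
w2 = rng.random((m, 1))
x = rng.random((lx, m))
y = rng.random((lx, 1))

# Reference: the triple loop, storing one m x m matrix per sample i.
dcdw_loop = np.zeros((lx, m, m))
for i in range(lx):
    for j in range(m):
        for k in range(m):
            dcdw_loop[i, k, j] = (
                (a2[i, 0] - y[i, 0]) * a2[i, 0] * (1 - a2[i, 0])
                * w2[j, 0] * a1[i, j] * (1 - a1[i, j]) * x[i, k]
            )

# Vectorized: fold everything indexed by (i, j) into one [lx, m] array,
# then take the outer product with x for each sample.
err = (a2 - y) * a2 * (1 - a2)                  # [lx, 1]
temp = err * w2.T * a1 * (1 - a1)               # [lx, m], indexed [i, j]
dcdw_einsum = np.einsum("ij,ik->ikj", temp, x)  # [lx, m, m], indexed [i, k, j]

print(np.allclose(dcdw_loop, dcdw_einsum))
```

The output subscripts "ikj" are what place k on rows and j on columns, matching dcdw1[i, k, j] in the loop.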
Answer 1 (score: 1)
In this line
dcdw1[k, j] = (a2[i] - y[i]) * a2[i] * (1 - a2[i]) * w2[j] * a1[i, j] * (1 - a1[i, j]) * x[i, k]
the long part, (a2[i] - y[i]) * a2[i] * (1 - a2[i]) * w2[j] * a1[i, j] * (1 - a1[i, j]) (which I will assign to temp), produces a len(x) x m array, and x is also a len(x) x m array. So you cannot get an m x m array here with the * operator alone.
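A quick shape check illustrates the point. The sizes below are made up and only the shapes matter; the matrix product is shown as one operation that does yield m x m, on the assumption that summing over i is acceptable.

```python
import numpy as np

lx, m = 4, 3  # hypothetical sizes: len(x) = 4, m = 3
temp = np.ones((lx, m))
x = np.ones((lx, m))

# The element-wise product keeps the shape [lx, m].
elementwise_shape = (temp * x).shape
print(elementwise_shape)

# A matrix product contracts (sums over) the len(x) axis and yields m x m.
matrix_shape = np.dot(temp.T, x).shape
print(matrix_shape)
```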
Did you mean to accumulate the result for every i in range(len(x)) into dcdw1[k, j], as follows?
dcdw1 = np.zeros([m, m])
for i in range(len(x)):
    for j in range(m):
        for k in range(m):
            # index [i, 0] and [j, 0] to pull scalars out of the column vectors
            dcdw1[k, j] += (a2[i, 0] - y[i, 0]) * a2[i, 0] * (1 - a2[i, 0]) * w2[j, 0] * a1[i, j] * (1 - a1[i, j]) * x[i][k]
If so, this is the code you want:
import numpy as np
# dcdw1 = m x m array
# a2 = len(x) x 1 array
# y = len(x) x 1 array
# w2 = m x 1 array
# a1 = len(x) x m array
# x = len(x) x m
temp = (a2-y) * a2 * (1-a2) * w2.T * a1 * (1-a1)
dcdw1 = np.dot(temp.T, x).T
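Why use w2.T? Because w2 is a column vector of shape m x 1, it cannot be broadcast against a len(x) x m array: their row counts do not match. Instead, I transpose w2 so that its number of columns matches that of a1 * (1-a1). The same reasoning applies to temp.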
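To check both claims, here is a minimal sketch that compares the vectorized expression against the accumulating loop and confirms that the untransposed w2 does not broadcast. The sizes, seed, random inputs, and the names rng, dcdw1_loop, dcdw1_vec, and broadcast_ok are assumptions made for illustration, with m deliberately different from len(x).

```python
import numpy as np

# Hypothetical sizes and random test data (not from the question).
rng = np.random.default_rng(0)
m, lx = 3, 4
a1 = rng.random((lx, m))
a2 = rng.random((lx, 1))
w2 = rng.random((m, 1))
x = rng.random((lx, m))
y = rng.random((lx, 1))

# Reference: the loop that sums the contribution of every sample i.
dcdw1_loop = np.zeros((m, m))
for i in range(lx):
    for j in range(m):
        for k in range(m):
            dcdw1_loop[k, j] += (
                (a2[i, 0] - y[i, 0]) * a2[i, 0] * (1 - a2[i, 0])
                * w2[j, 0] * a1[i, j] * (1 - a1[i, j]) * x[i, k]
            )

# Vectorized version from the answer.
temp = (a2 - y) * a2 * (1 - a2) * w2.T * a1 * (1 - a1)  # [lx, m]
dcdw1_vec = np.dot(temp.T, x).T                          # [m, m], indexed [k, j]

# Without the transpose, [m, 1] and [lx, m] do not broadcast when m != lx.
try:
    w2 * a1
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

print(np.allclose(dcdw1_loop, dcdw1_vec), broadcast_ok)
```

Since (A.T B).T equals B.T A, np.dot(temp.T, x).T could also be written as x.T @ temp, which avoids the final transpose.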