I am building a single-layer neural network by hand in Python, without using a package like TensorFlow. Each input is a 500-dimensional one-hot vector, and the output is a 3-dimensional vector giving the probability of each class.
The network works, but the number of training instances is very large, slightly over 1 million, and I need to run at least 3 epochs, so I cannot find an efficient way to store the weight matrices.
I have tried representing the weights both as 3-dimensional NumPy random arrays and as dictionaries, and then performing the weight updates on those. In the 3-D arrays, the first axis is the number of training instances, and the other two axes match the input and hidden-layer dimensions. Both approaches work on a small sample, but the program dies on the full sample.
    # feature.shape[0] is the number of training samples; feature.shape[1] is 500
    # d is the dimension of the hidden layer

    # using 3-D matrices
    w_1 = np.random.rand(feature.shape[0], d, feature.shape[1])
    b_1 = np.random.rand(feature.shape[0], 1, d)
    w_2 = np.random.rand(feature.shape[0], 3, d)
    b_2 = np.random.rand(feature.shape[0], 1, 3)

    # iterate through every training epoch
    for iteration in range(epoch):
        correct, i = 0, 0
        # iterate through every training instance
        while i < feature.shape[0]:
            # net and out for the hidden layer
            net1 = feature[i].toarray().flatten().dot(w_1[i].T) + b_1[i].flatten()
            h_1 = sigmoid(net1)
            # net and out for the output layer
            y_hat = h_1.dot(w_2[i].T) + b_2[i].flatten()
            prob = softmax(y_hat)
            loss = squared_loss(label[i], prob)
            # backpropagation steps omitted here
            i += 1
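A quick back-of-the-envelope estimate (a sketch; the hidden size `d = 100` is an assumption, since `d` is not given in the snippet) shows roughly how large the per-instance weight tensor `w_1` alone becomes, which is consistent with the program dying on the full sample:

```python
# Rough memory estimate for w_1 with shape (n_samples, d, n_features),
# stored as float64 (8 bytes per element).
n_samples = 1_000_000   # "slightly over 1 million" training instances
n_features = 500        # one-hot input dimension
d = 100                 # hidden-layer size (assumed here for illustration)

bytes_w1 = n_samples * d * n_features * 8
print(f"w_1 alone: {bytes_w1 / 1e9:.0f} GB")  # 400 GB for this d
```

And that is before counting `w_2`, the biases, or the activations, so the total footprint grows linearly with the number of training instances.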
    # using dictionaries
    w_1 = {i: np.random.rand(d, feature.shape[1]) for i in range(feature.shape[0])}
    b_1 = {i: np.random.rand(d) for i in range(feature.shape[0])}
    w_2 = {i: np.random.rand(3, d) for i in range(feature.shape[0])}
    b_2 = {i: np.random.rand(3) for i in range(feature.shape[0])}

    for iteration in range(epoch):
        correct, i = 0, 0
        while i < feature.shape[0]:
            # net and out for the hidden layer
            net1 = feature[i].toarray().flatten().dot(w_1[i].T) + b_1[i]
            h_1 = sigmoid(net1)
            # output and probabilities
            y_hat = h_1.dot(w_2[i].T) + b_2[i]
            prob = softmax(y_hat)
            loss = squared_loss(label[i], prob)
            # backpropagation steps omitted here
            i += 1
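For completeness, the helper functions `sigmoid`, `softmax`, and `squared_loss` used above are not shown in the snippets; a minimal sketch of what they might look like (assuming `label[i]` is an integer class index, which is an assumption on my part) is:

```python
import numpy as np

def sigmoid(x):
    # elementwise logistic function
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # numerically stable softmax: subtract the max before exponentiating
    e = np.exp(x - np.max(x))
    return e / e.sum()

def squared_loss(y_true, y_prob):
    # y_true assumed to be an integer class label; compare against a one-hot target
    target = np.zeros_like(y_prob)
    target[y_true] = 1.0
    return 0.5 * np.sum((target - y_prob) ** 2)
```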
As you can see, I need to initialize all the weights up front so that, as the network iterates through the epochs, they can be updated and are not lost. But this is inefficient, and the program dies!
So, can anyone suggest a better way to store the weights and update them across training epochs?
Any help is greatly appreciated!