I am trying to implement a 2-layer neural network using numpy alone. The code below only computes forward propagation. The training data consists of two examples, where the inputs are 5-dimensional and the outputs are 4-dimensional. When I try to run my network:
# Two Layer Neural network
import numpy as np
M = 2
learning_rate = 0.0001
X_train = np.asarray([[1,1,1,1,1] , [1,1,1,1,1]])
Y_train = np.asarray([[0,0,0,0] , [1,0,0,0]])
X_trainT = X_train.T
Y_trainT = Y_train.T
def sigmoid(z):
s = 1 / (1 + np.exp(-z))
return s
w1 = np.zeros((Y_trainT.shape[0], X_trainT.shape[0]))  # layer-1 weights, shape (4, 5)
b1 = np.zeros((Y_trainT.shape[0], 1))                  # layer-1 bias, shape (4, 1)
A1 = sigmoid(np.dot(w1 , X_trainT))                    # hidden activations, shape (4, 2)
w2 = np.zeros((A1.shape[0], w1.shape[0]))              # layer-2 weights, shape (4, 4)
b2 = np.zeros((A1.shape[0], 1))                        # layer-2 bias, shape (4, 1)
A2 = sigmoid(np.dot(w2 , A1))                          # output activations, shape (4, 2)
# forward propagation
dw1 = ( 1 / M ) * np.dot((A1 - A2) , X_trainT.T / M)
db1 = (A1 - A2).mean(axis=1, keepdims=True)
w1 = w1 - learning_rate * dw1
b1 = b1 - learning_rate * db1
dw2 = ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M)
db2 = (A2 - Y_trainT).mean(axis=1, keepdims=True)
w2 = w2 - learning_rate * dw2
b2 = b2 - learning_rate * db2
Y_prediction_train = sigmoid(np.dot(w2 , X_train) +b2)
print(Y_prediction_train.T)
I receive the error:
ValueError Traceback (most recent call last)
<ipython-input-42-f0462b5940a4> in <module>()
36 b2 = b2 - learning_rate * db2
37
---> 38 Y_prediction_train = sigmoid(np.dot(w2 , X_train) +b2)
39 print(Y_prediction_train.T)
ValueError: shapes (4,4) and (2,5) not aligned: 4 (dim 1) != 2 (dim 0)
I seem to have gone astray somewhere in the linear algebra, but I don't know where. Printing the weights and the corresponding derivatives:
print(w1.shape)
print(w2.shape)
print(dw1.shape)
print(dw2.shape)
prints:
(4, 5)
(4, 4)
(4, 5)
(4, 4)
How do I feed training examples with 5-dimensional inputs into this network? Have I implemented forward propagation correctly?

Update: following @Imran's answer, I am now using this network:
# Two Layer Neural network
import numpy as np
M = 2
learning_rate = 0.0001
X_train = np.asarray([[1,0,1,1,1] , [1,1,1,1,1]])
Y_train = np.asarray([[0,1,0,0] , [1,0,0,0]])
X_trainT = X_train.T
Y_trainT = Y_train.T
def sigmoid(z):
s = 1 / (1 + np.exp(-z))
return s
w1=np.zeros((Y_trainT.shape[0], X_trainT.shape[0]))
b1=np.zeros((Y_trainT.shape[0], 1))
A1 = sigmoid(np.dot(w1 , X_trainT))
w2=np.zeros((A1.shape[0], w1.shape[0]))
b2=np.zeros((A1.shape[0], 1))
A2 = sigmoid(np.dot(w2 , A1))
# forward propagation
dw1 = ( 1 / M ) * np.dot((A1 - A2) , X_trainT.T / M)
db1 = (A1 - A2).mean(axis=1, keepdims=True)
w1 = w1 - learning_rate * dw1
b1 = b1 - learning_rate * db1
dw2 = ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M)
db2 = (A2 - Y_trainT).mean(axis=1, keepdims=True)
w2 = w2 - learning_rate * dw2
b2 = b2 - learning_rate * db2
Y_prediction_train = sigmoid(np.dot(w2 , A1) +b2)
print(Y_prediction_train.T)
which prints:

[[ 0.5 0.5 0.4999875 0.4999875]
[ 0.5 0.5 0.4999875 0.4999875]]

I think

dw2 = ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M)

should be

dw2 = ( 1 / M ) * np.dot((A2 - A1) , A1.T / M)

in order to propagate the difference from hidden layer 1 to hidden layer 2. Is this correct?
Answer (score: 1):
Y_prediction_train = sigmoid(np.dot(w2 , X_train) + b2)

w2 is the weight matrix for the second hidden layer. It must never be multiplied with your input, X_train.

To get a prediction, you need to break forward propagation out into its own function that accepts an input X, first computes A1 = sigmoid(np.dot(w1 , X)), and then returns the result of A2 = sigmoid(np.dot(w2 , A1)).
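A minimal sketch of what that could look like, reusing the sigmoid and the w1, b1, w2, b2 arrays defined above (the function name predict is just an illustration, and I have kept the bias terms that the rest of your code carries):

def predict(X, w1, b1, w2, b2):
    # X holds one column per example, shape (n_features, n_examples), like X_trainT
    A1 = sigmoid(np.dot(w1, X) + b1)   # hidden layer, shape (4, n_examples)
    A2 = sigmoid(np.dot(w2, A1) + b2)  # output layer, shape (4, n_examples)
    return A2

Y_prediction_train = predict(X_trainT, w1, b1, w2, b2)
print(Y_prediction_train.T)

Written this way, the same function serves for training-set and test-set predictions, and the shape error disappears because the input only ever meets w1.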
UPDATE
"I think dw2 = ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M) should be changed to dw2 = ( 1 / M ) * np.dot((A2 - A1) , A1.T / M) in order to propagate the difference from hidden layer 1 to hidden layer 2. Is this correct?"
Backpropagation propagates errors backwards. The first step is to compute the gradient of the loss function with respect to your output, which is A2 - Y if you are using mean squared error. That gradient then feeds into the terms for the gradients of the loss with respect to the weights and biases of layer 2, and so on backwards to layer 1. You do not want to propagate anything from layer 1 to layer 2 during backprop.
It looks like you almost have it in your updated question, but I think you want:
dW2 = ( 1 / M ) * np.dot((A2 - Y) , A1.T)
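For reference, here is a sketch of one full update step built around that formula, with the biases included in the forward pass. The dZ1 line, with its A1 * (1 - A1) factor (the derivative of the sigmoid), is my own addition for carrying the error back through the hidden layer's activation; the answer itself only spells out dW2:

# one gradient-descent step, with columns as examples:
# X_trainT is (5, M), Y_trainT is (4, M)
A1 = sigmoid(np.dot(w1, X_trainT) + b1)    # hidden activations, (4, M)
A2 = sigmoid(np.dot(w2, A1) + b2)          # output activations, (4, M)

dZ2 = A2 - Y_trainT                        # error signal at the output
dW2 = (1 / M) * np.dot(dZ2, A1.T)          # gradient w.r.t. layer-2 weights, (4, 4)
db2 = dZ2.mean(axis=1, keepdims=True)      # gradient w.r.t. layer-2 bias, (4, 1)

# push the error backwards through layer 2, then through layer 1's sigmoid
dZ1 = np.dot(w2.T, dZ2) * A1 * (1 - A1)    # (4, M)
dW1 = (1 / M) * np.dot(dZ1, X_trainT.T)    # gradient w.r.t. layer-1 weights, (4, 5)
db1 = dZ1.mean(axis=1, keepdims=True)      # gradient w.r.t. layer-1 bias, (4, 1)

w2 = w2 - learning_rate * dW2
b2 = b2 - learning_rate * db2
w1 = w1 - learning_rate * dW1
b1 = b1 - learning_rate * db1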
A few more notes: