Dimensions not aligned in a 2-hidden-layer neural network

Asked: 2017-12-13 22:31:20

Tags: python machine-learning neural-network linear-algebra backpropagation

I am trying to implement a 2-layer neural network using numpy alone. The code below only computes the forward propagation.

The training data consists of two examples, where each input is 5-dimensional and each output is 4-dimensional.

When I try to run my network:

# Two Layer Neural network

import numpy as np

M = 2
learning_rate = 0.0001

X_train = np.asarray([[1,1,1,1,1] , [1,1,1,1,1]])
Y_train = np.asarray([[0,0,0,0] , [1,0,0,0]])

X_trainT = X_train.T
Y_trainT = Y_train.T

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))  
    return s

w1=np.zeros((Y_trainT.shape[0], X_trainT.shape[0]))
b1=np.zeros((Y_trainT.shape[0], 1))
A1 = sigmoid(np.dot(w1 , X_trainT))

w2=np.zeros((A1.shape[0], w1.shape[0]))
b2=np.zeros((A1.shape[0], 1))
A2 = sigmoid(np.dot(w2 , A1))

# forward propagation

dw1 =  ( 1 / M ) * np.dot((A1 - A2) , X_trainT.T / M)
db1 =  (A1 - A2).mean(axis=1, keepdims=True)
w1 = w1 - learning_rate * dw1
b1 = b1 - learning_rate * db1

dw2 =  ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M)
db2 =  (A2 - Y_trainT).mean(axis=1, keepdims=True)
w2 = w2 - learning_rate * dw2
b2 = b2 - learning_rate * db2

Y_prediction_train = sigmoid(np.dot(w2 , X_train) +b2)
print(Y_prediction_train.T)

I seem to have gone astray somewhere in the linear algebra, but I can't see where. I get the error:

ValueError                                Traceback (most recent call last)
<ipython-input-42-f0462b5940a4> in <module>()
     36 b2 = b2 - learning_rate * db2
     37 
---> 38 Y_prediction_train = sigmoid(np.dot(w2 , X_train) +b2)
     39 print(Y_prediction_train.T)

ValueError: shapes (4,4) and (2,5) not aligned: 4 (dim 1) != 2 (dim 0)

Printing the weights and their corresponding derivatives:

print(w1.shape)
print(w2.shape)
print(dw1.shape)
print(dw2.shape)

How can I incorporate my 5-dimensional training examples into this network?

Am I implementing forward propagation correctly?

The print statements above output:

(4, 5)
(4, 4)
(4, 5)
(4, 4)

Following @Imran's answer, I am now using this network:

# Two Layer Neural network

import numpy as np

M = 2
learning_rate = 0.0001

X_train = np.asarray([[1,0,1,1,1] , [1,1,1,1,1]])
Y_train = np.asarray([[0,1,0,0] , [1,0,0,0]])

X_trainT = X_train.T
Y_trainT = Y_train.T

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))  
    return s

w1=np.zeros((Y_trainT.shape[0], X_trainT.shape[0]))
b1=np.zeros((Y_trainT.shape[0], 1))
A1 = sigmoid(np.dot(w1 , X_trainT))

w2=np.zeros((A1.shape[0], w1.shape[0]))
b2=np.zeros((A1.shape[0], 1))
A2 = sigmoid(np.dot(w2 , A1))

# forward propagation

dw1 =  ( 1 / M ) * np.dot((A1 - A2) , X_trainT.T / M)
db1 =  (A1 - A2).mean(axis=1, keepdims=True)
w1 = w1 - learning_rate * dw1
b1 = b1 - learning_rate * db1

dw2 =  ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M)
db2 =  (A2 - Y_trainT).mean(axis=1, keepdims=True)
w2 = w2 - learning_rate * dw2
b2 = b2 - learning_rate * db2

Y_prediction_train = sigmoid(np.dot(w2 , A1) +b2)
print(Y_prediction_train.T)

This prints [[ 0.5 0.5 0.4999875 0.4999875] [ 0.5 0.5 0.4999875 0.4999875]]. I think dw2 = ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M) should instead be dw2 = ( 1 / M ) * np.dot((A2 - A1) , A1.T / M) in order to propagate the differences from hidden layer 1 to hidden layer 2. Is this correct?

1 Answer:

Answer 0 (score: 1):

Y_prediction_train = sigmoid(np.dot(w2 , X_train) + b2)

w2 is the weight matrix for your second hidden layer. It should never be multiplied by your input, X_train.

To get a prediction, you need to break your forward propagation out into its own function that takes an input X, first computes A1 = sigmoid(np.dot(w1 , X)), and then returns the result of A2 = sigmoid(np.dot(w2 , A1)).
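
For illustration, a minimal sketch of such a function might look like the following. The parameter names match the question's code; the name forward itself is hypothetical, and the bias terms are included to match the question's b1 and b2:

def forward(X, w1, b1, w2, b2):
    A1 = sigmoid(np.dot(w1, X) + b1)   # first hidden layer activations, shape (4, M)
    A2 = sigmoid(np.dot(w2, A1) + b2)  # output activations, shape (4, M)
    return A2

# Usage: note that the input is the transposed X_trainT, not X_train
# Y_prediction_train = forward(X_trainT, w1, b1, w2, b2)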

UPDATE:

I think dw2 = ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M) should be changed to dw2 = ( 1 / M ) * np.dot((A2 - A1) , A1.T / M) in order to propagate the differences from hidden layer 1 to hidden layer 2. Is this correct?

Backpropagation propagates errors backwards. The first step is to compute the gradient of the loss with respect to your outputs, which is A2 - Y if you are using Mean Squared Error. This then feeds into the terms for the gradients of the loss with respect to the weights and biases of layer 2, and so on back to layer 1. You don't want to propagate anything from layer 1 to layer 2 during backprop.

It looks like you almost have it in your updated question, but I think you want:

dW2 = ( 1 / M ) * np.dot((A2 - Y) , A1.T)
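
To make the whole backward pass concrete, here is a sketch of what it might look like for this network. It assumes a mean-squared-error loss and includes the sigmoid-derivative factors A * (1 - A) that the simplified dW2 expression above leaves out; backward is a hypothetical helper name, not something from the question's code:

def backward(X, Y, A1, A2, w2):
    M = X.shape[1]                           # number of training examples
    dZ2 = (A2 - Y) * A2 * (1 - A2)           # error at the output layer
    dW2 = (1 / M) * np.dot(dZ2, A1.T)        # gradient w.r.t. layer-2 weights
    db2 = dZ2.mean(axis=1, keepdims=True)    # gradient w.r.t. layer-2 biases
    dZ1 = np.dot(w2.T, dZ2) * A1 * (1 - A1)  # error propagated back to layer 1
    dW1 = (1 / M) * np.dot(dZ1, X.T)         # gradient w.r.t. layer-1 weights
    db1 = dZ1.mean(axis=1, keepdims=True)    # gradient w.r.t. layer-1 biases
    return dW1, db1, dW2, db2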

A few more notes:

  1. You are initializing your weights to zero. This will prevent the neural network from breaking symmetry during training, and you will end up with the same weights at every neuron. You should try initializing with random weights in the range [-1, 1].
  2. You should put your forward and backward propagation steps in a loop, so that you can run them for multiple epochs while your error keeps improving. A combined sketch follows this list.
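
As an illustration only, a hypothetical training loop combining both notes might look like this. The random seed, the uniform [-1, 1] initialization, and the epoch count are illustrative choices, and backward refers to the sketch above:

np.random.seed(0)
w1 = np.random.uniform(-1, 1, (4, 5))   # layer-1 weights; random init breaks symmetry
b1 = np.zeros((4, 1))
w2 = np.random.uniform(-1, 1, (4, 4))   # layer-2 weights; random init
b2 = np.zeros((4, 1))

for epoch in range(1000):               # illustrative epoch count
    # forward pass
    A1 = sigmoid(np.dot(w1, X_trainT) + b1)
    A2 = sigmoid(np.dot(w2, A1) + b2)
    # backward pass (see the sketch above), then gradient-descent updates
    dW1, db1, dW2, db2 = backward(X_trainT, Y_trainT, A1, A2, w2)
    w1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    w2 -= learning_rate * dW2
    b2 -= learning_rate * db2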