I am writing code for a neural network with 2 hidden layers. Here is my code snippet:
for i in range(200):
    # Forward pass (z1, z2, z3 hold the sigmoid activations of each layer)
    z1 = sigmoid(np.dot(W1, X_train) + b1)
    z2 = sigmoid(np.dot(W2, z1) + b2)
    z3 = sigmoid(np.dot(W3, z2) + b3)
    C = cost(z3, Y_train)
    print(C)
    # Backward pass: output layer
    d3 = - np.multiply(np.multiply(z3, 1-z3), (np.divide(Y_train, z3) - np.divide(1 - Y_train, 1 - z3)))
    dw3 = np.dot(d3, z2.T)
    db3 = np.sum(d3, axis=1, keepdims=True)
    # Backward pass: second hidden layer
    d2 = np.multiply(np.dot(W3.T, d3), np.multiply(z2, 1-z2))
    dw2 = np.dot(d2, z1.T)
    db2 = np.sum(d2, axis=1, keepdims=True)
    # Backward pass: first hidden layer
    d1 = np.multiply(np.dot(W2.T, d2), np.multiply(z1, 1-z1))
    dw1 = np.dot(d1, X_train.T)
    db1 = np.sum(d1, axis=1, keepdims=True)
    # Gradient descent update with learning rate 0.005
    W1 -= 0.005*dw1
    W2 -= 0.005*dw2
    W3 -= 0.005*dw3
    b1 -= 0.005*db1
    b2 -= 0.005*db2
    b3 -= 0.005*db3
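Cost function:
def cost(p, y):
    # Cross-entropy summed over all entries, then averaged over the number of samples (columns)
    loss = -(np.multiply(y, np.log(p)) + np.multiply((1 - y), np.log(1 - p)))
    c = np.sum(loss)*1.0 / p.shape[1]
    return c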
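For reference, the d3 line in the training loop is the derivative of this cross-entropy loss taken through the output sigmoid, and algebraically it should reduce to z3 - Y_train. The sketch below checks that numerically on small random arrays; the array sizes, the seed, and the names `d3_long`, `d3_short` and `max_diff` are assumptions made for illustration and are not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small stand-ins for z3 (sigmoid outputs) and Y_train (0/1 labels).
z3 = rng.uniform(0.05, 0.95, size=(10, 5))
Y = (rng.uniform(size=(10, 5)) > 0.5).astype(float)

# Expression used for d3 in the training loop.
d3_long = -np.multiply(np.multiply(z3, 1 - z3),
                       np.divide(Y, z3) - np.divide(1 - Y, 1 - z3))

# Algebraically simplified form of the same quantity.
d3_short = z3 - Y

# Largest element-wise difference between the two forms.
max_diff = np.max(np.abs(d3_long - d3_short))
print(max_diff)
```

If the two agree, the shorter form might be preferable, since it avoids dividing by z3 or 1 - z3 when an output saturates at 0 or 1.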
Dimensions:
X_train : 784x60000
W1: 100x784
b1: 100x1
W2: 40x100
b2: 40x1
W3: 10x40
b3: 10x1
My cost decreases for a while and then starts oscillating around some value, and my accuracy does not increase at all. What is wrong with my code?
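One thing that may be worth checking, offered only as a possible diagnostic: the cost is averaged over p.shape[1] samples, while dw and db in the loop are summed over all samples, so the step actually taken could be much larger than 0.005 suggests. The sketch below uses a small synthetic network to compare the summed gradient with the gradient of the averaged cost. The layer sizes, the sample count `m`, the seed, the random data, and the `sigmoid` definition are all assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    # Assumed definition; the original sigmoid is not shown in the post.
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
m = 1000  # hypothetical sample count; the post uses 60000

# Synthetic inputs and 0/1 targets with samples as columns, as in the post.
X = rng.normal(size=(8, m))
Y = (rng.uniform(size=(3, m)) > 0.5).astype(float)

# Small randomly initialised layers (sizes are illustrative only).
W1, b1 = 0.1 * rng.normal(size=(6, 8)), np.zeros((6, 1))
W2, b2 = 0.1 * rng.normal(size=(4, 6)), np.zeros((4, 1))
W3, b3 = 0.1 * rng.normal(size=(3, 4)), np.zeros((3, 1))

# Forward pass, same structure as the training loop.
z1 = sigmoid(np.dot(W1, X) + b1)
z2 = sigmoid(np.dot(W2, z1) + b2)
z3 = sigmoid(np.dot(W3, z2) + b3)

# Output-layer error (simplified form of the d3 expression).
d3 = z3 - Y

dw3_sum = np.dot(d3, z2.T)   # as in the loop: summed over all samples
dw3_mean = dw3_sum / m       # gradient of the cost averaged over samples

# How much larger the summed gradient is than the averaged one.
ratio = np.linalg.norm(dw3_sum) / np.linalg.norm(dw3_mean)
# Step size the summed update corresponds to, relative to the averaged cost.
effective_lr = 0.005 * m
print(ratio, effective_lr)
```

If this is the cause, dividing each dw and db by the number of samples (or lowering the learning rate accordingly) would be one thing to try; it is untested against the real data here.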