Question

I am trying to build a neural network with a single hidden layer from scratch. Something is going wrong in the backpropagation part.

In the hidden layer I use a sigmoid activation, and in the output layer I use a softmax function.

To compute the gradient of the loss function with respect to the layer-1 weights, the formula becomes:

$\frac{dL}{dW1} = \frac{dL}{dA1}*\frac{dA1}{dZ1}*\frac{dZ1}{dW1}$

where

$\frac{dL}{dA1} = \frac{dL}{dA2}*\frac{dA2}{dZ2}*\frac{dZ2}{dA1}$
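With matrix-valued quantities, each factor in these products needs an explicit shape and, in places, a transpose. A sketch of one consistent form follows, under assumptions the question does not confirm: a column-per-example layout ($X$ of shape $n_x \times m$, in line with `m = X.shape[1]`), one-hot $Y$, no bias terms, and a cross-entropy loss averaged over the batch. The sizes $n_x$, $n_h$, $n_y$ are illustrative symbols, and $\odot$ denotes the elementwise product.

$\frac{dL}{dZ2} = \frac{1}{m}(A2 - Y)$, shape $n_y \times m$

$\frac{dL}{dW2} = \frac{dL}{dZ2}\,A1^{T}$, shape $n_y \times n_h$

$\frac{dL}{dA1} = W2^{T}\,\frac{dL}{dZ2}$, shape $n_h \times m$

$\frac{dL}{dZ1} = \frac{dL}{dA1} \odot A1 \odot (1 - A1)$, shape $n_h \times m$

$\frac{dL}{dW1} = \frac{dL}{dZ1}\,X^{T}$, shape $n_h \times n_x$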

def back_prop(parameters,cache,X,Y):
    m = X.shape[1]
    # retrieving parameters
    W1 = parameters["W1"]
    W2 = parameters["W2"]

    A1 = cache["A1"]
    A2 = cache["A2"]
    Z1 = cache["Z1"]
    Z2 = cache["Z2"]

    #back propagation calculation
    #calculating dL1,dL2 for updating W1 and W2
    # loss for W2 #
    xa = np.divide(Y,A2)+np.divide((1-Y),(1-A2)) #part of dA2 calculation
    dA2 = (1/m) * np.sum(xa)
    dZ2 = sigmoid_derivative(Z2)
    dW2 = A2
    dL2 = dA2*dZ2*dW2
    # loss for W1 #
    dA1 = W2
    print(dA1.shape)
    dZ1 = softmax_derivative(Z1)
    print(dZ1.shape)
    dW1 = X.T
    print(dW1.shape)

    dl0 = np.dot(dA2,dZ2.T)
    print('shape dl0',dl0.shape)

    dl1 = np.dot(dl0,dA1)
    print('shape dl1',dl1.shape)

    dl2 = np.dot(dl1,dZ1.T)
    print('shape dl2', dl2.shape)
    dL1 = np.dot(dl2,dW1) ###error : dimension for broadcasting is not satisfied
    print('shape dL1',dL1.shape)
    grads ={"dL1" : dL1,
            "dL2" : dL2}

    return grads

I am trying to do this with np.dot(), but I run into dimension problems. When I go to update W1, the shape of dL/dW1 does not match.
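One way to keep the shapes consistent is sketched here. It is not the code from the gist: it assumes a column-per-example layout (`X` is `(n_x, m)`, matching `m = X.shape[1]`), one-hot `Y`, no bias terms, and a cross-entropy loss averaged over the batch. The `forward` helper, the name `back_prop_sketch`, and the `dW1`/`dW2` keys are hypothetical names chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    # Column-wise softmax; subtracting the column max avoids overflow in exp().
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)


def forward(parameters, X):
    # Hypothetical forward pass (no biases): sigmoid hidden layer, softmax output.
    Z1 = parameters["W1"] @ X            # (n_h, m)
    A1 = sigmoid(Z1)
    Z2 = parameters["W2"] @ A1           # (n_y, m)
    A2 = softmax(Z2)
    return {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}


def back_prop_sketch(parameters, cache, X, Y):
    # Assumes cross-entropy loss averaged over the m examples (columns).
    m = X.shape[1]
    W2 = parameters["W2"]
    A1, A2 = cache["A1"], cache["A2"]

    # Softmax combined with cross-entropy reduces to a simple difference.
    dZ2 = (A2 - Y) / m                   # (n_y, m)
    dW2 = dZ2 @ A1.T                     # (n_y, n_h), same shape as W2

    dA1 = W2.T @ dZ2                     # (n_h, m)
    dZ1 = dA1 * A1 * (1.0 - A1)          # elementwise sigmoid derivative
    dW1 = dZ1 @ X.T                      # (n_h, n_x), same shape as W1
    return {"dW1": dW1, "dW2": dW2}


# Small random example; the sizes are arbitrary.
rng = np.random.default_rng(0)
n_x, n_h, n_y, m = 4, 5, 3, 7
X = rng.normal(size=(n_x, m))
Y = np.eye(n_y)[:, rng.integers(0, n_y, size=m)]   # one-hot columns
parameters = {"W1": rng.normal(size=(n_h, n_x)) * 0.1,
              "W2": rng.normal(size=(n_y, n_h)) * 0.1}

cache = forward(parameters, X)
grads = back_prop_sketch(parameters, cache, X, Y)
print(grads["dW1"].shape, grads["dW2"].shape)   # (5, 4) (3, 5)
```

Only the products that contract over a shared axis use matrix multiplication. The sigmoid derivative is applied elementwise, so `dZ1` keeps the shape of `Z1`. Chaining every factor through `np.dot` is a likely source of the mismatched shapes.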

Here is my full code. Please excuse the backpropagation part, which is a bit messy.

https://gist.github.com/ipritom/30fcad0c74ab59e5b31e1daac1c1d1e7

The backpropagation part has a dimension problem

0 answers: