Gradient checking problem for a convolutional neural network

Posted: 2019-09-17 14:17:50

Tags: python

I have implemented a deep-learning network: Conv -> ReLU -> MaxPool -> Flatten -> Dense -> Softmax. The network has 6178 parameters. I am trying to run gradient checking on it. When I perform the gradient check on a single data point it passes, and the difference I get is 1.1969471336112197197e-08. However, when I run it on 2 data points it gives me a huge difference of 0.3254100270774182. Here is the code for my gradient check:

import copy
import numpy as np
from tqdm import tqdm


def grad_check():

    train_set_x, train_set_y, test_set_x, test_set_y, n_class = load_data()
    train_set_x = train_set_x[0:2]
    train_set_y = train_set_y[:, 0:2]
    cnn = make_model(train_set_x, n_class)
    print(cnn.layers)

    # Analytic gradients from one forward/backward pass
    A = cnn.forward(train_set_x)
    loss, dA = SoftmaxLoss(A, train_set_y)
    assert (A.shape == dA.shape)
    grads = cnn.backward(dA)
    grads_values = grads_to_vector(grads)
    initial_params = cnn.params
    parameters_values = params_to_vector(initial_params)  # initial parameters, flattened into one column vector
    num_parameters = parameters_values.shape[0]
    J_plus = np.zeros((num_parameters, 1))
    J_minus = np.zeros((num_parameters, 1))
    gradapprox = np.zeros((num_parameters, 1))
    print(num_parameters)
    epsilon = 1e-7
    assert (len(grads_values) == len(parameters_values))
    for i in tqdm(range(num_parameters)):

        # Nudge the i-th parameter up by epsilon and recompute the loss
        thetaplus = copy.deepcopy(parameters_values)
        thetaplus[i][0] = thetaplus[i][0] + epsilon
        new_param = vector_to_param(thetaplus, initial_params)
        difference = compare(new_param, initial_params)
        assert ( difference == 1) # make sure only one parameter is changed
        cnn.params = new_param
        A = cnn.forward(train_set_x)
        J_plus[i], _ = SoftmaxLoss(A, train_set_y)

        # Nudge the i-th parameter down by epsilon and recompute the loss
        thetaminus = copy.deepcopy(parameters_values)
        thetaminus[i][0] = thetaminus[i][0] - epsilon
        new_param = vector_to_param(thetaminus, initial_params)
        difference = compare(new_param, initial_params)
        assert (difference == 1)  # make sure only one parameter is changed
        cnn.params = new_param
        A = cnn.forward(train_set_x)
        J_minus[i], _ = SoftmaxLoss(A, train_set_y)

        # Two-sided (central) difference approximation of the i-th partial derivative
        gradapprox[i] = (J_plus[i] - J_minus[i]) / (2 * epsilon)

    # Normalized distance between the analytic and numerical gradients
    numerator = np.linalg.norm(gradapprox - grads_values)
    denominator = np.linalg.norm(grads_values) + np.linalg.norm(gradapprox)
    difference = numerator / denominator

    if difference > 2e-7:
        print("\033[93m" + "There is a mistake in the backward propagation! difference = " + str(
                difference) + "\033[0m")
    else:
        print("\033[92m" + "Your backward propagation works perfectly fine! difference = " + str(
                difference) + "\033[0m")

    return difference
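
For reference, the loop above forms the standard two-sided (central) finite-difference estimate of each partial derivative, and the final check compares it to the analytic gradient with a normalized distance; the 2e-7 pass threshold is applied to this ratio:

$$\mathrm{gradapprox}_i = \frac{J(\theta + \varepsilon e_i) - J(\theta - \varepsilon e_i)}{2\varepsilon},\qquad \mathrm{difference} = \frac{\lVert \mathrm{grads} - \mathrm{gradapprox} \rVert_2}{\lVert \mathrm{grads} \rVert_2 + \lVert \mathrm{gradapprox} \rVert_2}$$

with $\varepsilon = 10^{-7}$ and $e_i$ the $i$-th unit vector.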

I cannot understand why there is such a huge jump when I go from one data point to two. I even checked with 3 data points, and the difference was 0.47068460998434125. I would be glad to hear any suggestions on this.
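
A gap that is tiny for m = 1 but large and batch-size-dependent for m > 1 is often caused by a normalization mismatch between the loss and the backward pass, for example a loss that is averaged over the m examples while dA (and hence the gradients) is effectively summed. The sketch below is a hypothetical, self-contained toy (a single softmax layer with made-up helpers softmax, loss_and_mismatched_grad and relative_difference, not the network or SoftmaxLoss above) that deliberately builds in such a mismatch: it passes at m = 1 and then produces a relative difference of roughly (m - 1) / (m + 1), i.e. about 0.33 for m = 2 and 0.5 for m = 3, close to the values reported here.

    # Hypothetical toy example (not the poster's code): a single softmax layer with a
    # deliberate mismatch -- the loss is averaged over the batch, but the gradient is summed.
    import numpy as np

    def softmax(z):
        z = z - z.max(axis=0, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=0, keepdims=True)

    def loss_and_mismatched_grad(W, X, Y):
        # Cross-entropy averaged over the m examples, but dW summed over them (missing 1/m).
        A = softmax(W @ X)
        m = X.shape[1]
        loss = -np.sum(Y * np.log(A)) / m
        dW = (A - Y) @ X.T          # summed gradient; the averaged one would be divided by m
        return loss, dW

    def relative_difference(W, X, Y, eps=1e-7):
        _, grad = loss_and_mismatched_grad(W, X, Y)
        approx = np.zeros_like(W)
        for idx in np.ndindex(W.shape):
            Wp, Wm = W.copy(), W.copy()
            Wp[idx] += eps
            Wm[idx] -= eps
            approx[idx] = (loss_and_mismatched_grad(Wp, X, Y)[0]
                           - loss_and_mismatched_grad(Wm, X, Y)[0]) / (2 * eps)
        return np.linalg.norm(grad - approx) / (np.linalg.norm(grad) + np.linalg.norm(approx))

    rng = np.random.default_rng(0)
    for m in (1, 2, 3):
        X = rng.standard_normal((4, m))
        Y = np.eye(3)[rng.integers(0, 3, size=m)].T   # one-hot labels, shape (3, m)
        W = rng.standard_normal((3, 4))
        print(m, relative_difference(W, X, Y))        # ~1e-9 for m=1, then ~1/3 and ~0.5

The point of the sketch is only that the two normalizations have to agree; either convention (sum or mean over the batch) works for gradient checking as long as the loss value and dA use the same one.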
