np.squeeze() when implementing a cost function and its gradient

Date: 2018-12-30 21:27:28

Tags: python numpy

The following code, based on the Coursera deep learning course, computes the cost function and its gradient for classifying images.

The cost is computed as follows:

cost = -np.sum(Y*np.log(A) + (1-Y)*np.log(1-A)) / m 

cost.shape

()

So what is the purpose of the following operation

cost = np.squeeze(cost)

inside the function below?

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    """
    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    A = sigmoid(np.dot(w.T, X) + b)          # compute activation
    cost = -np.sum(Y*np.log(A) + (1-Y)*np.log(1-A)) / m      # compute cost

    # BACKWARD PROPAGATION (TO FIND GRAD)
    dw = np.dot(X, (A-Y).T) / m
    db = np.sum(A-Y) / m

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost
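
For reference, here is a minimal usage sketch (the sigmoid helper and the sample values are assumptions, not part of the question) confirming that the returned cost is a single value with shape ():

import numpy as np

def sigmoid(z):
    # logistic function assumed by propagate()
    return 1 / (1 + np.exp(-z))

w = np.array([[1.0], [2.0]])
b = 2.0
X = np.array([[1.0, 2.0, -1.0], [3.0, 4.0, -3.2]])
Y = np.array([[1, 0, 1]])

grads, cost = propagate(w, b, X, Y)
print(cost, np.shape(cost))   # a single scalar value, shape ()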

2 Answers:

Answer 0 (score: 1)

np.squeeze is used to remove singleton (size-1) axes from a numpy.ndarray. For example, if you have a NumPy array a of shape (n, m, 1, p), then np.squeeze(a) will give it shape (n, m, p), removing the third axis because it contains only one element.
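
A quick sketch of that behavior (the array and its shape are chosen arbitrarily for illustration):

import numpy as np

a = np.zeros((2, 3, 1, 4))          # the third axis has size 1
print(np.squeeze(a).shape)          # (2, 3, 4) -- singleton axis removed
print(np.squeeze(a, axis=2).shape)  # (2, 3, 4) -- same, targeting only axis 2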

Here, cost is supposed to be a single value. Even though it already comes out as an np.ndarray of shape (), the extra np.squeeze(cost) step is taken deliberately after computing it, to make sure that any redundant singleton axes are removed if they happen to be present.
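
A short sketch of both cases (the (1, 1) variant is hypothetical here; it is what you would get if the cost were written with np.dot instead of np.sum):

import numpy as np

Y = np.array([[1, 0]])
A = np.array([[0.9, 0.2]])
m = 2

# np.sum already yields a 0-d result, so squeeze is a no-op
cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
print(np.shape(cost), np.squeeze(cost).shape)    # () ()

# a np.dot formulation yields shape (1, 1); squeeze strips both axes
cost2 = -(np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T)) / m
print(cost2.shape, np.squeeze(cost2).shape)      # (1, 1) ()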

Answer 1 (score: 0)

# GRADED FUNCTION: propagate

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b
    
    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """
    
    m = X.shape[1]
    
    # FORWARD PROPAGATION (FROM X TO COST)
    ### START CODE HERE ### (≈ 2 lines of code)
    A = sigmoid(np.dot(w.T, X) + b)                                  # compute activation
    cost = -np.sum(np.multiply(Y, np.log(A)) + np.multiply(1 - Y, np.log(1 - A))) / m   # compute cost, averaged over the m examples
    ### END CODE HERE ###
    
    # BACKWARD PROPAGATION (TO FIND GRAD)
    ### START CODE HERE ### (≈ 2 lines of code)
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    
    ### END CODE HERE ###

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())
    
    grads = {"dw": dw,
             "db": db}
    
    return grads, cost
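
As an aside (not from either answer), ndarray.item() is another common way to extract a plain Python scalar from any size-1 array; note that np.squeeze still returns a 0-d ndarray rather than a Python float:

import numpy as np

cost = np.array([[0.693]])        # size-1 array, shape (1, 1)
print(type(np.squeeze(cost)))     # <class 'numpy.ndarray'>, shape ()
print(type(cost.item()))          # <class 'float'> -- plain Python scalar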