Question

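I am implementing the logistic regression algorithm with two features, x1 and x2. I am writing the code for the cost function in logistic regression.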

def computeCost(X,y,theta):
    J =((np.sum(-y*np.log(sigmoid(np.dot(X,theta)))-(1-y)*(np.log(1-sigmoid(np.dot(X,theta))))))/m)
    return J

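Here X is my training-set matrix and y is the output. According to the shape attribute of the numpy library, X has shape (100, 3) and y has shape (100,). My theta initially contains all zeros and has shape (3, 1). When I compute the cost with these parameters, I get 69.314, which is incorrect. The correct cost is 0.69314. In fact, I get this correct cost when I reshape the y vector with y = numpy.reshape(y,(-1,1)). But I don't understand how this reshaping corrects my cost. Here m (the number of training examples) is 100.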

Answer 1

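First of all, never just dump your code in the future! What you post (code + explanation) should be as descriptive as possible, but not verbose, or nobody will read it. Here is what your code is doing. Please post readable code in the future; otherwise it is hard to read and answer.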

import numpy as np

def computeCost(X, y, theta):
    '''
    Binary cross-entropy (log loss) cost for logistic regression.

    X: (100, 3)
    y: (100, 1)
    theta: (3, 1)
    Returns the scalar cost averaged over all examples.
    Cost = sum( -labels*log(predictions) - (1-labels)*log(1-predictions) ) / len(labels)
    '''
    m = len(y)
    # calculate the predictions, shape (100, 1); sigmoid is the helper from the question
    predictions = sigmoid(np.dot(X, theta))

    # error term for examples whose label is 1
    class1_cost = -y * np.log(predictions)
    # error term for examples whose label is 0
    class2_cost = (1 - y) * np.log(1 - predictions)
    # total per-example cost
    cost = class1_cost - class2_cost
    # average the cost over the m examples
    cost = cost.sum() / m
    return cost

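You should first understand how the dot product works mathematically, and what input shapes the algorithm expects in order to give you the correct answer. Don't throw shapes around at random! Your feature matrix has shape (100, 3); when it is multiplied by theta, which has shape (3, 1), the result is a prediction vector of shape (100, 1).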

Matrix multiplication: the product of an M x N matrix and an N x K matrix is an M x K matrix. The result has as many rows as the first matrix and as many columns as the second.
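As a quick check of this shape rule with the sizes from the question, here is a minimal sketch. The arrays hold placeholder values, and the name `z` is only illustrative.

```python
import numpy as np

X = np.ones((100, 3))      # M x N feature matrix (placeholder values)
theta = np.zeros((3, 1))   # N x K parameter column

z = np.dot(X, theta)       # M x K result
print(z.shape)             # (100, 1)
```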

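So your y should have shape (100, 1), not (100,). That is a big difference! One is [[3], [4], [6], [7], [9], ...] and the other is [3, 4, 6, 7, 9, ...]. Your dimensions must match to get the correct output!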
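To make the difference between the two layouts concrete, here is a small sketch using the example values above. The names `y_flat` and `y_col` are made up for illustration.

```python
import numpy as np

y_flat = np.array([3, 4, 6, 7, 9])   # 1-D array, shape (5,)
y_col = y_flat.reshape(-1, 1)        # 2-D column, shape (5, 1)

print(y_flat.shape, y_flat.ndim)     # (5,) 1
print(y_col.shape, y_col.ndim)       # (5, 1) 2
print(y_col.tolist())                # [[3], [4], [6], [7], [9]]
```

Both arrays hold the same five numbers; only the number of dimensions differs, and that is what changes how NumPy pairs them with a (100, 1) prediction column.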

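A better way to ask this question would be: how do I compute the error/cost in logistic regression using the correct dimensions for the labels?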

To understand this further:

import numpy as np

label_type1= np.random.rand(100,1)
label_type2= np.random.rand(100,)
predictions= np.random.rand(100,1)
print(label_type1.shape, label_type2.shape, predictions.shape)

# When you multiply (100,1) with (100,1) --> (100,1)
print((label_type1 * predictions).shape)

# When you do a dot product (100,1) with (100,1) --> Error, for which you have to take a transpose, which isn't relevant to the context!
# print( np.dot(label_type1,predictions).shape) # error: shapes (100,1) and (100,1) not aligned: 1 (dim 1) != 100 (dim 0)
print(np.dot(label_type1.T, predictions).shape) # (1, 1)
print('*'*5)

# When you multiply (100,) with (100,1) --> (100,100) !
print((label_type2 * predictions).shape) # (100, 100), because of broadcasting

# When you  do a dot product (100,) with (100,1) --> (1,) !
print(np.dot(label_type2, predictions).shape) 
print('*'*5)

# what you are doing: adding a dimension to y, (100,) --> (100,1)
label_type1_addDim = np.reshape(label_type2,(-1,1))
print(label_type1_addDim.shape)

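So, to put it plainly, the per-example cost you want has dimension (100, 1)! So either you do the first case (the elementwise * of two (100, 1) arrays), which you are not doing, or you do the fifth (the reshape), where, without realizing it, you add a dimension to your y, taking it from (100,) to (100, 1), and then perform the same * operation as in the first case to get dimension (100, 1).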
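This also accounts for the exact numbers in the question. With y of shape (100,), the per-example cost broadcasts to (100, 100), so the sum runs over 100 times as many terms and the result is 100 times too large: 69.314 instead of 0.69314. The sketch below reproduces this under stated assumptions: the data is random placeholder data rather than the asker's, and `sigmoid` and `compute_cost` are my own stand-ins for the asker's functions.

```python
import numpy as np

def sigmoid(z):
    # Standard logistic function (assumed to match the asker's helper).
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost(X, y, theta):
    # Returns the averaged cost and the shape of the per-example cost array.
    m = len(y)
    p = sigmoid(np.dot(X, theta))                        # shape (100, 1)
    per_example = -y * np.log(p) - (1 - y) * np.log(1 - p)
    return per_example.sum() / m, per_example.shape

rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.random((100, 2))])  # placeholder features
y = rng.integers(0, 2, size=100).astype(float)            # placeholder labels, shape (100,)
theta = np.zeros((3, 1))

bad_cost, bad_shape = compute_cost(X, y, theta)                   # y is (100,)
good_cost, good_shape = compute_cost(X, y.reshape(-1, 1), theta)  # y is (100, 1)

print(bad_shape, round(bad_cost, 4))     # (100, 100) 69.3147
print(good_shape, round(good_cost, 4))   # (100, 1) 0.6931
```

With theta all zeros every prediction is 0.5, so each cost term equals log(2), about 0.6931, whatever the labels are. That is why the numbers match the question even with made-up data.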

Two different costs in the logistic regression cost function
