Python vector has no column size? / Theta initialized to all zeros

Posted: 2019-09-18 03:05:26

Tags: numpy machine-learning linear-regression gradient-descent

My professor gave me this outline for an assignment. It builds a linear regression model using gradient descent. My questions are:

  1. How is the row vector created by θ = np.zeros(3) not a 1 x 3 matrix? It prints as [0, 0, 0].
  2. Is there a way to fix the error I am getting? It is shown below, but it basically says that I cannot subtract the two matrices because their sizes do not match.

To clarify, I cannot change the dimensions of theta.

I am really trying to understand how this subtraction is supposed to work.
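To illustrate question 1, here is a minimal sketch (the all-ones data is made up purely for demonstration) of the difference between a shape-(3,) vector and a 1 x 3 matrix, and why the (3,) shape makes the subtraction work:

import numpy as np

theta = np.zeros(3)          # 1-D array, shape (3,) -- prints as [0. 0. 0.]
theta_2d = np.zeros((1, 3))  # 2-D array, shape (1, 3) -- prints as [[0. 0. 0.]]
print(theta.shape, theta_2d.shape)   # (3,) (1, 3)

# Hypothetical test data, only to show the shapes involved
X = np.ones((5, 3))          # m = 5 examples, 3 features
y = np.ones(5)               # shape (m,)

y_hat = np.dot(X, theta)     # shape (m,), because theta is 1-D
print((y_hat - y).shape)     # (5,) -- matches y, so the subtraction is valid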

Below is the code that produces the error:

def gradientDescent(X, y, theta, alpha, num_iters):
    '''
    Params
        X - Shape: (m,3); m is the number of data examples
        y - Shape: (m,)
        theta - Shape: (3,)
        num_iters - Maximum number of iterations
    Return
        A tuple: (theta, RSS, cost_array)
        theta - the learned model parameters
        RSS - residual sum of squares
        cost_array - stores the cost value of each iteration. Its shape is 
        (num_iters,)
    '''
    m = len(y)
    cost_array = []

    for i in range(0, num_iters):
        #### START YOUR CODE ####
        # Make predictions
        # Shape of y_hat: (m,), since theta is a 1-D array of shape (3,)
        y_hat = np.dot(X, theta)

        # Compute the difference between prediction (y_hat) and ground
        # truth label (y)
        diff = y_hat - y

        # Compute the cost
        # Hint: Use the diff computed above
        cost = np.sum((diff ** 2)/(2 * m))
        cost_array.append(cost)

        # Compute gradients
        # Hint: Use the diff computed above
        # Hint: Shape of gradients is the same as theta
        gradients = np.dot(np.transpose(X), diff) / m

        # Update theta
        theta = theta - alpha * gradient

        #### END YOUR CODE ####

    # Compute residuals
    # Hint: Should use the same code as Task 1
    #### START YOUR CODE ####
    y_hat = np.dot(X, theta)
    RSS = np.sum(np.square(y - y_hat))
    #### END YOUR CODE ####

    return theta, RSS, cost_array


# This cell is to evaluate the gradientDescent function implemented above

#### DO NOT CHANGE THE CODE BELOW ####
# Define learning rate and maximum iteration number
ALPHA = 0.05
MAX_ITER = 500

# Initialize theta to [0,0,0]
theta = np.zeros(3)
theta_method2, RSS2, cost_array = gradientDescent(X, y, theta, ALPHA, MAX_ITER)

print('Theta obtained from gradient descent:', theta_method2)
print('Residual sum of squares (RSS): ', RSS2)

1 Answer:

Answer 0 (score: 1):

To change the shape of theta from (3,) to (1,3), you can do the following:

theta = np.expand_dims(theta, axis=0)  # Now theta.shape = (1,3)
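As a quick sanity check (a sketch, not part of the original answer), note what the (1, 3) shape would do to the dot products further down:

import numpy as np

theta = np.expand_dims(np.zeros(3), axis=0)
print(theta.shape)                 # (1, 3)

X = np.ones((10, 3))               # same hypothetical test data as below
# np.dot(X, theta) now raises a ValueError (shapes (10,3) and (1,3) do not align);
# you would need np.dot(X, theta.T) instead, which returns shape (10, 1), not (10,)
print(np.dot(X, theta.T).shape)    # (10, 1)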

However, I tried running your code with the following:

X = np.ones((10,3))
y = np.ones((10,))

as a test. I only changed the line theta = theta - alpha * gradient to theta = theta - alpha * gradients (adding an s to the end of gradient). That is likely what caused your problem, since there is no variable named gradient in the scope of the gradientDescent function.

This ran without giving an error.
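For reference, here is a minimal end-to-end check of that diagnosis, assuming the corrected gradients line and the same all-ones test data; the expected outputs in the comments are approximate:

import numpy as np

X = np.ones((10, 3))
y = np.ones((10,))
theta = np.zeros(3)

# Run the gradientDescent function defined above (with the one-letter fix applied)
theta_out, RSS, cost_array = gradientDescent(X, y, theta, 0.05, 500)

print(theta_out.shape)   # (3,) -- theta keeps its 1-D shape through every update
print(theta_out)         # approximately [0.333 0.333 0.333] for this data
print(RSS)               # close to 0, since all-ones data can be fit exactly
print(len(cost_array))   # 500, one cost value per iteration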