Python vector has no column size? / Theta initialized to all zeros

Posted: 2019-09-18 03:05:26

Tags: numpy machine-learning linear-regression gradient-descent

My professor gave me this outline for an assignment. It builds a linear regression model using gradient descent. My questions are:

  1. How is the row vector created by θ = np.zeros(3) not a 1 x 3 matrix? It prints as [0, 0, 0].
  2. Is there a way to fix the error I am getting? It is shown below, but it basically says that I cannot subtract the two matrices because their sizes do not match.

To clarify, I cannot change the dimensions of theta.

I am really trying to understand how this subtraction is supposed to work.
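To illustrate question 1, here is a minimal sketch (the all-ones data is made up purely for demonstration) of the difference between a shape-(3,) vector and a 1 x 3 matrix, and why the (3,) shape makes the subtraction work:

import numpy as np

theta = np.zeros(3)          # 1-D array, shape (3,) -- prints as [0. 0. 0.]
theta_2d = np.zeros((1, 3))  # 2-D array, shape (1, 3) -- prints as [[0. 0. 0.]]
print(theta.shape, theta_2d.shape)   # (3,) (1, 3)

# Hypothetical test data, only to show the shapes involved
X = np.ones((5, 3))          # m = 5 examples, 3 features
y = np.ones(5)               # shape (m,)

y_hat = np.dot(X, theta)     # shape (m,), because theta is 1-D
print((y_hat - y).shape)     # (5,) -- matches y, so the subtraction is valid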

Below is the code that produces the error:

def gradientDescent(X, y, theta, alpha, num_iters):
    '''
    Params
        X - Shape: (m,3); m is the number of data examples
        y - Shape: (m,)
        theta - Shape: (3,)
        num_iters - Maximum number of iterations
    Return
        A tuple: (theta, RSS, cost_array)
        theta - the learned model parameters
        RSS - residual sum of squares
        cost_array - stores the cost value of each iteration. Its shape is 
        (num_iters,)
    '''
    m = len(y)
    cost_array = []

    for i in range(0, num_iters):
        #### START YOUR CODE ####
        # Make predictions
        # Shape of y_hat: (m,), since theta is a 1-D array of shape (3,)
        y_hat = np.dot(X, theta)

        # Compute the difference between prediction (y_hat) and ground
        # truth label (y)
        diff = y_hat - y

        # Compute the cost
        # Hint: Use the diff computed above
        cost = np.sum((diff ** 2)/(2 * m))
        cost_array.append(cost)

        # Compute gradients
        # Hint: Use the diff computed above
        # Hint: Shape of gradients is the same as theta
        gradients = np.dot(np.transpose(X), diff) / m

        # Update theta
        theta = theta - alpha * gradient

        #### END YOUR CODE ####

    # Compute residuals
    # Hint: Should use the same code as Task 1
    #### START YOUR CODE ####
    y_hat = np.dot(X, theta)
    RSS = np.sum(np.square(y - y_hat))
    #### END YOUR CODE ####

    return theta, RSS, cost_array


# This cell is to evaluate the gradientDescent function implemented above

#### DO NOT CHANGE THE CODE BELOW ####
# Define learning rate and maximum iteration number
ALPHA = 0.05
MAX_ITER = 500

# Initialize theta to [0,0,0]
theta = np.zeros(3)
theta_method2, RSS2, cost_array = gradientDescent(X, y, theta, ALPHA, MAX_ITER)

print('Theta obtained from gradient descent:', theta_method2)
print('Residual sum of squares (RSS): ', RSS2)

1 Answer:

Answer 0 (score: 1):

To change the shape of theta from (3,) to (1,3), you can do the following:

theta = np.expand_dims(theta, axis=0)  # Now theta.shape = (1,3)
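As a quick sanity check (a sketch, not part of the original answer), note what the (1, 3) shape would do to the dot products further down:

import numpy as np

theta = np.expand_dims(np.zeros(3), axis=0)
print(theta.shape)                 # (1, 3)

X = np.ones((10, 3))               # same hypothetical test data as below
# np.dot(X, theta) now raises a ValueError (shapes (10,3) and (1,3) do not align);
# you would need np.dot(X, theta.T) instead, which returns shape (10, 1), not (10,)
print(np.dot(X, theta.T).shape)    # (10, 1)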

However, I tried running your code with the following:

X = np.ones((10,3))
y = np.ones((10,))

as a test. I only changed the line theta = theta - alpha * gradient to theta = theta - alpha * gradients (adding an s to the end of gradient). That is likely what caused your problem, since there is no variable named gradient in the scope of the gradientDescent function.

This ran without giving an error.
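For reference, here is a minimal end-to-end check of that diagnosis, assuming the corrected gradients line and the same all-ones test data; the expected outputs in the comments are approximate:

import numpy as np

X = np.ones((10, 3))
y = np.ones((10,))
theta = np.zeros(3)

# Run the gradientDescent function defined above (with the one-letter fix applied)
theta_out, RSS, cost_array = gradientDescent(X, y, theta, 0.05, 500)

print(theta_out.shape)   # (3,) -- theta keeps its 1-D shape through every update
print(theta_out)         # approximately [0.333 0.333 0.333] for this data
print(RSS)               # close to 0, since all-ones data can be fit exactly
print(len(cost_array))   # 500, one cost value per iteration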