Question

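When I use this code for single-variable linear regression, theta is computed correctly, but with multiple variables the theta output is strange.

I am trying to port the Octave code I wrote while taking Andrew Ng's course.

This is the main calling file: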


m = data.shape[0]

a = np.array(data[0])
a.shape = (m,1)
b = np.array(data[1])
b.shape = (m, 1)
x = np.append(a, b, axis=1)
y = np.array(data[2])

lr = LR.LinearRegression()
[X, mu, sigma] = lr.featureNormalize(x)
z = np.ones((m, 1), dtype=float)
X = np.append(z, X, axis=1)
alpha = 0.01
num_iters = 400
theta = np.zeros(shape=(3,1))
[theta, J_history] = lr.gradientDescent(X, y, theta, alpha, num_iters)
print(theta)

This is the class:

class LinearRegression:
    def featureNormalize(self, data):#this normalizes the features
        data = np.array(data)
        x_norm = data
        mu = np.zeros(shape=(1, data.shape[1]))#creates mu vector filled with zeros
        sigma = np.zeros(shape=(1, data.shape[1]))

        for i in range(0, data.shape[1]):
            mu[0, i] = np.mean(data[:, i])
            sigma[0, i] = np.std(data[:, i])

        for i in range(0, data.shape[1]):
            x_norm[:, i] = np.subtract(x_norm[:, i], mu[0, i])
            x_norm[:, i] = np.divide(x_norm[:, i], sigma[0, i])

        return [x_norm, mu, sigma]

    def gradientDescent(self, X, y, theta, alpha, num_iters):
        m = y.shape[0]
        J_history = np.zeros(shape=(num_iters, 1))

        for i in range(0, num_iters):
            predictions = X.dot(theta) # X is 47*3 theta is 3*1 predictions is 47*1
            theta = np.subtract(theta , (alpha / m) * np.transpose((np.transpose(np.subtract(predictions ,y))).dot(X))) #1*97 into 97*3
            J_history[i] = self.computeCost(X, y, theta)
        return [theta, J_history]

    def computeCost(self, X, y, theta):
        warnings.filterwarnings('ignore')
        m = X.shape[0]
        J = 0
        predictions = X.dot(theta)
        sqrErrors = np.power(predictions - y, 2)
        J = 1 / (2 * m) * np.sum(sqrErrors)
        return J

I expect theta to be a 3x1 matrix. My Octave implementation, written following Andrew Ng's course, produces this theta:

334302.063993 
 100087.116006 
 3673.548451

But with the Python implementation I get very strange output:

[[384596.12996714 317274.97693463 354878.64955708 223121.53576488
  519238.43603216 288423.05420641 302849.01557052 191383.45903309
  203886.92061274 233219.70871976 230814.42009498 333720.57288972
  317370.18827964 673115.35724932 249953.82390212 432682.6678475
  288423.05420641 192249.97844569 480863.45534211 576076.72380674
  243221.70859887 245241.34318985 233604.4010228  249953.82390212
  551937.2817908  240336.51632605 446723.93690857 451051.7253178
  456822.10986344 288423.05420641 336509.59208678 163398.05571747
  302849.01557052 557707.6...................... this goes on for long

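The same code works perfectly well on the single-variable dataset, and the Octave version works too. I have been stuck on this for more than two hours and seem to be missing something. I would appreciate any help.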

Answer 1

Try the following as the second line of the gradient descent for loop:

theta=theta-(alpha/m)*X.T.dot(X.dot(theta)-y)
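One thing worth checking alongside this update: in the question, `y = np.array(data[2])` is probably a 1-D array of shape (m,). That is an assumption about the asker's data, but if it holds, subtracting it from the (m, 1) predictions broadcasts to an (m, m) matrix, and theta then comes out as 3 x m instead of 3 x 1, which would match the long output shown. The sketch below uses synthetic random data and an illustrative helper named `gradient_descent` (neither is from the question) to compare the two shapes of y:

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters):
    """Batch gradient descent for linear regression (illustrative helper)."""
    m = y.shape[0]
    for _ in range(num_iters):
        # X is (m, n) and theta is (n, 1); the residual is (m, 1) only if y is (m, 1)
        theta = theta - (alpha / m) * X.T.dot(X.dot(theta) - y)
    return theta

rng = np.random.default_rng(0)          # synthetic data, not the course dataset
m = 47
X = np.c_[np.ones((m, 1)), rng.standard_normal((m, 2))]
y_flat = rng.standard_normal(m)         # shape (m,), like np.array(data[2])
y_col = y_flat.reshape(-1, 1)           # shape (m, 1)

theta0 = np.zeros((3, 1))
bad = gradient_descent(X, y_flat, theta0, 0.01, 400)
good = gradient_descent(X, y_col, theta0, 0.01, 400)
print(bad.shape)    # (3, 47): the residual was broadcast to (47, 47)
print(good.shape)   # (3, 1)
```

If this is the cause, reshaping y once in the calling file, for example with `y = np.array(data[2]).reshape(-1, 1)`, should be enough for either form of the update line.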

Also, if you want to add a column of ones, it is easier to do it like this:

np.c_[np.ones((m,1)),data]
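As a small self-contained sketch of that idiom, the example below normalizes a feature matrix and then prepends the intercept column with `np.c_`. The sample values and the variable names `raw`, `feats`, and `feats_norm` are hypothetical. The cast to float is a precaution: if the feature columns are integer-typed (an assumption about the question's data), writing normalized values back into the integer array column by column would silently truncate them.

```python
import numpy as np

# Hypothetical raw features (size, bedrooms), integer dtype on purpose
raw = np.array([[2104, 3], [1600, 3], [2400, 3], [1416, 2]])

# Cast to float first so the normalized values are not truncated
feats = raw.astype(float)
mu = feats.mean(axis=0)        # per-column mean
sigma = feats.std(axis=0)      # per-column standard deviation
feats_norm = (feats - mu) / sigma

m = feats_norm.shape[0]
X = np.c_[np.ones((m, 1)), feats_norm]   # prepend the column of ones
print(X.shape)   # (4, 3)
```

The column-wise `mean` and `std` calls replace the two explicit loops in `featureNormalize` and should give the same `mu` and `sigma`, just as 1-D arrays rather than 1 x n matrices.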
