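I wrote a multivariate linear regression from scratch, but when I run the code the final values of theta are wrong (some entries are on the order of 10^20). Can anyone help me?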
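This is the Boston housing dataset, and I am trying to predict house prices. I based the linear regression code on the algorithm Professor Andrew Ng gives in his Machine Learning course and tried to implement it in Python, but my theta values still come out wrong.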
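For reference, the batch gradient descent update from that course can be written roughly as follows. This is a sketch from the standard formulation, assuming m is the number of training examples, n is the number of features, x_0 = 1 for the intercept, and all theta_j are updated simultaneously:

$$
h_\theta(x) = \theta^T x, \qquad
\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}, \quad j = 0, \dots, n
$$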
Here is the data link
Here is my code:
import pandas as pd
import numpy as np

X_train = pd.read_csv("train.csv")
X_test = pd.read_csv("test.csv")
X_train.head()
X_train.shape

y_train = X_train['medv']
X_train = X_train.drop(columns = ['medv'], axis = 1)

theta = np.zeros(14)
alpha = 0.01
m = len(theta)

X_train.head()
X_train = X_train.drop(columns = ['ID'], axis = 1)
X_test = X_test.drop(columns = ['ID'], axis = 1)
X_train = np.column_stack((np.ones(len(X_train)),X_train))
X_train.shape

for j in range(1000):
    for i in range(m):
        h = np.dot(X_train, theta)
        d_J = np.dot((h - y_train), X_train[:, i])
        theta[i] = theta[i] - (alpha)*(1/m)*d_J
Theta values:
array([[ 5.41571429e+00],
[ 7.35302513e+00],
[ 5.96202743e+01],
[-6.13110873e+02],
[ 1.00881890e+02],
[ 9.36757919e+02],
[ 8.19165542e+03],
[-7.07535737e+05],
[ 3.54080584e+07],
[-1.02568786e+08],
[ 1.22841775e+11],
[-2.25615368e+14],
[ 3.50107077e+17],
[-3.56510417e+20]])
Answer 0 (score: 0)
It would help if we could get a link to your dataset.
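Without the data I can only guess, but three things in the posted code look like likely causes of the blow-up: `m = len(theta)` makes m the number of parameters (14) rather than the number of training rows, the features are not scaled before running gradient descent with `alpha = 0.01`, and `h` is recomputed inside the inner loop, so the parameters are not updated simultaneously. Below is a minimal sketch of a vectorized version with feature standardization. The helper names `standardize` and `gradient_descent`, the learning rate, and the synthetic data are my own assumptions for illustration, not the asker's data or the course's reference code.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns
    return (X - mu) / sigma, mu, sigma

def gradient_descent(X, y, alpha=0.01, n_iters=1000):
    """Batch gradient descent for linear regression.

    X must already contain the leading column of ones.
    """
    m, n = X.shape  # m = number of training examples, n = number of parameters
    theta = np.zeros(n)
    for _ in range(n_iters):
        h = X @ theta                 # predictions with the current theta
        grad = X.T @ (h - y) / m      # gradient for all parameters at once
        theta = theta - alpha * grad  # simultaneous update of every theta_j
    return theta

# Synthetic stand-in data (hypothetical, NOT the Boston set):
# three deliberately large, unscaled features.
rng = np.random.default_rng(0)
raw = rng.uniform(0, 500, size=(200, 3))
true_w = np.array([0.5, -0.2, 0.1])
y = 10.0 + raw @ true_w + rng.normal(0, 0.1, size=200)

scaled, mu, sigma = standardize(raw)
X = np.column_stack((np.ones(len(scaled)), scaled))
theta = gradient_descent(X, y, alpha=0.1, n_iters=2000)
print(theta)
```

With your data you would pass the feature columns (after dropping `ID` and `medv`) through `standardize` before adding the column of ones, and reuse the same `mu` and `sigma` to scale `X_test` before predicting.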