I am implementing multivariate linear regression in pure Python, as shown in the code below. Can someone tell me what is wrong with this code? I did the same thing for univariate linear regression and it worked fine!
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x_df=pd.DataFrame([[2.0,70.0],[3.0,30.0],[4.0,80.0],[4.0,20.0],[3.0,50.0],[7.0,10.0],[5.0,50.0],[3.0,90.0],[2.0,20.0]])
y_df=pd.DataFrame([79.4,41.5,97.5,36.1,63.2,39.5,69.8,103.5,29.5])
x_df=x_df.drop(x_df.columns[2:], axis=1)
#print(x_df)
m=len(y_df)
#print(m)
x_df['intercept']=1
X=np.array(x_df)
#print(X)
#print(X.shape)
y=np.array(y_df).flatten()
#print(y.shape)
theta=np.array([0,0,0])
#print(theta)
def hypothesis(x,theta):
    return np.dot(x,theta)
#print(hypothesis(X,theta))
def cost(x,y,theta):
    m=y.shape[0]
    h=np.dot(x,theta)
    return np.sum(np.square(y-h))/(2.0*m)
#print(cost(X,y,theta))
def gradientDescent(x,y,theta,alpha=0.01,iter=1500):
    m=y.shape[0]
    for i in range(iter):
        h=hypothesis(x,theta)
        error=h-y
        update=np.dot(error,x)
        theta=np.subtract(theta,((alpha*update)/m))
    print('theta',theta)
    print('hyp',h)
    print('y',y)
    print('error',error)
    print('cost',cost(x,y,theta))
print(gradientDescent(X,y,theta))
The output I get is:
theta [ nan nan nan]
hyp [ nan nan nan nan nan nan nan nan nan]
y [ 79.4 41.5 97.5 36.1 63.2 39.5 69.8 103.5 29.5]
error [ nan nan nan nan nan nan nan nan nan]
cost nan
Can someone help me fix this? I have been stuck on it for almost 5 hours!
Answer 0 (score: 0)
Your learning rate is too large for gradient descent to converge; try alpha = 0.00001. With the second feature ranging up to ~90, the gradient magnitudes are large, so a step size of 0.01 overshoots and the parameters blow up to NaN.
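A minimal sketch of the same setup with the smaller learning rate, to check that the cost now stays finite and decreases. The variable names follow the question; the vectorized `@` products replace `np.dot` but compute the same thing.

```python
import numpy as np

# Same data as in the question, with the intercept column appended.
X = np.array([[2.0, 70.0], [3.0, 30.0], [4.0, 80.0],
              [4.0, 20.0], [3.0, 50.0], [7.0, 10.0],
              [5.0, 50.0], [3.0, 90.0], [2.0, 20.0]])
X = np.hstack([X, np.ones((X.shape[0], 1))])
y = np.array([79.4, 41.5, 97.5, 36.1, 63.2, 39.5, 69.8, 103.5, 29.5])

def cost(x, y, theta):
    # Mean squared error divided by 2, as in the question.
    h = x @ theta
    return np.sum((y - h) ** 2) / (2.0 * len(y))

def gradient_descent(x, y, theta, alpha=1e-5, iters=1500):
    m = len(y)
    for _ in range(iters):
        error = x @ theta - y
        theta = theta - alpha * (error @ x) / m  # batch gradient step
    return theta

theta0 = np.zeros(3)
theta = gradient_descent(X, y, theta0)
print('initial cost:', cost(X, y, theta0))
print('final cost  :', cost(X, y, theta))
```

Convergence with alpha = 1e-5 is slow; scaling the features to comparable ranges (e.g. dividing the second column by its standard deviation) would let a larger step size work.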