如何提高多元线性回归的模型精度

时间:2018-09-17 16:38:32

标签: python python-3.x machine-learning scikit-learn linear-regression

这是自定义代码

#Custom model for multiple linear regression
import numpy as np

import pandas as pd

dataset = pd.read_csv("50s.csv")

x  = dataset.iloc[:,:-1].values

y  = dataset.iloc[:,4:5].values

from sklearn.preprocessing import LabelEncoder

lb = LabelEncoder()

x[:,3] = lb.fit_transform(x[:,3])

from sklearn.preprocessing import OneHotEncoder

on = OneHotEncoder(categorical_features=[3])

x = on.fit_transform(x).toarray()

x = x[:,1:]

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=1/5, random_state=0)


con = np.matrix(X_train)

z = np.matrix(y_train)

#training model

result1 = con.transpose()*con

result1 = np.linalg.inv(result1)

p = con.transpose()*z

f = result1*p

l = []

for i in range(len(X_test)):

    temp = f[0]*X_test[i][0] + f[1]*X_test[i][1] +f[2]*X_test[i][2]+f[3]*X_test[i][3]+f[4]*X_test[i][4]         
    l.append(temp)


import matplotlib.pyplot as plt

plt.scatter(y_test,l)

plt.show()

然后我创建了一个使用scikit Learn创建的模型 并将结果与​​y_test和l(上述代码的预测值)进行比较

比较如下

for i in range(len(prediction)):
    print(y_test[i],prediction[i],l[i],sep='  ')

103282.38  103015.20159795816  [[116862.44205399]]
144259.4  132582.27760816005  [[118661.40080974]]
146121.95  132447.73845175043  [[124952.97891882]]
77798.83  71976.09851258533  [[60680.01036438]]

这是y_test,scikit学习模型预测与自定义代码预测之间的比较

请帮助确保模型的准确性。

蓝色:自定义模型预测
黄色:scikit学习模型预测

0 个答案:

没有答案
相关问题