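Linear regression predictions are very inaccurate

Time: 2019-11-18 07:32:47

Tags: python-3.x machine-learning linear-regression

I am using the CSV from https://gist.github.com/netj/8836201 and trying to predict the species, which is categorical data, with linear regression, but the predictions come out very inaccurate. The actual labels are just combinations of 0.0 and 1, yet the predictions are values like 0.something and 1.something, and some are even negative, which I think is very inaccurate. What mistake am I making, and how can I fix this inaccuracy? This is an assignment from my teacher, who said that categorical data can be predicted with linear regression, not only with logistic regression.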

import pandas as pd
from sklearn import model_selection
from sklearn.linear_model import LinearRegression
from sklearn import preprocessing
from sklearn import metrics

# Load the CSV and split it into features and target
path = r"D:\python projects\iris.csv"
df = pd.read_csv(path)
array = df.values
X = array[:, 0:3]  # first three feature columns
y = array[:, 4]    # class label column

# Encode the string labels as integers, then one-hot encode them
le = preprocessing.LabelEncoder()
ohe = preprocessing.OneHotEncoder()
y = le.fit_transform(y)
y = y.reshape(-1, 1)
y = ohe.fit_transform(y).toarray()
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2, random_state=0)

# Standardize the features
sc = preprocessing.StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# This re-fits the same scaler on the one-hot targets
y_train = sc.fit_transform(y_train)

model = LinearRegression(n_jobs=-1).fit(X_train, y_train)
y_pred = model.predict(X_test)
df = pd.DataFrame({'Actual': X_test.flatten(), 'Predicted': y_pred.flatten()})

Output:

y_pred
Out[46]: 
array([[-0.08676055,  0.43120144,  0.65555911],
       [ 0.11735424,  0.72384335,  0.1588024 ],
       [ 1.17081347, -0.24484483,  0.07403136],
X_test
Out[61]: 
array([[-0.09544771, -0.58900572,  0.72247648],
       [ 0.14071157, -1.98401928,  0.10361279],
       [-0.44968663,  2.66602591, -1.35915595],

1 answer:

Answer 0 (score: 1)

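Linear regression is meant for predicting continuous output data. As you rightly say, you are trying to predict categorical (discrete) output data. In essence, you want to do classification rather than regression, and linear regression is not suited to that.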
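To illustrate why the raw outputs look "inaccurate", here is a minimal sketch, assuming scikit-learn's bundled iris data as a stand-in for the CSV in the question. It fits linear regression on one-hot targets without scaling the targets, then takes the column with the largest score as the predicted class. The names `scores`, `labels` and `Y_train` are only illustrative, and this is one possible way to read the assignment, not necessarily what the teacher had in mind.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled iris data stands in for the CSV from the question.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale the features only; the targets are left untouched.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# One-hot (indicator) targets: one 0/1 column per class.
Y_train = np.eye(3)[y_train]

model = LinearRegression().fit(X_train, Y_train)
scores = model.predict(X_test)   # continuous values, not confined to 0 and 1
labels = scores.argmax(axis=1)   # class with the largest score per row

accuracy = (labels == y_test).mean()
print(scores[:3])
print("accuracy:", accuracy)
```

The values in `scores` are still real numbers rather than 0s and 1s, which is expected from a regression model; only the `argmax` step turns them into class labels.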

As you also mention, logistic regression can and should be used instead, since it is suited to classification tasks.
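A minimal sketch of the logistic regression route, again assuming scikit-learn's bundled iris data in place of the CSV from the question. The pipeline layout and `max_iter=1000` are my own choices for illustration, not something taken from the question.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled iris data stands in for the CSV from the question.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The scaler is fitted on the training features only; labels stay as integers.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)        # discrete class labels
proba = clf.predict_proba(X_test)   # per-class probabilities, each row sums to 1

accuracy = accuracy_score(y_test, y_pred)
print("accuracy:", accuracy)
```

Here `predict` returns class labels directly, and `predict_proba` gives values between 0 and 1, so there are no negative outputs to interpret.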
