Sklearn.linear_model: ValueError: Found input variables with inconsistent numbers of samples: [1, 20]

Time: 2017-12-22 19:31:14

Tags: python pandas machine-learning regression linear-regression

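I am trying to implement linear regression, but when I run the code I get this error: ValueError: Found input variables with inconsistent numbers of samples: [1, 20]. It is raised at the line linear.fit(x_train1,y_train1). Both x and x_train1 are Series, and y is a Series as well.

When I changed x to x=dataset.iloc[:,:-1], so that x and x_train1 became DataFrames (y is still a Series), it worked.

So why does it work only when x is a DataFrame, even though y is still a Series?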

import pandas as pd
import numpy as np
import matplotlib.pyplot

dataset=pd.read_csv('Salary_Data.csv')

x=dataset.iloc[:,0]

y=dataset.iloc[:,1]

from sklearn.model_selection import train_test_split
x_train1,x_test1,y_train1,y_test1=train_test_split(x,y,test_size=1/3,random_state=0)

#implementing simple linear regression
from sklearn.linear_model import LinearRegression

linear=LinearRegression()

linear.fit(x_train1,y_train1)

y_pred=linear.predict(x_test1)

1 Answer:

Answer 0 (score: 0)

Scikit-Learn does not accept a rank-1 array (1-dimensional data) for X. That is, if you check the shape attribute of x:

x.shape

it will return something like (23,), where 23 is the number of rows. The shape should be (23, 1).
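This also explains why the DataFrame version works while y can stay a Series: selecting a column with an integer (iloc[:,0]) returns a 1-D Series, whereas selecting with a slice (iloc[:,:-1]) returns a 2-D DataFrame, which is the (n_samples, n_features) layout scikit-learn expects for X. The target y is expected to be 1-D, so a Series is fine there. The [1, 20] in the error message suggests that the 1-D x was treated as a single sample while y had 20 samples. A minimal sketch of the shape difference follows; the small DataFrame and its column names are made up for illustration and only stand in for Salary_Data.csv.

```python
import pandas as pd

# Hypothetical stand-in for Salary_Data.csv: one feature column, one target column.
dataset = pd.DataFrame({
    "YearsExperience": [1.1, 2.0, 3.2, 4.5, 5.9, 7.1],
    "Salary": [39000.0, 43500.0, 54400.0, 61100.0, 81300.0, 98200.0],
})

x_series = dataset.iloc[:, 0]    # integer column index -> pandas Series (1-D)
x_frame = dataset.iloc[:, :-1]   # column slice -> pandas DataFrame (2-D)
y = dataset.iloc[:, 1]           # 1-D target, which is what scikit-learn expects for y

series_shape = x_series.shape
frame_shape = x_frame.shape
target_shape = y.shape

print(type(x_series).__name__, series_shape)  # Series (6,)
print(type(x_frame).__name__, frame_shape)    # DataFrame (6, 1)
print(type(y).__name__, target_shape)         # Series (6,)
```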

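To make it work, try using reshape. Current pandas versions no longer provide Series.reshape, so reshape the underlying NumPy array obtained through .values:

x = dataset.iloc[:,0]
x = x.values.reshape((len(x),1))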
...
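For reference, here is a minimal end-to-end sketch of the reshape approach. It assumes pandas, NumPy, and scikit-learn are installed. The synthetic data (30 rows, exactly linear, with made-up column names) is an assumption that stands in for Salary_Data.csv, which is not available here. The variable names follow the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical data standing in for Salary_Data.csv.
# It is exactly linear so the fit is easy to sanity-check.
years = np.arange(1, 31, dtype=float)
dataset = pd.DataFrame({"YearsExperience": years,
                        "Salary": 30000.0 + 5000.0 * years})

# .values returns a NumPy array; reshape(-1, 1) turns shape (n,) into (n, 1).
x = dataset.iloc[:, 0].values.reshape(-1, 1)
y = dataset.iloc[:, 1].values  # the target stays 1-D

x_train1, x_test1, y_train1, y_test1 = train_test_split(
    x, y, test_size=1/3, random_state=0)

linear = LinearRegression()
linear.fit(x_train1, y_train1)     # X is 2-D, so no sample-count mismatch
y_pred = linear.predict(x_test1)

print(x_train1.shape, y_train1.shape, y_pred.shape)
```

An alternative that avoids reshape is to keep x as a one-column DataFrame, for example with dataset.iloc[:, [0]] or dataset.iloc[:, :-1] as in the question.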