I am using this code to run a LinearRegression:
from sklearn.linear_model import LinearRegression
import pandas as pd

def calculate_Intercept_X_Variable():
    list_a = [['2018', '3', 'aa', 'aa', 93, 1884.7746222667, 165.36153386251098],
              ['2018', '3', 'bb', 'bb', 62, 665.6392779848, 125.30386609565328],
              ['2018', '3', 'cc', 'cc', 89, 580.2259903521, 160.19280253775514]]
    df = pd.DataFrame(list_a)
    X = df.iloc[:, 5]
    y = df.iloc[:, 6]
    clf = LinearRegression()
    clf.fit(X, y)

calculate_Intercept_X_Variable()
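But I get this error message:
File "E:\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 181, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [1, 3]
What is going wrong?
How should I change my code?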
Answer 0 (score: 1)
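The scikit-learn documentation says:
fit(X, y, sample_weight=None)
X: array-like or sparse matrix, shape (n_samples, n_features). Training data.
y: array-like, shape (n_samples, n_targets). Target values. Will be cast to X's dtype if necessary.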
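The problem is that X and y are currently one-dimensional arrays, whereas fit expects X to be two-dimensional with shape (n_samples, n_features).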
You should reshape X so that it is two-dimensional. As things stand, both arrays have only one dimension:
X.shape, y.shape
# ((3,), (3,))
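A minimal sketch of the reshaped version, using the same data as in the question. Here only X is reshaped to a single-column 2-D array; scikit-learn accepts a one-dimensional y, although reshaping y to (n_samples, 1) works too. The lower-cased function name and the returned intercept and slope are additions for illustration, not part of the original code.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression


def calculate_intercept_x_variable():
    list_a = [['2018', '3', 'aa', 'aa', 93, 1884.7746222667, 165.36153386251098],
              ['2018', '3', 'bb', 'bb', 62, 665.6392779848, 125.30386609565328],
              ['2018', '3', 'cc', 'cc', 89, 580.2259903521, 160.19280253775514]]
    df = pd.DataFrame(list_a)

    # reshape(-1, 1) turns the 1-D column of shape (3,) into a 2-D array
    # of shape (3, 1), i.e. (n_samples, n_features) with one feature.
    X = df.iloc[:, 5].to_numpy().reshape(-1, 1)
    # y may stay one-dimensional, shape (3,).
    y = df.iloc[:, 6].to_numpy()

    clf = LinearRegression()
    clf.fit(X, y)

    # With a 1-D y, intercept_ is a scalar and coef_ has shape (n_features,).
    return clf.intercept_, clf.coef_[0]


intercept, slope = calculate_intercept_x_variable()
print(intercept, slope)
```

Selecting the column with a list, as in df.iloc[:, [5]], would also work, since that returns a single-column DataFrame that is already two-dimensional.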