Question

I am using this code to run a LinearRegression:

from sklearn.linear_model import LinearRegression
import pandas as pd

def calculate_Intercept_X_Variable():
    list_a=[['2018', '3', 'aa', 'aa', 93,1884.7746222667, 165.36153386251098], ['2018', '3', 'bb', 'bb', 62, 665.6392779848, 125.30386609565328], ['2018', '3', 'cc', 'cc', 89, 580.2259903521, 160.19280253775514]]
    df = pd.DataFrame(list_a)
    X = df.iloc[:, 5]
    y = df.iloc[:, 6]
    clf = LinearRegression()
    clf.fit(X, y)

calculate_Intercept_X_Variable()

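But the error message is:

File "E:\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 181, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [1, 3]

What went wrong?

How should I modify my code?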

Answer 1

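From the scikit-learn documentation:

fit(X, y, sample_weight=None)

X : array-like or sparse matrix, shape (n_samples, n_features)
    Training data
y : array-like, shape (n_samples, n_targets)
    Target values. Will be cast to X's dtype if necessary

The problem is that X and y are both one-dimensional arrays here, while fit expects X to be two-dimensional with shape (n_samples, n_features). A one-dimensional X of length 3 is read as 1 sample with 3 features, which does not match the 3 samples in y; hence the [1, 3] in the error.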

You should reshape X and y. Their current shapes are:

X.shape, y.shape
# ((3,), (3,))
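One possible way to do the reshape is sketched below. It reuses the data and function name from the question; turning X into a column with `.values.reshape(-1, 1)` and returning the fitted intercept and coefficient are illustrative choices, not necessarily what the original answer went on to show. y is left one-dimensional here, since fit accepts a 1-D target.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

def calculate_Intercept_X_Variable():
    list_a = [['2018', '3', 'aa', 'aa', 93, 1884.7746222667, 165.36153386251098],
              ['2018', '3', 'bb', 'bb', 62, 665.6392779848, 125.30386609565328],
              ['2018', '3', 'cc', 'cc', 89, 580.2259903521, 160.19280253775514]]
    df = pd.DataFrame(list_a)

    # Turn the 1-D column into a 2-D array with one feature: shape (3, 1)
    X = df.iloc[:, 5].values.reshape(-1, 1)
    # The target can stay 1-D: shape (3,)
    y = df.iloc[:, 6].values

    clf = LinearRegression()
    clf.fit(X, y)
    # Returning the fitted parameters is an addition for illustration
    return clf.intercept_, clf.coef_

intercept, coef = calculate_Intercept_X_Variable()
print(intercept, coef)
```

Selecting the column with a list, as in `df.iloc[:, [5]]`, should also give a two-dimensional X without an explicit reshape.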
