"ValueError: Found arrays with inconsistent numbers of samples" when fitting a linear regression

Asked: 2017-07-25 10:43:58

Tags: python-3.x scikit-learn linear-regression data-science

I am trying to run the following code:

import numpy as np
from sklearn import linear_model

class MarketingCosts:

    def desired_marketing_expenditure(marketing_expenditure, units_sold, desired_units_sold):
        model = linear_model.LinearRegression()
        model.fit(units_sold, marketing_expenditure)
        output = model.predict(desired_units_sold)
        return output


print(MarketingCosts.desired_marketing_expenditure(
    [300000, 200000, 400000, 300000, 100000],
    [60000, 50000, 90000, 80000, 30000],
    60000))

However, running it produces the following error:

exec(code, run_globals) 
  File "marketingcosts.py", in  
    60000)) 
  File "marketingcosts.py", in desired_marketing_expenditure 
    model.fit(units_sold, marketing_expenditure) 
ValueError: Found arrays with inconsistent numbers of samples: [1 5]

Does anyone know why this happens? I also tried passing np.array objects as the arguments to model.fit, but that raises a similar error.

Thanks in advance.

1 Answer:

Answer 0: (score: 0)

This code works. You need to transpose the data so that each array is a column with one row per sample:

Xtrain = np.array([[300000, 200000, 400000, 300000, 100000],]).T
y = np.array([[60000, 50000, 90000, 80000, 30000],]).T
Xtest = np.array([[60000],]).T

print(MarketingCosts.desired_marketing_expenditure(y, Xtrain, Xtest))
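
As a rough illustration of why the transpose helps: scikit-learn estimators expect X as a 2-D array of shape (n_samples, n_features). The `[1 5]` in the error message suggests the plain list was interpreted as one sample with five features, set against five targets. The sketch below uses only NumPy, and its variable names are illustrative rather than taken from the question.

```python
import numpy as np

units_sold = [60000, 50000, 90000, 80000, 30000]

flat = np.array(units_sold)        # 1-D array, shape (5,)
row = np.array([units_sold])       # one row of five values, shape (1, 5)
column = np.array([units_sold]).T  # transposed: five rows, one column, shape (5, 1)
reshaped = flat.reshape(-1, 1)     # alternative that produces the same column

print(flat.shape, row.shape, column.shape, reshaped.shape)
```

Either the `.T` on a nested list or `reshape(-1, 1)` on a flat array yields the (5, 1) layout that `fit` and `predict` accept.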

I had the arguments in the wrong order; that is fixed now. The output is now 22000, which looks fine.
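
For comparison, here is a minimal sketch that keeps the argument order from the question and does the reshaping inside the function, so callers can keep passing plain lists and a scalar. It assumes a reasonably recent scikit-learn and is written as a standalone function rather than a method of the question's class; the `X`, `y`, and `query` names are illustrative.

```python
import numpy as np
from sklearn import linear_model


def desired_marketing_expenditure(marketing_expenditure, units_sold, desired_units_sold):
    # Feature matrix: one column (units sold), one row per observation.
    X = np.asarray(units_sold, dtype=float).reshape(-1, 1)
    # Target vector: the marketing expenditure for each observation.
    y = np.asarray(marketing_expenditure, dtype=float)

    model = linear_model.LinearRegression()
    model.fit(X, y)

    # predict() also expects a 2-D array, so the single value becomes shape (1, 1).
    query = np.asarray(desired_units_sold, dtype=float).reshape(-1, 1)
    return model.predict(query)


result = desired_marketing_expenditure(
    [300000, 200000, 400000, 300000, 100000],
    [60000, 50000, 90000, 80000, 30000],
    60000)
print(result)
```

Note that this fits marketing expenditure as a function of units sold, as the question's parameter names suggest, so it returns approximately 250877 rather than 22000. The 22000 figure comes from a fit in the opposite direction (units sold predicted from an expenditure of 60000).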