Here are my data points.
data = [[25593.14, 39426.66],
[98411.00, 81869.75],
[71498.80, 62495.80],
[38068.00, 54774.00],
[58188.00, 43453.65],
[10220.00, 18465.25]]
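The data above is my dataset.
The x-coordinates are "salary" and the y-coordinates are "expenses".
I want to predict the expenses for a given salary, i.e. for a given x-coordinate.
Here is my sample code. Please help.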
from sklearn.linear_model import LinearRegression
data = [[25593.14, 39426.66],
[98411.00, 81869.75],
[71498.80, 62495.80],
[38068.00, 54774.00],
[58188.00, 43453.65],
[10220.00, 18465.25]]
salary=[]
expenses=[]
for dataset in data:
    # import pdb; pdb.set_trace()
    salary.append(dataset[0])
    expenses.append(dataset[1])
model = LinearRegression()
model.fit(salary, expenses)
prediction = model.predict([10200.00])
print(prediction)
The error I get:
ValueError: Expected 2D array, got 1D array instead:
array=[ 25593.14 98411. 71498.8 38068. 58188. 10220. ].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample
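Answer 0 (score: 4)
As the comments suggested, something like this would be a better way to handle the data you want to feed into a scikit-learn model. Another example is here.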
from sklearn.linear_model import LinearRegression
import numpy as np
data = np.array(
[[25593.14, 39426.66],
[98411.00, 81869.75],
[71498.80, 62495.80],
[38068.00, 54774.00],
[58188.00, 43453.65],
[10220.00, 18465.25]]
).T
salary = data[0].reshape(-1, 1)
expenses = data[1]
model = LinearRegression()
model.fit(salary, expenses)
prediction = model.predict(np.array([10200.00]).reshape(-1, 1))
print(prediction)
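To make the two reshape hints from the error message concrete, here is a minimal NumPy-only sketch. The variable names `as_samples` and `as_features` are made up for illustration and do not come from the question; the salary values are the ones from the question's data.

```python
import numpy as np

# The six salary values from the question, as a flat 1D array.
salary = np.array([25593.14, 98411.00, 71498.80, 38068.00, 58188.00, 10220.00])

# reshape(-1, 1): one column, so each value becomes its own sample (row).
as_samples = salary.reshape(-1, 1)

# reshape(1, -1): one row, so all values become features of a single sample.
as_features = salary.reshape(1, -1)

print(salary.shape)       # (6,)
print(as_samples.shape)   # (6, 1)
print(as_features.shape)  # (1, 6)
```

Since salary is a single feature observed six times, the `(6, 1)` layout is the one that matches this problem.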
Answer 1 (score: 1)
Working code:
from sklearn.linear_model import LinearRegression
import numpy as np
dataset = [[25593.14, 39426.66],
[98411.00, 81869.75],
[71498.80, 62495.80],
[38068.00, 54774.00],
[58188.00, 43453.65],
[10220.00, 18465.25]]
salary = np.array([data[0] for data in dataset]).reshape(-1,1)
expenses = np.array([data[1] for data in dataset]).reshape(-1,1)
model = LinearRegression()
model.fit(salary, expenses)
# predict() expects a 2D array of shape (n_samples, n_features), not a bare scalar
prediction = model.predict(np.array([[10200.00]]))
print(prediction)
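One thing worth knowing about this variant: reshaping `expenses` to a column is not required, and it changes the shape of what `predict` returns. The sketch below compares the two cases, assuming a reasonably recent scikit-learn; the names `X`, `y_1d`, `y_2d`, `pred_1d` and `pred_2d` are illustrative only.

```python
from sklearn.linear_model import LinearRegression
import numpy as np

dataset = np.array([[25593.14, 39426.66],
                    [98411.00, 81869.75],
                    [71498.80, 62495.80],
                    [38068.00, 54774.00],
                    [58188.00, 43453.65],
                    [10220.00, 18465.25]])

X = dataset[:, [0]]     # salary as a column, shape (6, 1)
y_1d = dataset[:, 1]    # expenses as a flat array, shape (6,)
y_2d = dataset[:, [1]]  # expenses as a column, shape (6, 1)

new_salary = np.array([[10200.00]])  # one sample with one feature

pred_1d = LinearRegression().fit(X, y_1d).predict(new_salary)
pred_2d = LinearRegression().fit(X, y_2d).predict(new_salary)

print(pred_1d.shape)  # (1,)
print(pred_2d.shape)  # (1, 1)
```

Both fits give the same predicted value; only the output shape differs, so a flat `y` is usually the simpler choice for a single target.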
Answer 2 (score: 0)
Quick fix: replace this line
model.fit(np.array(salary).reshape(-1, 1), np.array(expenses))
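X should be an array of arrays, array([arr1,arr2,array3,...]),
where arr1, arr2, and so on are each an array of at least one feature. The same goes for y: it should be an array containing the list of values.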
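The "array of arrays" layout can also be built without NumPy, since scikit-learn accepts nested lists. This is a minimal sketch using the question's `salary` and `expenses` lists; wrapping each salary in its own one-element list is my choice for illustration, not something from the original code.

```python
from sklearn.linear_model import LinearRegression

salary = [25593.14, 98411.00, 71498.80, 38068.00, 58188.00, 10220.00]
expenses = [39426.66, 81869.75, 62495.80, 54774.00, 43453.65, 18465.25]

# One inner list per sample, each holding that sample's single feature.
X = [[s] for s in salary]

model = LinearRegression()
model.fit(X, expenses)

# The value to predict for must use the same layout: a list of samples.
prediction = model.predict([[10200.00]])
print(prediction)
```

The input to `predict` follows the same rule as `X`, which is why the single salary is wrapped twice.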