Question

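I'm trying to train a simple linear model, but it keeps raising an error because it doesn't like the shape of the data. Does anyone know how to make this work? I've tried various reshaping approaches (e.g. .reshape or .values.reshape), but I run into problems every time.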

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd 
import sklearn.linear_model 

#create data 
df1 = pd.DataFrame()
df1['First']=[1,2,3,4,5,6,7,8,9,10]
df1['Second']=[2,4,6,8,10,12,14,16,18,20]

x=df1['First']
y=df1['Second']

#select model 
model = sklearn.linear_model.LinearRegression()

#train
model.fit(x,y)

#predict 
X_new = [[5]]
print(model.predict(X_new))

Answer 1

scikit-learn estimators such as LinearRegression always expect X as a 2D array-like of shape (n_samples, n_features). Calling X.reshape((-1, 1)) turns the 1D array of 10 values into a 10 x 1 array.

import numpy as np
# import pandas as pd
from sklearn.linear_model import LinearRegression

# data
X = np.asarray([1,2,3,4,5,6,7,8,9,10])
y = np.asarray([2,4,6,8,10,12,14,16,18,20])

# create DataFrame
# df1 = pd.DataFrame(data={'First': X, 'Second': y})

# select model 
model = LinearRegression()
# train model -> 2D array for X
model.fit(X.reshape((-1, 1)), y)

# predict 
X_new = [[5]]
print(model.predict(X_new))
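
To see what the reshape actually changes, here is a quick shape check. This is only a sketch using NumPy; the name X_2d is mine and is not part of the code above.

```python
import numpy as np

X = np.asarray([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# -1 lets NumPy infer the number of rows; 1 fixes a single feature column
X_2d = X.reshape(-1, 1)

print(X.shape)     # (10,)   -> 1D: 10 values, no feature axis
print(X_2d.shape)  # (10, 1) -> 2D: 10 samples, 1 feature
```

The 1D version is what fit() rejects; the 2D version is the layout it expects.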

Also, you don't need the pandas.DataFrame at all for this (although I assume you'll need it later).
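
If you do want to keep the DataFrame, one option is to select the feature column with double brackets, which gives a one-column DataFrame (2D) instead of a Series (1D). A minimal sketch, assuming the same column names as in the question; the variable prediction is just for illustration:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df1 = pd.DataFrame({
    "First": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Second": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
})

# Double brackets select a one-column DataFrame of shape (10, 1),
# whereas df1["First"] would be a 1D Series of shape (10,).
X = df1[["First"]]
y = df1["Second"]

model = LinearRegression()
model.fit(X, y)

# Predict with a DataFrame that has the same column name as the training data
X_new = pd.DataFrame({"First": [5]})
prediction = model.predict(X_new)
print(prediction)  # approximately [10.], since Second = 2 * First
```

Something like df1["First"].to_numpy().reshape(-1, 1) should work as well if you prefer a plain NumPy array.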
