Regression model predicts the same value every time

Asked: 2020-07-06 07:30:33

Tags: python machine-learning regression prediction sklearn-pandas

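I have tried several regression algorithms, including gradient boosting, random forest, and decision tree. I have also tried scaling the features with both a min-max scaler and a standard scaler.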

from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training features only, then reuse it for the other sets.
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
to_predict = scaler.transform(to_predict)

Some values of x_train after scaling:

       [-5.80641310e-01, -5.15194242e-01, -4.57304981e-01],
       [-5.99982344e-01, -5.74535988e-01, -4.57304981e-01],
       [-5.99982344e-01, -5.74535988e-01, -4.57304981e-01],
       [ 5.08258952e-01,  1.02769113e+00,  9.71784878e-03],
       [-5.99982344e-01, -5.74535988e-01, -4.57304981e-01],
       [-5.84509517e-01, -5.74535988e-01, -4.57304981e-01],
       [-5.98048241e-01, -5.74535988e-01, -4.57304981e-01]

Some values of x_test after scaling:

       [-0.59998234, -0.57453599, -0.45730498],
       [ 0.97244379,  2.15518429, -0.02141701],
       [ 2.50812195,  2.74860174,  3.18547309],
       [ 0.33612374,  0.37493194, -0.05255186],
       [-0.43364944, -0.51519424, -0.43862407],
       [ 2.37273471,  2.57057651,  3.92025568]

Some values of the data I need to predict (to_predict) after scaling:

       [ 10.46308958,   7.37725787,  18.92725594],
       [ 11.04912294,   8.2080423 ,  19.03934142],
       [ 12.04131803,   7.85199183,  19.96716011],
       [ 12.29468558,   7.85199183,  15.58337248],
       [ 13.15342753,   8.68277626,  19.99829496],
       [ 11.8053574 ,   8.32672579,  18.29833186],
       [ 10.82476694,   9.69158593,  21.86638628]
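
One way to quantify how far these rows sit from the training data is to compare them with the per-feature minimum and maximum of the training set. The following is only a minimal sketch: the helper name fraction_outside_training_range is hypothetical, and the two small arrays are a handful of rows copied from the samples above, not the full data.

```python
import numpy as np

def fraction_outside_training_range(train, new):
    """Per feature, return the share of rows in `new` that fall
    outside the [min, max] interval observed in `train`."""
    train = np.asarray(train, dtype=float)
    new = np.asarray(new, dtype=float)
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    outside = (new < lo) | (new > hi)
    return outside.mean(axis=0)

# A few scaled rows copied from the samples above (illustrative only).
train_sample = np.array([
    [-0.58064131, -0.51519424, -0.45730498],
    [-0.59998234, -0.57453599, -0.45730498],
    [0.50825895, 1.02769113, 0.00971785],
])
new_sample = np.array([
    [10.46308958, 7.37725787, 18.92725594],
    [11.04912294, 8.2080423, 19.03934142],
])

result = fraction_outside_training_range(train_sample, new_sample)
print(result)  # 1.0 for a feature means every new row is out of range
```

Running the same check on the full x_train and to_predict arrays would show whether this holds for all rows or only for the ones listed here.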

Gradient boosting regressor:

from sklearn.ensemble import GradientBoostingRegressor

grad = GradientBoostingRegressor(n_estimators=500, random_state=100, learning_rate=1, max_depth=10, min_samples_leaf=3, min_samples_split=12)
grad.fit(x_train, y_train)

mae test: 0.03380270193992188
mse test: 0.0025669439864247356
rmse test: 0.05066501738304977
r2 test: 0.80834162616979
mae train: 0.02458157407407408
mse train: 0.0025056432439638046
rmse train: 0.05005640062932816
r2 train: 0.815567744395517

Some test set predictions: -0.09690972, -0.09690972, -0.09690972, 0.14249752

Some training set predictions: -0.11616, 0.068165, 0.048538, -0.09690972

Some predictions for the data I need to predict: 0.124851, 0.124851, 0.124851, 0.124851

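The trained model performs well on both the training set and the test set, but it predicts the same constant for every row I actually need to predict. This may be because the scaled training and test values are of the same order of magnitude (roughly -0.6 to 4 in the samples above), whereas the scaled data I want to apply the model to has much larger values (roughly 7 to 22). I don't know how to fix this.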
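
The suspicion above can be reproduced on synthetic data. The sketch below is not the original data set or model configuration: the single feature, the linear target, and all variable names are assumptions made for illustration. It fits a gradient boosting regressor and a linear regression on inputs in [0, 1], then predicts for inputs far outside that interval.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic training data: one feature in [0, 1], target roughly linear in it.
X_train = rng.uniform(0.0, 1.0, size=(200, 1))
y_train = 3.0 * X_train[:, 0] + rng.normal(0.0, 0.05, size=200)

# New rows far outside the training range, similar in spirit to
# the scaled values of 7 to 22 seen in the data to predict.
X_new = np.array([[10.0], [12.0], [15.0], [20.0]])

tree_model = GradientBoostingRegressor(n_estimators=100, random_state=0)
tree_model.fit(X_train, y_train)
linear_model = LinearRegression().fit(X_train, y_train)

tree_pred = tree_model.predict(X_new)
linear_pred = linear_model.predict(X_new)

# Every split threshold in the trees lies inside the training range, so all
# out-of-range rows follow the same path and receive the same prediction.
print("gradient boosting:", tree_pred)
# The linear model keeps following its fitted slope outside the range.
print("linear regression:", linear_pred)
```

If the real data behaves the same way, the constant output would be a property of tree-based models on out-of-range inputs rather than of a particular hyperparameter setting. That is consistent with the guess above but does not prove it for this data set.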

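If I change the hyperparameters, the predictions just switch to a different constant. This happens with every regression algorithm I have tried. How can I fix this?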

0 answers:

No answers yet