Question

When I use random_state=10, y_test[1] works, but y_test[2], y_test[3], and so on do not. So I tried other random_state values (for example 42 or 20), but after the change y_test[10] does not work either.

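I am not sure whether the problem comes from y_test or from random_state. I would also like to understand why changing the value of random_state changes the r2_score as well.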

Thanks a lot.

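# df is assumed to be a pandas DataFrame that has already been loaded
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Feature matrix and target column
x = df[['mileage','engine_power','feature_1','feature_2','feature_3','feature_4','feature_5','feature_6','feature_7','feature_8','car_type']]
y = df['price']

# Hold out 20% of the rows for testing
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.2,random_state=10)

lr = LinearRegression()
lr.fit(x_train,y_train)
predict_lr = lr.predict(x_test)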

print('real value y_test[1]:'+str(y_test[1])+'  predict:'+str(lr.predict(x_test.iloc[[1],:])))
print('real value y_test[2]:'+str(y_test[2])+'  predict:'+str(lr.predict(x_test.iloc[[2],:])))
print('scort:',lr.score(x_test,y_test))
print('r2 score:',r2_score(y_test,predict_lr))

>>>real value y_test[1]:69700  predict:[12659.21124934]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-48-e556e02e9019> in <module>
      3 predict_lr = lr.predict(x_test)
      4 print('real value y_test[1]:'+str(y_test[1])+'  predict:'+str(lr.predict(x_test.iloc[[1],:])))
----> 5 print('real value y_test[2]:'+str(y_test[5])+'  predict:'+str(lr.predict(x_test.iloc[[2],:])))
      6 
      7 print('scort:',lr.score(x_test,y_test))
KeyError: 5
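
A note on the traceback: a likely cause, assuming df has the default integer index, is that train_test_split shuffles the rows but keeps their original index labels. y_test[5] is then a lookup by label, and it raises KeyError whenever the row labelled 5 happened to land in the training set. x_test.iloc[[2],:] on the other hand selects by position, so even when y_test[1] succeeds it may not be the same row as x_test.iloc[[1],:]. The sketch below uses a small made-up Series and DataFrame (not the car data) to show the difference.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data with the default integer index 0..19
x = pd.DataFrame({"mileage": range(20)})
y = pd.Series(range(100, 120), name="price")

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=10
)

# The test set keeps the original row labels, in shuffled order
test_labels = list(y_test.index)
print("labels in y_test:", test_labels)

# Positional access always works for 0 .. len(y_test) - 1
first_by_position = y_test.iloc[0]
# Label access only works for labels that are actually in the test set
first_by_label = y_test.loc[test_labels[0]]

# Any label that went to the training set raises KeyError here
missing = [label for label in y.index if label not in y_test.index]
try:
    y_test[missing[0]]
except KeyError as err:
    print("KeyError:", err)

# x_test and y_test stay aligned row by row, so use .iloc on both
for pos in range(len(y_test)):
    print(x_test.iloc[pos]["mileage"], y_test.iloc[pos])
```

With this reading, comparing y_test.iloc[i] against lr.predict(x_test.iloc[[i],:]) would compare the same row on both sides, whatever random_state is used.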

How should I choose random_state to get accurate results?
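
For context on this last point: random_state only controls which rows end up in the test set, so the R² score moves because the model is evaluated on different rows, not because one seed is more "accurate" than another. A common alternative to hunting for a good seed is to average the score over several splits with cross-validation. The sketch below is only an illustration under stated assumptions: it uses synthetic data from make_regression instead of the car data, and the seeds and fold count are arbitrary choices.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Synthetic regression data (an assumption, standing in for the real features)
X, y = make_regression(n_samples=200, n_features=5, noise=25.0, random_state=0)

# The same model scored on three different train/test splits
split_scores = {}
for seed in (10, 20, 42):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    split_scores[seed] = LinearRegression().fit(X_tr, y_tr).score(X_te, y_te)
print("R^2 per random_state:", split_scores)

# 5-fold cross-validation: every row is used for testing exactly once
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")
print("mean R^2:", cv_scores.mean(), "std:", cv_scores.std())
```

The mean and standard deviation across folds give a more stable picture of model quality than the score from any single split.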

0 answers: