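I think the scoring code for my random forest regressor may be wrong. I would like to double-check the code to understand why I get an R2 of 1 on the test set.
I set up the following scores to assess the model's predictive power:
Training set performance
Test set performance (the true performance)
I expected 6 and 7 to be the same, even though I compute them in different ways. Finally, this is a multi-output regression with 5 dependent variables.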
1. Random forest regression
from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor(n_estimators=400, random_state=0,
                           max_depth=None,
                           max_features=1.0,  # all features; same as the former 'auto' for regressors
                           oob_score=True, bootstrap=True)
rf.fit(Xtrain, ytrain)
ypred = rf.predict(Xtest)
ypred_train = rf.predict(Xtrain)
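Because `oob_score = True` is set, the fitted forest also exposes out-of-bag predictions in `oob_prediction_`, and `oob_score_` should be the R2 of those predictions against the training targets. A minimal sketch on synthetic data (the data, sizes and `*_oob` names are assumptions for illustration, not my actual dataset):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Synthetic stand-in for the real data: 5 targets, as in the question.
X_oob, y_oob = make_regression(n_samples=200, n_features=8, n_targets=5,
                               noise=10.0, random_state=0)

rf_oob = RandomForestRegressor(n_estimators=100, oob_score=True,
                               bootstrap=True, random_state=0)
rf_oob.fit(X_oob, y_oob)

# Recompute the OOB R2 by hand from the out-of-bag predictions
# (r2_score averages uniformly over the 5 targets by default).
oob_r2_manual = r2_score(y_oob, rf_oob.oob_prediction_)

print("%0.3f = OOB R2 (attribute)" % rf_oob.oob_score_)
print("%0.3f = OOB R2 (recomputed)" % oob_r2_manual)
```

Each training row is scored only by the trees that did not see it during bootstrapping, so this value behaves more like a validation score than a training score.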
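2. Define the performance metrics
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Evaluation for the training set
def evaluate_train(rf, Xtrain, ytrain):
    ypred_train = rf.predict(Xtrain)  # predict inside the function instead of relying on a global
    mse = 100 * mean_squared_error(ytrain, ypred_train)  # MSE scaled by 100
    rmse = np.sqrt(mse)  # square root of the scaled MSE, i.e. 10 times the unscaled RMSE
    print("Model Performance on Training")
    print("%0.1f = Mean Squared Error" % mse)
    print("%0.1f = RMSE" % rmse)

# Evaluation for the test set
def evaluate_test(rf, Xtest, ytest):
    ypred = rf.predict(Xtest)  # predict inside the function instead of relying on a global
    mse = 100 * mean_squared_error(ytest, ypred)  # MSE scaled by 100
    rmse = np.sqrt(mse)  # square root of the scaled MSE, i.e. 10 times the unscaled RMSE
    r2 = r2_score(ytest, ypred, multioutput='uniform_average')  # mean R2 over the 5 targets
    print("Model Performance on Test")
    print("%0.1f = Mean Squared Error" % mse)
    print("%0.1f = RMSE" % rmse)
    print("%0.1f = R2 test" % r2)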
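One thing worth ruling out before suspecting the metric itself: `%0.1f` keeps only one decimal place, so any R2 of 0.95 or above is printed as 1.0. A minimal sketch with a made-up value (0.96 is purely illustrative, not a result from the model above):

```python
# Hypothetical R2 value, chosen only to illustrate the rounding.
r2_example = 0.96

one_decimal = "%0.1f = R2 test" % r2_example
three_decimals = "%0.3f = R2 test" % r2_example

print(one_decimal)     # 1.0 = R2 test
print(three_decimals)  # 0.960 = R2 test
```

If this is what is happening, printing the R2 with three decimals, as is already done for the OOB score, would show the actual value.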
3. Call the functions to evaluate the model's performance
evaluate_train(rf, Xtrain, ytrain)
print("%0.3f = OOB R2 Score"%(rf.oob_score_))
evaluate_test(rf, Xtest, ytest)
print("%0.3f = Test R2 Score"%(rf.score(Xtest, ytest)))
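For reference, the two test-set R2 values are expected to match: since scikit-learn 0.23, `score` on a multi-output regressor uses `r2_score` with `multioutput='uniform_average'`. A minimal sketch on synthetic data (the data, sizes and `*_demo` names are assumptions for illustration, not my actual dataset), which also prints the per-target R2 via `multioutput='raw_values'`:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 5 targets, as in the question.
X_demo, y_demo = make_regression(n_samples=300, n_features=8, n_targets=5,
                                 noise=10.0, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo,
                                      test_size=0.25, random_state=0)

rf_demo = RandomForestRegressor(n_estimators=50, random_state=0)
rf_demo.fit(Xtr, ytr)
pred_demo = rf_demo.predict(Xte)

# Same quantity computed two ways, plus the per-target breakdown.
r2_avg = r2_score(yte, pred_demo, multioutput="uniform_average")
r2_per_target = r2_score(yte, pred_demo, multioutput="raw_values")
r2_from_score = rf_demo.score(Xte, yte)

print("%0.3f = R2 via r2_score" % r2_avg)
print("%0.3f = R2 via rf.score" % r2_from_score)
print("per-target R2:", np.round(r2_per_target, 3))
```

The per-target values can reveal whether one of the five outputs is fit much better or worse than the averaged score suggests.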
With this code, these are the results I get:
Model Performance on Training
Model Performance on Test
Edit