I have a random forest model that predicts house sale prices. The output of the tuned model is as follows:
> print(rf1)
Call:
randomForest(formula = price ~ ., data = train, mtry = 5, ntree = 300, importance = T, proximity = T, do.trace = T)
Type of random forest: regression
Number of trees: 300
No. of variables tried at each split: 5
Mean of squared residuals: 34804126985
% Var explained: 73.67
> cor(p2, test$price)
[1] 0.8523592
> caret::RMSE(p2, test$price)
[1] 197536.8
> mean(rf1$mse)
[1] 36350888740
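I would like to know whether these MSE and RMSE values are acceptable. What I generally know is that smaller is better, but here the values are very large. At the same time, R^2 is about 0.7367 (the "% Var explained" of 73.67), which looks quite good. Also, in a regression setting, does it make sense to treat the 0.8523592 returned by cor(p2, test$price) as an accuracy?
The code used after tuning the first random forest model is as follows: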
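One way to put the magnitudes in context: MSE is expressed in squared price units and RMSE in price units, so their size depends on the scale of the target. For the figures quoted above, sqrt(34804126985) is roughly 186559, which is of the same order as the test RMSE of 197536.8. The sketch below is only an illustration in base R on made-up data; `actual` and `predicted` are hypothetical stand-ins for test$price and p2, not the real data. It compares RMSE against the mean and standard deviation of the target and computes R^2 and the Pearson correlation, which measures linear association rather than accuracy.

```r
# Hypothetical stand-ins for test$price and p2 (simulated, NOT the real data).
set.seed(42)
actual    <- runif(200, min = 100000, max = 1500000)      # prices in currency units
predicted <- actual + rnorm(200, mean = 0, sd = 150000)   # noisy predictions

mse  <- mean((predicted - actual)^2)   # squared currency units, hence very large
rmse <- sqrt(mse)                      # back in currency units

# Scale-free views of the same error
rmse_over_mean <- rmse / mean(actual)  # error relative to a typical price
rmse_over_sd   <- rmse / sd(actual)    # error relative to the spread of prices
r_squared      <- 1 - sum((actual - predicted)^2) /
                      sum((actual - mean(actual))^2)
pearson_r      <- cor(predicted, actual)  # linear association, not an accuracy

# Expressing prices in thousands shrinks RMSE by 1000 but leaves the ratios unchanged
rmse_thousands <- sqrt(mean((predicted / 1000 - actual / 1000)^2))

print(round(c(rmse = rmse,
              rmse_over_mean = rmse_over_mean,
              rmse_over_sd = rmse_over_sd,
              r_squared = r_squared,
              pearson_r = pearson_r), 4))
```

With the real objects, the same ratios could be computed by substituting test$price for `actual` and p2 for `predicted`. Whether the resulting error is acceptable depends on the price range and on how the predictions will be used.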
# tuning the rf model
t <- tuneRF(train[, -1], train[, 1],
            ntreeTry = 300,
            plot = T,
            stepFactor = 0.5,
            trace = T)
?tuneRF
# rf model again
rf1 <- randomForest(price ~ ., data = train, mtry = 5, ntree = 300, importance = T,
                    proximity = T, do.trace = T)
In addition, the tuned model did not improve much over the first model.