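Is it possible to get the OOB samples that the random forest algorithm uses for each tree? I am using R. I know that the random forest algorithm uses roughly 66% of the data (randomly selected) to grow each tree and the remaining 34% as OOB samples to measure the OOB error, but I don't know how to get those OOB samples for each tree.
Any ideas?
Answer 0 (score: 1)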
Assuming you are using the randomForest package, you just need to set the keep.inbag argument to TRUE.

library(randomForest)
set.seed(1)
rf <- randomForest(Species ~ ., iris, keep.inbag = TRUE)

The output list will contain an n by ntree matrix that can be accessed by the name inbag.

dim(rf$inbag)
# [1] 150 500
rf$inbag[1:5, 1:3]
#   [,1] [,2] [,3]
# 1    0    1    0
# 2    1    1    0
# 3    1    0    1
# 4    1    0    1
# 5    0    0    2

The values in the matrix tell you how many times a sample was in-bag. For example, the value of 2 in row 5 column 3 above says that the 5th observation was included in-bag twice for the 3rd tree.
As a bit of background here, a sample can show up in-bag more than once (hence the 2) because by default the sampling is done with replacement.
You can also sample without replacement via the replace parameter.

set.seed(1)
rf2 <- randomForest(Species ~ ., iris, keep.inbag = TRUE, replace = FALSE)

And now we can verify that without replacement, the maximum number of times any sample is included is once.
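
One way to run that check, and to pull out the per-tree OOB rows the question asks about, is sketched below. It assumes, as described above, that an entry of 0 in the inbag matrix means the observation was not drawn for that tree. The names oob_rows_tree3, oob_by_tree and oob_data_tree3 are illustrative only and are not part of the randomForest package.

```r
library(randomForest)

set.seed(1)
rf2 <- randomForest(Species ~ ., iris, keep.inbag = TRUE, replace = FALSE)

# Largest in-bag count over all observations and all trees;
# without replacement this should not exceed 1.
max(rf2$inbag)

# OOB observations for one tree are the rows whose in-bag count is zero.
oob_rows_tree3 <- unname(which(rf2$inbag[, 3] == 0))

# The same for every tree: a list holding one integer vector of row indices per tree.
oob_by_tree <- lapply(
  seq_len(ncol(rf2$inbag)),
  function(k) unname(which(rf2$inbag[, k] == 0))
)

# Recover the actual OOB data for a given tree from the training data.
oob_data_tree3 <- iris[oob_by_tree[[3]], ]
```

The same indexing works on the rf object fitted with replacement, since an OOB row is still one with a zero count, whatever the counts of the in-bag rows are.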