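As a newcomer to the world of random forests, I have a (basic) question. Following Zhao, Su, Tingting, and Fan (2016), I am trying the following procedure: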
For b = 1, ..., B, do
    Using all data, grow a tree with treatment as output and all other covariates as inputs.
    Compute % of treated observations s_i for the terminal node to which i belongs.
    Update S_i = S_i + s_i
end
prop_score = S_i / B
To run this algorithm, I am using the randomForest function:
model1 = randomForest(treatment ~ ., data = dataset, keep.forest = T, ntree = 100, na.action = na.omit)
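That said, could you tell me how to find out which terminal node each observation i belongs to? Also, is there a recommended package for handling fairly large data? randomForest freezes unless I limit the number of trees to a small number.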
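
For reference, here is a minimal sketch of what I have in mind, not a verified implementation of the paper's method. It relies on predict() for randomForest objects accepting nodes = TRUE, which attaches an n-by-ntree matrix of terminal node indicators as the "nodes" attribute when newdata is supplied. The toy data, the 0/1 coding of treatment, and the settings replace = FALSE, sampsize = nrow(dataset), and nodesize = 25 are my own assumptions (the last two so that every tree sees all rows and terminal nodes are not pure).

```r
library(randomForest)

set.seed(1)

# Hypothetical toy data standing in for `dataset`:
# a binary treatment (coded 0/1) and two covariates.
n <- 500
dataset <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dataset$treatment <- factor(
  rbinom(n, 1, plogis(0.8 * dataset$x1 - 0.5 * dataset$x2))
)

B <- 100
model1 <- randomForest(
  treatment ~ ., data = dataset,
  ntree = B, keep.forest = TRUE,
  replace = FALSE, sampsize = nrow(dataset),  # assumption: each tree uses all rows
  nodesize = 25                               # assumption: avoid pure terminal nodes
)

# n x B matrix: entry [i, b] is the terminal node of observation i in tree b.
# newdata must be supplied; otherwise predict() returns the stored predictions
# without the "nodes" attribute.
nodes <- attr(predict(model1, newdata = dataset, nodes = TRUE), "nodes")

treated <- as.numeric(dataset$treatment == "1")

# For each tree, s_i is the share of treated observations in i's terminal node;
# accumulate over trees and average.
S <- numeric(nrow(dataset))
for (b in seq_len(B)) {
  s <- ave(treated, nodes[, b], FUN = mean)
  S <- S + s
}
prop_score <- S / B
```

If the data contain missing values dropped by na.omit, the rows of the node matrix would need to be aligned with the rows actually used in fitting. For larger data, the ranger package may be worth a look; its predict method accepts type = "terminalNodes", though I have not checked whether it fixes the freezing I see here.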