Question

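I have several randomForest models that share the same independent variables (X) but have different dependent variables (Y). Is there a way to run these models efficiently without a cluster? At the moment I run them sequentially with lapply.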

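library(randomForest)
data(iris)

# Repeat the Species column four more times so columns 5 to 9 are the five responses
iris_new <- iris[, c(1:ncol(iris), rep(ncol(iris), 4))]

tt <- lapply(1:5, function(i) {
  # Regress response column 4 + i on the four predictor columns only
  f <- reformulate(colnames(iris_new)[1:4], response = colnames(iris_new)[4 + i])
  iris.rf <- randomForest(f, data = iris_new, importance = TRUE)
  # Return the prediction for row i so lapply collects it
  predict(iris.rf, iris_new[i, ])
})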

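In the example above there are five random forest models. The first four columns are the independent variables for every model, and the last five columns are the dependent variables, one per model.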

Answer 1

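One possible solution is to use %dopar% from the foreach package, together with a parallel backend such as doParallel. That lets you use all of your cores to grow trees. If you have four cores, each one can grow its own trees, and you can then merge the results with the combine function from the randomForest package.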
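A minimal sketch of that idea, applied to the five responses from the question. It assumes the foreach and doParallel packages are installed; the names n_cores, trees_each, predictors, responses and fits are made up for illustration, and the worker count of 2 is only a placeholder. Note that makeCluster here just starts local worker processes on your own machine, so no compute cluster is involved.

```r
library(randomForest)
library(foreach)
library(doParallel)

data(iris)
iris_new   <- iris[, c(1:ncol(iris), rep(ncol(iris), 4))]
predictors <- colnames(iris_new)[1:4]   # shared X columns
responses  <- colnames(iris_new)[5:9]   # one Y column per model

n_cores    <- 2                         # assumption: adjust to your machine
trees_each <- 500 / n_cores             # 500 trees per model in total

cl <- makeCluster(n_cores)              # local worker processes
registerDoParallel(cl)

fits <- list()
for (resp in responses) {
  # Each worker grows part of the forest; combine() merges the parts.
  fits[[resp]] <- foreach(ntree = rep(trees_each, n_cores),
                          .combine = randomForest::combine,
                          .multicombine = TRUE,
                          .packages = "randomForest") %dopar% {
    randomForest(x = iris_new[, predictors], y = iris_new[[resp]],
                 ntree = ntree, importance = TRUE)
  }
}

stopCluster(cl)

# Predictions from the first combined model
pred <- predict(fits[[1]], iris_new[, predictors])
```

One caveat: according to the combine documentation, the merged forest no longer carries the out-of-bag error components (err.rate, confusion), so if you need those you would have to estimate the error some other way. With only a handful of models and small data, the overhead of starting workers may outweigh the gain, so it is worth timing both versions.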

How to run randomForest efficiently without a cluster
