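I can't get parRF to work, even though other things such as parApply work fine. I tried makeCluster as well as makePSOCKcluster and a few similar variations.
It keeps returning the error "task 1 failed - could not find function getDoParWorkers":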
library(caret)
library(doParallel)
cores_2_use <- detectCores() - 2
cl <- makeCluster(cores_2_use, useXDR = FALSE)
clusterSetRNGStream(cl, 9956)
registerDoParallel(cl, cores_2_use)
rf_train <- train(y = y, x = x,
                  method = 'parRF', tuneGrid = data.frame(mtry = ncol(x)), na.action = na.omit,
                  trControl = trainControl(method = 'oob', number = 10, allowParallel = TRUE)
)
Error in { : task 1 failed - "could not find function "getDoParWorkers""
Answer 0 (score: 4)
I can reproduce your error message. Getting around it takes a bit of hacking. I'm not sure whether this is a bug or something else.
But I managed to get it working by copying the model and adjusting its fit function: I added require(foreach) to the fit function.
Strangely, once train has been run with the new parRF_mod as the method, the original train call that produced the error runs without any error. Starting from a clean session, the error appears again. So something is not behaving as it should.
The code I used:
library(doParallel)
cl = makeCluster(parallel::detectCores()-1, type = "SOCK")
registerDoParallel(cl)
getDoParWorkers()
library(caret)
library(randomForest)
y <- mtcars$mpg
# drop the response column by name; -mtcars$mpg would index columns by the mpg values
x <- mtcars[, names(mtcars) != "mpg"]
parRF_mod <- getModelInfo("parRF", regex = FALSE)[[1]]
parRF_mod$fit <- function(x, y, wts, param, lev, last, classProbs, ...) {
  # added: attach foreach so that getDoParWorkers() can be found
  require(foreach)
  workers <- getDoParWorkers()
  theDots <- list(...)
  # default to 250 trees in total when ntree is not supplied
  theDots$ntree <- if (is.null(theDots$ntree)) 250 else theDots$ntree
  theDots$x <- x
  theDots$y <- y
  theDots$mtry <- param$mtry
  # split the trees evenly across the workers
  theDots$ntree <- ceiling(theDots$ntree / workers)
  # grow one sub-forest per worker and merge them with randomForest::combine
  out <- foreach(ntree = 1:workers, .combine = combine) %dopar% {
    library(randomForest)
    do.call("randomForest", theDots)
  }
  out$call["x"] <- "x"
  out$call["y"] <- "y"
  out
}
rf_train <- train(y = y, x = x,
                  method = parRF_mod, tuneGrid = data.frame(mtry = ncol(x)), na.action = na.omit,
                  trControl = trainControl(method = 'oob', number = 10, allowParallel = TRUE)
)
stopCluster(cl)
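To illustrate why the added require(foreach) seems to help: worker processes start as fresh R sessions, so a package attached in the main session is not automatically attached on the workers. The following is only a minimal sketch under that assumption, not a diagnosis of caret itself. It uses just the parallel package, assumes foreach is installed, and the choice of 2 workers is arbitrary.

```r
library(parallel)

cl <- makePSOCKcluster(2)  # 2 workers, an arbitrary number for illustration

# Is getDoParWorkers visible on each worker before foreach is attached there?
before <- unlist(clusterEvalQ(cl, exists("getDoParWorkers")))

# Attach foreach on every worker, then check again
invisible(clusterEvalQ(cl, library(foreach)))
after <- unlist(clusterEvalQ(cl, exists("getDoParWorkers")))

print(before)
print(after)

stopCluster(cl)
```

If the first vector shows that the function is missing and the second shows that it is found, that is consistent with the error coming from the workers rather than from the main session.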
Answer 1 (score: 4)
Update: Topepo has updated the code on GitHub to fix this bug! Just run install_github("/topepo/caret/pkg/caret/")
My previous answer is now obsolete:
Someone on GitHub also proposed this workaround:
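# parallel
library(caret)
library(doParallel)
cl <- makePSOCKcluster(detectCores())
# attach foreach on every worker so getDoParWorkers() is available there
clusterEvalQ(cl, library(foreach))
registerDoParallel(cl)
y <- mtcars$mpg
# drop the response column by name
x <- mtcars[, names(mtcars) != "mpg"]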
#--------------------------------------------------------------
rf_train <- train(y = y, x = x,
                  method = 'parRF', tuneGrid = data.frame(mtry = ncol(x)), na.action = na.omit,
                  trControl = trainControl(method = 'oob', number = 10, allowParallel = TRUE)
)
rf_train
#--------------------------------------------------------------
stopCluster(cl)
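Be sure to start from a fresh session before running this version of the code. Even after calling stopCluster(cl) and stopImplicitCluster() following another parRF attempt, this approach did not work for me until I completely restarted R and RStudio.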
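When it is unclear what state a session has been left in, it can help to inspect which foreach backend is currently registered. The sketch below uses getDoParRegistered(), getDoParName(), getDoParWorkers() and registerDoSEQ() from foreach; the 2-worker cluster is an arbitrary choice. I have not verified that resetting the backend this way removes the need for a full restart in the parRF case, so treat it as a diagnostic aid only.

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makePSOCKcluster(2)  # arbitrary worker count for illustration
registerDoParallel(cl)

# Inspect the backend that foreach will use for %dopar%
registered_par <- getDoParRegistered()
name_par <- getDoParName()
workers_par <- getDoParWorkers()
print(c(name_par, workers_par))

# Shut the cluster down and fall back to the sequential backend
stopCluster(cl)
registerDoSEQ()

name_seq <- getDoParName()
workers_seq <- getDoParWorkers()
print(c(name_seq, workers_seq))
```

Printing the backend name and worker count before and after makes it easier to tell whether an old cluster registration is still lingering.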