I'm trying to parallelize this example.
I have a bunch of rasters that I am trying to aggregate by week of the year. Here is what this looks like in series:
# create a raster stack from list of GeoTiffs
tifs <- list.files(path = "./inputData/", pattern = "\\.tif$", full.names = TRUE)
r <- stack(tifs)
# get the date from the names of the layers and extract the week
indices <- format(as.Date(names(r), format = "X%Y.%m.%d"), format = "%U")
indices <- as.numeric(indices)
# calculate weekly means
r_week <- stackApply(r, indices, function(x) mean(x, na.rm = TRUE))
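The week index can be checked on its own, without any rasters. This is a minimal sketch using made-up layer names in the X%Y.%m.%d pattern that the code above assumes. %U counts weeks starting on Sunday, so days before the first Sunday of the year fall in week 0.

```r
# Hypothetical layer names, in the "X%Y.%m.%d" pattern assumed above
layer_names <- c("X2019.01.01", "X2019.01.06", "X2019.01.15")

# Parse the dates, then format them as week of the year (%U: weeks start on Sunday)
layer_dates <- as.Date(layer_names, format = "X%Y.%m.%d")
week_index <- as.numeric(format(layer_dates, format = "%U"))

print(week_index)
```

Layers that share a value in week_index are the ones stackApply() averages together.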
This is my attempt at parallelization using snow and pbapply.
# aggregate rasters in parallel
no_cores <- parallel::detectCores() - 1
tryCatch({
  cl <- snow::makeCluster(no_cores, "SOCK")
  snow::clusterEvalQ(cl, {
    require(pacman)
    p_load(dplyr,
           rts,
           raster,
           stringr,
           pbapply,
           parallel)
  })
  parallel::clusterExport(cl = cl, varlist = list("r", "indices"))
  r_week <- pbapply::pbsapply(r, indices, stackApply(r, indices, function(x) mean(x, na.rm = TRUE)), simplify = TRUE, USE.NAMES = TRUE, cl = cl)
  snow::stopCluster(cl)
}, error = function(e) {
  snow::stopCluster(cl)
  return(e)
}, finally = {
  try(snow::stopCluster(cl), silent = T)
})
The stackApply() function does not take a cluster argument, so I'm trying to wrap it in pbsapply(). This returns the following error:
<simpleError in get(as.character(FUN), mode = "function", envir = envir): object 'indices' of mode 'function' was not found>
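The message looks like what base R's match.fun() produces when the argument in the FUN position is not a function. pbsapply() follows the sapply(X, FUN, ...) argument order, so in the call above indices sits where FUN is expected. Here is a minimal base R sketch that should reproduce the same kind of message with plain sapply(). The vector is a made-up stand-in, not the real data.

```r
# Stand-in for the week indices; not the real data
indices <- c(1, 1, 2)

# The second positional argument of sapply() is FUN, so `indices` is
# looked up as a function and the lookup fails
msg <- tryCatch(
  sapply(1:3, indices),
  error = function(e) conditionMessage(e)
)
print(msg)
```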
Answer 0 (score: 1)
I think I found a workaround using raster::clusterR(). It does not provide a progress bar. It would be nice to see how someone would do this with snow and pbapply.
tryCatch({
  system.time({
    no_cores <- parallel::detectCores() - 1
    raster::beginCluster(no_cores)
    # per-cell mean of the layers in each week, ignoring NA values
    myFun <- function(x, ...) {
      mean(x, na.rm = TRUE)
    }
    r_week <- raster::clusterR(r, stackApply, args = list(indices = indices, fun = myFun, na.rm = TRUE))
    raster::endCluster()
  })
}, error = function(e) {
  raster::endCluster()
  return(e)
}, finally = {
  try(raster::endCluster())
})
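One way to approach the snow/pbapply variant is to parallelize over the weeks instead of wrapping stackApply() itself: each worker averages the layers of one week. The sketch below only illustrates that shape in base R. It uses a plain array as a stand-in for the raster stack, and week_mean(), layers and the index vector are all made up for the example. With real rasters, the worker function would have to subset the stack and average it, and every worker would need the raster package loaded. pbapply::pblapply(X, FUN, ..., cl = cl) takes the same kind of arguments as parLapply() and should add the progress bar, but that part is untested here.

```r
library(parallel)

# Stand-in for the raster stack: a rows x cols x layers array (made-up data)
set.seed(1)
layers <- array(runif(4 * 5 * 6), dim = c(4, 5, 6))
layers[1, 1, 1] <- NA
indices <- c(1, 1, 1, 2, 2, 2)  # hypothetical week index for each layer

# Cell-by-cell mean over the layers of one week, ignoring NA values
week_mean <- function(w, layers, indices) {
  sel <- layers[, , indices == w, drop = FALSE]
  apply(sel, c(1, 2), mean, na.rm = TRUE)
}

weeks <- sort(unique(indices))
cl <- makeCluster(2)

# One task per week; the cluster is stopped even if a worker fails
weekly <- tryCatch(
  parLapply(cl, weeks, week_mean, layers = layers, indices = indices),
  finally = stopCluster(cl)
)
names(weekly) <- paste0("week_", weeks)
```

Passing layers and indices as arguments avoids a separate clusterExport() call, at the cost of shipping the data to the workers with each chunk of tasks.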
Answer 1 (score: 0)
Try adding progress='text' to the stackApply arguments. It works fine in the non-parallel version. Good luck!