I have a large RasterStack (s) with the following details:
class : RasterStack
dimensions : 510, 1068, 544680, 19358 (nrow, ncol, ncell, nlayers)
resolution : 0.08333333, 0.08333333 (x, y)
extent : -141, -52, 41, 83.5 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0
names : Jan.1961.1, Jan.1961.2, Jan.1961.3, Jan.1961.4, Jan.1961.5, Jan.1961.6, Jan.1961.7, Jan.1961.8, Jan.1961.9, Jan.1961.10, Jan.1961.11, Jan.1961.12, Jan.1961.13, Jan.1961.14, Jan.1961.15, ...
time : 1961-01-01 - 2013-12-31 (range)
Doing something like this:
writeRaster(s, "PP", overwrite = TRUE, format = "CDF", varname = "P", varunit = "mm",
            longname = "totals", xname = "lon", yname = "lat", zname = "time",
            zunit = "numeric")
takes more than 2 weeks to complete on my machine. How can I run it in parallel (for example with a foreach loop and the %dopar% operator) to get the same result in less processing time?
Sample data
s <- brick(nrows = 510, ncols = 1068, xmn = -180, xmx = 180, ymn = -90, ymx = 90,
           crs = "+proj=longlat +datum=WGS84", nl = 19358)
dates <- seq(as.Date("1961-01-01"), as.Date("2013-12-31"), by = "day")
s <- setZ(s, dates)
Note: my real data is not a brick.
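For reference, a minimal sketch of how such a stack might be assembled from individual files (the folder path and file pattern below are hypothetical):

library(raster)
# hypothetical folder of daily grids; adjust the path and pattern to the real data
files <- list.files("path/to/daily_grids", pattern = "\\.tif$", full.names = TRUE)
s <- stack(files)
s <- setZ(s, seq(as.Date("1961-01-01"), as.Date("2013-12-31"), by = "day"))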
Answer 0 (score: 2)
You can try this code, but I have not really tested it on a large dataset. I did not test the ncecat part... I will update it later, but you can try it in the meantime.
library(raster)
library(foreach)
library(doParallel)

wd <- "~/Bureau/Tmp"

# small test stack with 16 layers (use 19358 for the real data)
nl <- 16
s <- brick(nrows = 510, ncols = 1068,
           xmn = -180, xmx = 180, ymn = -90, ymx = 90,
           crs = "+proj=longlat +datum=WGS84",
           nl = nl)
dates <- seq(as.Date("1961-01-01"), as.Date("2013-12-31"), by = "day")
s <- setZ(s, dates[1:nl])  # setZ needs exactly one date per layer

# write each layer to its own NetCDF file in parallel
cl <- makeCluster(4)
registerDoParallel(cl)
tmp <- foreach(i = 1:nlayers(s)) %dopar% {
  r <- raster::raster(s, i)
  raster::writeRaster(r,
                      filename = paste0(wd, "/PP_", formatC(i, width = 6, flag = "0")),
                      overwrite = TRUE, format = "CDF", varname = "P", varunit = "mm",
                      longname = "totals", xname = "lon", yname = "lat", zname = "time",
                      zunit = "numeric")
  rm(r)
}
stopCluster(cl)
# collect the per-layer files and concatenate them with NCO's ncecat
ppfiles <- list.files(wd, pattern = "PP_", full.names = TRUE)
system(paste("ncecat", paste(ppfiles, collapse = " "), file.path(wd, "output.nc")))
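If the ncecat step works as intended, the combined file can be read back into R to check it; a minimal sketch (assuming the concatenated file is output.nc with variable P as above, and that the ncdf4 package is installed):

# read the concatenated NetCDF back in and check the layer count
b <- raster::brick(file.path(wd, "output.nc"), varname = "P")
nlayers(b)  # should match the number of per-layer files written above

Note that ncecat stacks the inputs along a new record dimension; if a single time dimension is wanted instead, ncrcat (which concatenates along an existing record dimension) may be the better fit, but neither option was tested here.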