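I have been using the extract function from the raster package to extract data from raster files, using areas defined by shapefiles. However, I am running into problems with the amount of memory this process now requires. I have a large number of shapefiles (~1000), and the raster file is large (~1.6 GB).
My process is: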
shp <- mclapply(list.files(pattern="*.shp",full.names=TRUE), readShapePoly,mc.cores=6)
ndvi <- raster("NDVI.dat")
mc<- function(y) {
temp <- gUnionCascaded(y)
extract <- extract(ndvi,temp)
mean <- range(extract, na.rm=T )[1:2]
leng <- length(output)
}
output <- lapply(shp, mc)
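Is there anything I can do to reduce the memory load? I tried loading fewer shapefiles, which worked for about 5 minutes before memory usage spiked again. The machine is a quad-core 2.4 GHz computer with 8 GB of RAM.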
Answer 0 (score: 4)
I would do it like this (untested):
## Clearly we need these packages, and their dependencies
library(raster)
library(rgeos)
library(maptools)  ## for readShapePoly(), as used in the question
## list.files() takes a regular expression, not a glob
shpfiles <- list.files(pattern = "\\.shp$", full.names = TRUE)
ndvi <- raster("NDVI.dat")
## initialize an object to store the results for each shpfile
res <- vector("list", length(shpfiles))
names(res) <- shpfiles
## loop over files, reading one shapefile at a time
for (i in seq_along(shpfiles)) {
    ## read this shapefile only
    shp <- readShapePoly(shpfiles[i])
    ## do the union
    temp <- gUnionCascaded(shp)
    ## extract for this shape data (and don't call it "extract")
    extracted <- extract(ndvi, temp)
    ## further processing, save result; extract() returns a list for polygons
    rng <- range(unlist(extracted), na.rm = TRUE)
    res[[i]] <- rng ## plus whatever else you need
}
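It is not clear what the return value of mc() above is meant to be, so I have ignored it. Because only one shapefile is held in memory at a time, this should be far more memory efficient, and quicker, than what you tried originally. I doubt that using anything parallel is going to be worth it here.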
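To make the pattern easier to try without the real data, here is a minimal sketch of the same loop on synthetic inputs. Everything in it is an assumption for illustration: the 10 x 10 raster stands in for NDVI.dat, the two squares built by the hypothetical helper make_square() stand in for shapefiles, and the union step is left out because each stand-in is a single polygon. It only shows that extract() returns a list for polygons and that keeping a small summary per shape, rather than the extracted values, is what keeps memory use low.

```r
library(raster)  # attaches sp as a dependency

# Synthetic 10 x 10 raster standing in for NDVI.dat (an assumption)
r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(r) <- 1:100

# Hypothetical helper: one square polygon sharing the raster's CRS
make_square <- function(x0, y0, size, id, crs_obj) {
  coords <- cbind(c(x0, x0 + size, x0 + size, x0, x0),
                  c(y0, y0, y0 + size, y0 + size, y0))
  SpatialPolygons(list(Polygons(list(Polygon(coords)), ID = id)),
                  proj4string = crs_obj)
}

# Two stand-ins for shapefiles
shapes <- list(a = make_square(0, 0, 2, "a", crs(r)),
               b = make_square(5, 5, 3, "b", crs(r)))

# Preallocate the result list, then handle one shape at a time
res <- vector("list", length(shapes))
names(res) <- names(shapes)
for (i in seq_along(shapes)) {
  extracted <- extract(r, shapes[[i]])  # list: one numeric vector per polygon
  vals <- unlist(extracted)
  rng <- range(vals, na.rm = TRUE)
  # keep only a small summary, not the extracted cell values
  res[[i]] <- c(min = rng[1], max = rng[2], n = length(vals))
  rm(extracted, vals)                   # drop the large intermediates
}
res
```

With real data, the body of the loop would read one shapefile and union it first, as in the code above; the rm() call is optional, since the intermediates are overwritten on the next iteration anyway.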