Memory (RAM) issues using intersect from the raster package

Time: 2018-01-08 13:53:51

Tags: r r-raster sp sf

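I am unable to compute the intersection between two large SpatialPolygonsDataFrame objects in R. My polygon data represent buildings and administrative boundaries, and I am trying to get the polygons where they intersect.

As far as I know, the intersect function from the raster package and gIntersection from the rgeos package can both do this job (with some differences), but neither can handle all of my polygons at once (about 50,000 polygons/entities).

For this reason I have to split the computation into a loop and save the result of each step. The problem is that these functions keep filling up my physical memory and I cannot free it. I tried rm() and gc(), but they made no difference. The memory problem crashes my R session, so I cannot complete the computation.

Is there a way to free RAM inside the loop while the simulation is running? Or a way to avoid this memory problem altogether?

Here is a reproducible example using random polygons.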

library(raster)
library(sp)
library(rgeos)

#Generating 50000 points (for smaller polygons) and 150000 (for larger polygons) in a square of side 100000
size=100000

Nb_points1=50000
Nb_points2=150000
start_point=matrix(c(sample(x = 1:size,size = Nb_points1,replace = T),sample(x = 1:size,size = Nb_points1,replace = T)),ncol=2)
start_point2=matrix(c(sample(x = 1:size,size = Nb_points2,replace = T),sample(x = 1:size,size = Nb_points2,replace = T)),ncol=2)

#Defining different sides length
radius=sample(x = 1:50,size = Nb_points1,replace = T)
radius2=sample(x = 1:150,size = Nb_points2,replace = T)

#Generating list of polygons coordinates
coords=list()
for(y in 1:Nb_points1){
  xmin=max(0,start_point[y,1]-radius[y])
  xmax=min(size,start_point[y,1]+radius[y])
  ymin=max(0,start_point[y,2]-radius[y])
  ymax=min(size,start_point[y,2]+radius[y])
  coords[[y]]=matrix(c(xmin,xmin,xmax,xmax,ymin,ymax,ymax,ymin),ncol=2)
}

coords2=list()
for(y in 1:Nb_points2){
  xmin=max(0,start_point2[y,1]-radius2[y])
  xmax=min(size,start_point2[y,1]+radius2[y])
  ymin=max(0,start_point2[y,2]-radius2[y])
  ymax=min(size,start_point2[y,2]+radius2[y])
  coords2[[y]]=matrix(c(xmin,xmin,xmax,xmax,ymin,ymax,ymax,ymin),ncol=2)
}

#Generating 50000 (smaller) and 150000 (larger) polygons
Poly=SpatialPolygons(Srl = lapply(1:Nb_points1,function(y) Polygons(srl = list(Polygon(coords=coords[y],hole = F)),ID = y)),proj4string = CRS('+init=epsg:2154'))
Poly2=SpatialPolygons(Srl = lapply(1:Nb_points2,function(y)Polygons(srl =  list(Polygon(coords=coords2[y],hole = F)),ID = y)),proj4string = CRS('+init=epsg:2154'))

#Union of overlapping polygons
aaa=gUnionCascaded(Poly)
bbb=gUnionCascaded(Poly2)

aaa=disaggregate(aaa)
bbb=disaggregate(bbb)

intersection=gIntersects(spgeom1 = aaa,bbb,byid = T,returnDense = F)

#Loop on the intersect function
pb <- txtProgressBar(min = 0, max = ceiling(length(aaa)/1000), style = 3)

for(j in 1:ceiling(length(aaa)/1000)){
  tmp_aaa=aaa[((j-1)*1000+1):(j*1000),]
  tmp_bbb=bbb[unique(unlist(intersection[((j-1)*1000+1):(j*1000)])),]
  List_inter=intersect(tmp_aaa,tmp_bbb)
  gc()
  gc()
  gc()
  setTxtProgressBar(pb, j)
}

Thanks!

2 Answers:

Answer 0 (score: 2)

You could consider using the st_intersects and st_intersection functions of the sf package. For example:

aaa2 <- sf::st_as_sf(aaa)
bbb2 <- sf::st_as_sf(bbb)
intersections_mat <- sf::st_intersects(aaa2, bbb2)
intersections <- list()
for (int in seq_along(intersections_mat)){
  if (length(intersections_mat[[int]]) != 0){
    intersections[[int]] <- sf::st_intersection(aaa2[int,], bbb2[intersections_mat[[int]],])
  }
}

will give you an intersections_mat whose length equals that of aaa and which contains, for each feature of aaa, the "indexes" of the elements of bbb it intersects ("empty" if no intersection is found):

> intersections_mat
Sparse geometry binary predicate list of length 48503, where the predicate was `intersects'
first 10 elements:
 1: 562
 2: (empty)
 3: 571
 4: 731
 5: (empty)
 6: (empty)
 7: (empty)
 8: 589
 9: 715
 10: (empty)

and an intersections list containing the intersecting polygons:

> head(intersections)
[[1]]
Simple feature collection with 1 feature and 0 fields
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: 98873 ymin: 33 xmax: 98946 ymax: 98
epsg (SRID):    2154
proj4string:    +proj=lcc +lat_1=49 +lat_2=44 +lat_0=46.5 +lon_0=3 +x_0=700000 +y_0=6600000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
                        geometry
1 POLYGON ((98873 33, 98873 9...

[[2]]
NULL

[[3]]
Simple feature collection with 1 feature and 0 fields
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: 11792 ymin: 3 xmax: 11806 ymax: 17
epsg (SRID):    2154
proj4string:    +proj=lcc +lat_1=49 +lat_2=44 +lat_0=46.5 +lon_0=3 +x_0=700000 +y_0=6600000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
                        geometry
1 POLYGON ((11792 3, 11792 17...

(i.e., intersections[[1]] is the intersection between polygon 1 of aaa and polygon 562 of bbb)
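
The list built this way keeps a NULL slot for every feature without a match, so it is not yet a single layer. One possible way to collapse it is to drop the NULL entries and rbind the rest. The sketch below assumes toy data: the square() helper and the two small layers a and b are made up for illustration and stand in for aaa2 and bbb2.

```r
library(sf)

# Hypothetical helper: a closed square with lower-left corner (x0, y0).
square <- function(x0, y0, side = 2) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + side, y0), c(x0 + side, y0 + side),
                        c(x0, y0 + side), c(x0, y0))))
}

# Toy layers (assumption): only the first square of `a` overlaps anything in `b`.
a <- st_sf(geometry = st_sfc(square(0, 0), square(10, 10), crs = 2154))
b <- st_sf(geometry = st_sfc(square(1, 1), square(20, 20), crs = 2154))

# Same pattern as in the answer: find candidates first, then intersect them.
hits <- st_intersects(a, b)
pieces <- list()
for (i in seq_along(hits)) {
  if (length(hits[[i]]) != 0) {
    pieces[[i]] <- st_intersection(a[i, ], b[hits[[i]], ])
  }
}

# Drop the NULL slots left by features with no intersection, then stack
# the remaining sf objects into one.
pieces <- Filter(Negate(is.null), pieces)
result <- do.call(rbind, pieces)
```

With the real data, the same last two lines would be applied to the intersections list.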

HTH.

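Answer 1 (score: 1)

The example works fine for me (8 GB RAM) after a few changes to the loop; see below. The changes have nothing to do with memory use: your loop was not storing its results. The last chunk is also clamped to length(aaa) so that it does not index past the end.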

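# One list slot per chunk of 1000 polygons
List_inter <- list()

for(j in 1:ceiling(length(aaa)/1000)){
    begin <- (j-1) * 1000 + 1
    # Clamp the last chunk to the number of polygons in aaa
    end <- min((j*1000), length(aaa))
    tmp_aaa <- aaa[begin:end,]
    # Keep only the bbb polygons that intersect something in this chunk
    tmp_bbb <- bbb[unique(unlist(intersection[begin:end])),]
    List_inter[[j]] <- intersect(tmp_aaa,tmp_bbb)
    cat(j, "\n"); flush.console()
}

# Combine the per-chunk results into one object
x <- do.call(bind, List_inter)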
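
The begin/end arithmetic in the loop can also be written once, up front, with base R. This is only a sketch: chunk_indices is a hypothetical helper, not a function from raster or sp, and 2500 is an arbitrary stand-in for length(aaa).

```r
# Hypothetical helper: split 1..n into consecutive chunks of at most
# `chunk_size` indices. The last chunk is shorter when n is not a multiple.
chunk_indices <- function(n, chunk_size = 1000) {
  idx <- seq_len(n)
  unname(split(idx, ceiling(idx / chunk_size)))
}

chunks <- chunk_indices(2500, 1000)
lengths(chunks)       # size of each chunk
range(chunks[[3]])    # first and last index of the final chunk
```

Inside the loop, chunks[[j]] would then replace begin:end, for example aaa[chunks[[j]], ] and intersection[chunks[[j]]].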

Alternatively, you could write the intermediate results to disk inside the loop and deal with them later:

inters <- intersect(tmp_aaa,tmp_bbb)
# Either save the chunk as an R object ...
saveRDS(inters, paste0(j, '.rds'))

# ... or write it out as a shapefile
shapefile(inters, paste0(j, '.shp'))
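
Dealing with the saved chunks later could look roughly like the following. It is a sketch under assumptions: the two toy squares stand in for real per-chunk results, make_square is a made-up helper, and out_dir is a hypothetical folder (the answer's code writes to the working directory).

```r
library(sp)
library(raster)

# Hypothetical helper: a unit square as a one-feature SpatialPolygons object.
make_square <- function(x0, id) {
  m <- cbind(c(x0, x0, x0 + 1, x0 + 1, x0), c(0, 1, 1, 0, 0))
  SpatialPolygons(list(Polygons(list(Polygon(m)), ID = id)))
}

# Toy stand-ins (assumption) for the files written by saveRDS() in the loop.
out_dir <- file.path(tempdir(), "chunks")
dir.create(out_dir, showWarnings = FALSE)
saveRDS(make_square(0, "1"), file.path(out_dir, "1.rds"))
saveRDS(make_square(5, "2"), file.path(out_dir, "2.rds"))

# Later, possibly in a fresh R session: read every chunk back and combine.
files <- list.files(out_dir, pattern = "\\.rds$", full.names = TRUE)
parts <- lapply(files, readRDS)
# Skip chunks that were saved as NULL (no intersection found), if any.
parts <- Filter(Negate(is.null), parts)
combined <- do.call(bind, parts)
length(combined)   # number of polygons after combining
```

Because each chunk lives on disk, only one chunk needs to be in memory while the loop runs; the full result is assembled in a separate step.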