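I am trying to optimize a function that I will run on several rasters with hundreds of cells each, so I would like to parallelize it.
This is the initial raster: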
library(raster)
SPA <- raster(nrows=3, ncols=3, xmn = -10, xmx = -4, ymn = 4, ymx = 10)
values(SPA) <- c(0.1, 0.4, 0.6, 0, 0.2, 0.4, 0, 0.1, 0.2)
plot(SPA)
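The goal of the function is to get a data frame with the distances between all pairs of cells in the raster, with the columns from, to and dist.
To do this, I created a transition layer with the gdistance package: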
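As a rough sketch of the target layout only (no distances are computed here), the from/to pairs for a 3 x 3 raster can be enumerated in base R. The names n_cells and pairs are illustrative and are not part of the original code.

```r
# Illustrative only: enumerate every ordered pair of cells in a 9-cell raster
n_cells <- 9

# expand.grid varies its first column fastest, so "to" cycles within each "from"
pairs <- expand.grid(to = seq_len(n_cells), from = seq_len(n_cells))[, c("from", "to")]

# Placeholder column; the real values come from the cost-distance computation
pairs$dist <- NA_real_

nrow(pairs)  # one row per ordered pair, including each cell paired with itself
```

This gives the same row order as rep(1:n, each = n) for from and rep(1:n, n) for to.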
library(gdistance)
h16 <- transition(SPA, transitionFunction=function(x){1},16,symm=FALSE)
h16 <- geoCorrection(h16, scl=FALSE)
And the origin points (the centre coordinates of each cell):
B <- xyFromCell(SPA, cell = 1:ncell(SPA))
head(B)
x y
[1,] -9 9
[2,] -7 9
[3,] -5 9
[4,] -9 7
[5,] -7 7
[6,] -5 7
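With the help of some Stack Overflow answers, I wrote this function, which is faster than accCost in gdistance:
# Accumulated cost from the origin(s) in fromCoords to every cell of transition layer x
accCost2 <- function(x, fromCoords) {
  fromCells <- cellFromXY(x, fromCoords)
  tr <- transitionMatrix(x)
  # Append one extra row and column for a virtual start node
  # (rbind/cbind are used because Matrix's rBind/cBind helpers are deprecated)
  tr <- rbind(tr, rep(0, nrow(tr)))
  tr <- cbind(tr, rep(0, nrow(tr)))
  startNode <- nrow(tr)
  # Link the start node to the origin cells with infinite conductance (cost 1/Inf = 0)
  adjP <- cbind(rep(startNode, times = length(fromCells)), fromCells)
  tr[adjP] <- Inf
  adjacencyGraph <- graph.adjacency(tr, mode = "directed", weighted = TRUE)
  # Edge weights are conductances; invert them to get costs
  E(adjacencyGraph)$weight <- 1/E(adjacencyGraph)$weight
  return(shortest.paths(adjacencyGraph, v = startNode, mode = "out")[-startNode])
}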
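The idea behind accCost2 is to add a virtual start node joined to the origin cells at zero cost, then run a single shortest-path search from that node. Below is a minimal sketch of that idea on a toy three-node graph. It assumes igraph is installed (gdistance depends on it) and uses the current igraph names make_graph and distances rather than graph.adjacency and shortest.paths; the graph and its costs are invented for illustration.

```r
library(igraph)

# Toy directed graph: 1 -> 2 costs 2, 2 -> 3 costs 5.
# Node 4 is the virtual start node, joined to origin node 1 at zero cost.
g <- make_graph(c(1, 2,  2, 3,  4, 1), directed = TRUE)
E(g)$weight <- c(2, 5, 0)

# One search from the virtual node; drop its own entry, as accCost2 does
d <- distances(g, v = 4, mode = "out")[-4]
d
```

With several origins, each one would get its own zero-cost edge from the virtual node, and the result would be the cost to the nearest origin.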
Using apply, I get the data.frame I want:
connections <- data.frame(from = rep(1:nrow(B), each = nrow(B)),to = rep(1:nrow(B), nrow(B)), dist =as.vector(apply(B,1, accCost2, x = h16)))
head(connections)
from to dist
1 1 1 0.0
2 1 2 219915.7
3 1 3 439831.3
4 1 4 221191.8
5 1 5 312305.7
6 1 6 493316.1
I then tried to parallelize this with parRapply from the parallel package:
library("parallel")
cl = makeCluster(3)
clusterExport(cl, c("B", "h16", "accCost2"))
clusterEvalQ(cl, library(gdistance), library(raster))
connections <- data.frame(from = rep(1:nrow(B), each = nrow(B)),to = rep(1:nrow(B), nrow(B)), dist =as.vector(parRapply(cl, B,1, accCost2, x = h16)))
stopCluster(cl)
But I get the following error:
Error in x[i, , drop = FALSE] : object of type 'S4' is not subsettable
I am fairly new to parallelization, and I am not sure what I am doing wrong.
Answer 0 (score: 3)
Your code has several syntax problems.
This code works for me.
library("parallel")
accCost_wrap <- function(x){accCost2(h16,x)}
#Instead of including h16 in the parRapply function,
#just get it in the node environment
cl = makeCluster(3)
clusterExport(cl, c("h16", "accCost2"))
#B will be "sent" to the nodes through the parRapply function.
clusterEvalQ(cl, {library(gdistance)})
#raster is a dependency of gdistance, so no need to include raster here.
pp <- parRapply(cl, x=B, FUN=accCost_wrap)
stopCluster(cl)
connections <- data.frame(from = rep(1:nrow(B), each = nrow(B)),
to = rep(1:nrow(B), nrow(B)),
dist = as.vector(pp))
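One likely cause of the original error, sketched here with a plain matrix: parRapply has no MARGIN argument, and its own matrix argument is named x, so passing x = h16 hands the transition layer to parRapply itself rather than to the row function. The names row_fun, scale and m below are hypothetical and chosen only so that the extra argument does not clash with x.

```r
library(parallel)

# parRapply's own formal arguments; note that one of them is called "x"
names(formals(parRapply))

# Hypothetical row function whose extra argument is NOT named x
row_fun <- function(row, scale) sum(row) * scale

m <- matrix(1:6, nrow = 3)  # rows are (1, 4), (2, 5), (3, 6)

cl <- makeCluster(2)
# Extra arguments are forwarded to row_fun through "..."
res <- parRapply(cl, m, row_fun, scale = 10)
stopCluster(cl)

res
```

Wrapping the function, as accCost_wrap does above, avoids the name clash altogether because the worker function then takes a single argument.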
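Your version of accCost is indeed faster than the one in gdistance. Note that your version omits the check that your points lie within the extent of the transition layer, so use it with caution.
(You could make your function even faster by taking cell numbers as input. Also, sending a lot of data back from each node does not seem efficient.)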