Question

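I run a series of large simulations over a grid. I am implementing the simulations by row, and I have found that my sampling functions are a bottleneck. I tried to speed things up with the foreach and doMC libraries, but either the parallel version is slower or I cannot write a function that foreach interprets correctly.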

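Looking at some other posts, it appears that my approach with foreach may be misguided, because the number of jobs I attempt greatly exceeds the number of available processors. I was wondering whether people have suggestions on how best to implement parallelization in my situation. My simulations generally come in two types. In the first, I calculate a matrix that contains a sampling interval (the rows) for each element of the grid row I am processing. I then sample using runif (in the real simulations my rows contain ~9000 cells and I am performing 10000 simulations).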

#number of simulations per element 
n = 5

#Generate an example sampling interval.
m.int1 <- matrix ( seq ( 1, 20, 1 ), ncol=10, nrow=2 )

#Define a function to sample over the interval defined in m.int1
f.rand1 <- function(a) {
return ( runif ( n, a[1], a[2] ) )
}

#Run the simulation; each column corresponds to a row element and each row
#to a simulation.
sim1 <- round( apply ( m.int1, 2, f.rand1 ) )

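In the second case, I am trying to sample from a set of empirical distributions that are indexed by column in a matrix. The value of each grid-row element corresponds to the column to be sampled.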

#number of simulations per element 
n = 5

#generate a vector representing a row of grid values
v.int2 <- round(runif(10,1,3))

#define matrix of data that contains the distributions to be sampled.
m.samples<-cbind(rep(5,10),rep(4,10),rep(3,10))  

#Draw n values from the column of m.samples selected by a.
f.sample <- function(a) {
  return(sample(m.samples[, a], n))
}

#Sample m.samples indexed by column number.
sim2<- sapply(v.int2,f.sample)

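In the second example I was able to run in parallel using foreach() and %dopar%, but the simulation took substantially longer than the serial code. In the first example above, I could not write a proper function to take advantage of foreach parallelization. I include the code I used in the second case just to show my thinking, but I now realize that my approach carries too much overhead.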

library(foreach)
library(doMC)
registerDoMC(2)

n = 5

#Sample m.samples indexed by column number using parallel method.
sim2.par <- foreach ( i = 1 : length ( v.int2 ), 
    .combine="cbind") %dopar% sample ( 
     m.samples [ , v.int2 [i] ] , n )

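I would appreciate some ideas on an approach (and some code!) that would help me use parallelization efficiently. Again, the rows I am processing generally contain about 9000 elements and we run 10000 simulations per element, so my output simulation matrices are generally on the order of 10000 x 9000. Thanks for your help.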

Answer 1

Here is a slight improvement to your first simulation. A larger n may yield a greater gain in run time.

> n <- 1000
> m.int1 <- matrix ( seq ( 1, 20, 1 ), ncol=10, nrow=2 )
> f.rand1 <- function(a) {
+    return(runif(n, a[1], a[2]))
+ }
> system.time(x1 <- replicate(n, round(apply(m.int1, 2, f.rand1))))
   user  system elapsed 
   2.84    0.06    2.95 
> system.time(x2 <- replicate(n, matrix(round(runif(n*10, min = m.int1[1, ], max = m.int1[2, ])), ncol = 10, byrow = TRUE)))
   user  system elapsed 
   2.48    0.06    2.61 
> head(x1[,,1])
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    4    5    7   10   12   13   16   17    20
[2,]    1    3    6    7   10   11   13   16   17    19
[3,]    1    3    6    7   10   12   14   16   18    20
[4,]    2    4    5    7    9   12   14   16   17    19
[5,]    1    4    5    7   10   12   14   16   17    20
[6,]    1    4    6    8    9   11   13   15   18    20
> head(x2[,,1])
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    2    4    6    7    9   12   14   16   17    20
[2,]    1    3    6    8   10   12   14   15   18    20
[3,]    2    4    5    7    9   11   13   15   17    20
[4,]    2    3    5    7    9   11   14   15   17    19
[5,]    2    3    6    7    9   12   13   16   17    20
[6,]    2    4    6    7   10   12   14   16   17    20
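
If vectorizing alone is not enough, the foreach overhead described in the question might be reduced by sending each worker one block of columns instead of one task per element. The following is only a minimal sketch under that assumption, not a benchmarked solution: `n.workers`, `chunks`, and `sim1.par` are illustrative names, and `parallel::splitIndices` is used here simply to cut the column indices into contiguous blocks.

```r
library(foreach)
library(doMC)

n.workers <- 2            # assumed number of cores; adjust to the machine
registerDoMC(n.workers)

n <- 1000
m.int1 <- matrix(seq(1, 20, 1), ncol = 10, nrow = 2)

# One contiguous block of column indices per worker.
chunks <- parallel::splitIndices(ncol(m.int1), n.workers)

# Each worker makes a single vectorized runif() call for its block and
# returns an n x length(idx) matrix; cbind() restores the column order.
sim1.par <- foreach(idx = chunks, .combine = "cbind") %dopar% {
  matrix(round(runif(n * length(idx),
                     min = m.int1[1, idx],
                     max = m.int1[2, idx])),
         ncol = length(idx), byrow = TRUE)
}
```

With only as many tasks as workers, the per-task dispatch cost is paid a handful of times rather than once per element. Whether this beats the serial vectorized version would still need to be timed on the real 10000 x 9000 case.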

Answer 2

Try this instead of the two-step process. It skips the apply step:

f.rand2 <- function(a) {
  matrix(runif(n * ncol(a), rep(a[1, ], n), rep(a[2, ], n)), nrow = ncol(a))
}

f.rand2(m.int1)
           [,1]      [,2]      [,3]      [,4]      [,5]
 [1,]  1.693183  1.404336  1.067888  1.904476  1.161198
 [2,]  3.411118  3.852238  3.621822  3.969399  3.318809
 [3,]  5.966934  5.466153  5.624387  5.646181  5.347473
 [4,]  7.317181  7.106791  7.403022  7.442060  7.161711
 [5,]  9.491231  9.656023  9.518498  9.569379  9.812931
 [6,] 11.843074 11.594308 11.706276 11.744094 11.994256
 [7,] 13.375382 13.599407 13.416135 13.634053 13.539246
 [8,] 15.948597 15.532356 15.692132 15.442519 15.627716
 [9,] 17.856878 17.208313 17.804288 17.875288 17.232867
[10,] 19.214776 19.689534 19.732680 19.813718 19.866297
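
Note that f.rand2 returns one row per grid element and one column per simulation, which is the transpose of sim1 in the question, and it does not round. If the question's layout is wanted, a sketch along these lines should recover it (`sim1.alt` is an illustrative name; f.rand2 reads `n` from the enclosing environment, as in the original):

```r
n <- 5
m.int1 <- matrix(seq(1, 20, 1), ncol = 10, nrow = 2)

f.rand2 <- function(a) {
  matrix(runif(n * ncol(a), rep(a[1, ], n), rep(a[2, ], n)), nrow = ncol(a))
}

# Transpose so that rows are simulations and columns are grid elements,
# then round as in the question's sim1.
sim1.alt <- round(t(f.rand2(m.int1)))
dim(sim1.alt)  # 5 10
```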

For me, f.rand2 cut the time in half:

> system.time(x1 <- replicate(n, round(apply(m.int1, 2, f.rand1))))
   user  system elapsed 
  1.088   0.470   1.550 

> system.time(x1 <- replicate(n, f.rand2(m.int1)))
   user  system elapsed 
  0.559   0.256   0.811
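
The second simulation in the question (sampling from empirical distributions indexed by column) can probably be vectorized in a similar spirit. The sketch below draws all row indices in one call and picks the values out with matrix indexing. It rests on one loud assumption: it samples with replacement, whereas the question's sample() call samples without replacement by default, so it is only a substitute if that difference is acceptable. `rows`, `cols`, and `sim2.vec` are illustrative names.

```r
n <- 5
v.int2 <- round(runif(10, 1, 3))
m.samples <- cbind(rep(5, 10), rep(4, 10), rep(3, 10))

# ASSUMPTION: sampling WITH replacement (differs from the question's default).
# One random row index per draw, for all elements at once.
rows <- sample.int(nrow(m.samples), n * length(v.int2), replace = TRUE)

# Column to sample from, repeated n times per grid element.
cols <- rep(v.int2, each = n)

# Two-column matrix indexing picks m.samples[rows[k], cols[k]] for every k;
# filling by column gives one column per grid element, as in sim2.
sim2.vec <- matrix(m.samples[cbind(rows, cols)], nrow = n)
```

This replaces length(v.int2) separate sample() calls with a single sample.int() call, which is where most of the per-element cost in the sapply version goes.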

Parallelization of large simulations over a grid in R
