Problem running in parallel in R

Time: 2015-06-01 17:19:24

Tags: r parallel-processing

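I am trying to test parallel execution in R with the parallel package. In my example (code below), however, the parallel task takes longer than the single-process task. Can anyone give me some advice?

Thanks a lot!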

##parSquareNum.R
strt <- Sys.time()
workerFunc <- function(n) { return(n^2) }
values <- 1:1000000
library(parallel)
## Number of workers (R processes) to use:
cores <- detectCores()
## Set up the ’cluster’
cl <- makeCluster(cores-1)
## Parallel calculation (parLapply):
res <- parLapply(cl, values, workerFunc)
## Record the elapsed time (this includes library loading and cluster setup)
write(Sys.time()-strt, 'parallel.txt')
## Shut down cluster
stopCluster(cl)

##singleSquareNum.R

strt <- Sys.time()
## The worker function to do the calculation:
workerFunc <- function(n) { return(n^2) }
## The values to apply the calculation to:
values <- 1:1000000
## Serial calculation:
res <- lapply(values, workerFunc)
##print(unlist(res))
write(Sys.time() -strt, 'single.txt')

1 answer:

Answer 0 (score: 3)

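The main reason you are seeing this is that loading the library and creating the cluster take some time, and your timer includes both. Move strt <- Sys.time() to right before the res <- ... line so that only the computation is timed, and you will see a difference, especially if you increase the size of values.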

##parSquareNum.R
workerFunc <- function(n) { return(n^2) }
values <- 1:1000000
library(parallel)
## Number of workers (R processes) to use:
cores <- detectCores()
## Set up the ’cluster’
cl <- makeCluster(cores-1)
## Parallel calculation (parLapply):
strt <- Sys.time()
res <- parLapply(cl, values, workerFunc)
write(Sys.time()-strt, 'parallel.txt')
## Shut down cluster
stopCluster(cl)

##singleSquareNum.R

## The worker function to do the calculation:
workerFunc <- function(n) { return(n^2) }
## The values to apply the calculation to:
values <- 1:1000000
## Serial calculation:
strt <- Sys.time()
res <- lapply(values, workerFunc)
##print(unlist(res))
write(Sys.time() -strt, 'single.txt')

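When I run these, the parallel version takes 0.6941409 seconds and the single-process version takes 1.117002 seconds, a 1.6x speedup. I am running on an i7 chip.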
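To see how much of the total is cluster startup rather than computation, the two phases can be timed separately. The following is a minimal sketch, not part of the original scripts: it uses system.time() instead of Sys.time(), and the worker count of 2, the smaller values vector, and the variable names setup_time, par_time and ser_time are assumptions chosen for illustration. Actual timings will vary by machine.

```r
library(parallel)

workerFunc <- function(n) { return(n^2) }
values <- 1:100000  # smaller than in the scripts above, to keep the sketch quick

## Time cluster startup on its own (2 workers is an arbitrary choice)
setup_time <- system.time(cl <- makeCluster(2))["elapsed"]

## Time only the parallel computation
par_time <- system.time(res_par <- parLapply(cl, values, workerFunc))["elapsed"]

## Shut down cluster
stopCluster(cl)

## Time the serial computation for comparison
ser_time <- system.time(res_ser <- lapply(values, workerFunc))["elapsed"]

cat("cluster setup:   ", setup_time, "seconds\n")
cat("parallel compute:", par_time, "seconds\n")
cat("serial compute:  ", ser_time, "seconds\n")
```

If the setup time is comparable to or larger than the serial compute time, a timer that starts before makeCluster() will make the parallel script look slower even when the computation itself is faster.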