I put together a small benchmark to compare the speedup gained from using multiple workers for processing.
library(foreach)
library(doParallel)
f <- function(n) { # just a function that needs some time
  sample(n, 1000000, replace = TRUE)
  sqrt(n)
}
before = Sys.time()
y = foreach(i = 1:10000, .combine = c) %do% {
  f(i)
}
after = Sys.time()
after - before # Time difference of 1.764439 mins (105.86634s)
benchmark <- function(n) {
  # f is defined inside benchmark so that it is available on the workers
  f <- function(n) {
    sample(n, 1000000, replace = TRUE)
    sqrt(n)
  }
  cl <- makeCluster(n) # start n workers
  registerDoParallel(cl)
  before = Sys.time()
  y = foreach(i = 1:10000, .combine = c) %dopar% {
    f(i)
  }
  after = Sys.time()
  stopCluster(cl)
  after - before
}
benchmark(1) #Time difference of 1.49561 mins (89.7366s)
benchmark(2) #Time difference of 44.76028 secs
benchmark(3) #Time difference of 32.92803 secs
benchmark(4) #Time difference of 28.984 secs
benchmark(5) #Time difference of 27.46466 secs
benchmark(6) #Time difference of 26.48457 secs
benchmark(7) #Time difference of 26.48769 secs
benchmark(8) #Time difference of 29.07004 secs
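For reference, the timings above can be turned into speedup factors relative to the single-worker %dopar% run. This is only a minimal sketch: the elapsed values are copied from the comments above, and the variable names (`workers`, `elapsed`, `speedup`, `efficiency`) are illustrative, not part of the original benchmark.

```r
# Elapsed times in seconds, copied from the benchmark(1) .. benchmark(8) comments
workers <- 1:8
elapsed <- c(89.7366, 44.76028, 32.92803, 28.984,
             27.46466, 26.48457, 26.48769, 29.07004)

# Speedup relative to the single-worker %dopar% run
speedup <- elapsed[1] / elapsed

# Parallel efficiency: speedup divided by the number of workers
efficiency <- speedup / workers

print(data.frame(workers, elapsed,
                 speedup = round(speedup, 2),
                 efficiency = round(efficiency, 2)))
```

Dividing by the single-worker %dopar% time rather than the %do% time keeps the cluster setup identical across all rows of the table.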
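I expected the %dopar% processing time (with only one worker) to be similar to the %do% processing time. However, %do% takes about 16 seconds longer. Is this normal?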
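One way to check whether a gap like this is repeatable or just run-to-run noise is to time both variants several times. The following is a rough sketch under my own assumptions, not a definitive benchmark: `compare_once` is a hypothetical helper, and the sample size and iteration count are reduced from the original so that the repeats finish quickly.

```r
library(foreach)
library(doParallel)

# Hypothetical helper: time %do% and single-worker %dopar% on the same workload
compare_once <- function(iters = 200, n_workers = 1) {
  # Defined locally, as in benchmark(), so it is exported to the workers
  f <- function(n) {
    sample(n, 10000, replace = TRUE) # smaller than the original 1000000
    sqrt(n)
  }

  t_do <- system.time(
    foreach(i = 1:iters, .combine = c) %do% { f(i) }
  )[["elapsed"]]

  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl)) # stop the workers even if an error occurs
  registerDoParallel(cl)
  t_dopar <- system.time(
    foreach(i = 1:iters, .combine = c) %dopar% { f(i) }
  )[["elapsed"]]

  c(do = t_do, dopar = t_dopar)
}

# Repeat the comparison; each column holds one pair of timings in seconds
timings <- replicate(3, compare_once())
print(timings)
```

If the difference between the two rows keeps the same sign and a similar size across repeats, it is probably not just measurement noise.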