Question

When mclapply(X, FUN) encounters an error for some values of X, the error propagates to some (but not all) of the other values of X:

library(parallel)
# Fails only for x == 3; every other input is returned unchanged.
test <- function(x) if (x == 3) stop() else x
mclapply(1:3, test, mc.cores = 2)

#[[1]]
#[1] "Error in FUN(c(1L, 3L)[[2L]], ...[cut]
#
#[[2]]
#[1] 2
#
#[[3]]
#[1] "Error in FUN(c(1L, 3L)[[2L]], ... [cut]

#Warning message:
#In mclapply(1:3, test, mc.cores = 2) :
#  scheduled core 1 encountered error in user code, all values of the job will be affected

How can I prevent this from happening?

Answer 1

The trick is to set mc.preschedule = FALSE:

mclapply(1:3, test, mc.cores = 2, mc.preschedule = FALSE)
#[[1]]
#[1] 1

#[[2]]
#[1] 2

#[[3]]
#[1] "Error in FUN(X[[nexti]], ...[cut]
#Warning message:
#In mclapply(1:3, test, mc.cores = 2, mc.preschedule = FALSE) :
#  1 function calls resulted in an error

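This works because, by default, mclapply appears to divide X into mc.cores groups and to apply a vectorized version of FUN to each group. Consequently, if any member of a group produces an error, every value in that group reports the same error, while values in other groups are unaffected. In the question's output, the c(1L, 3L) in the error message shows that elements 1 and 3 were placed in the same group, which is why element 1 fails along with element 3 while element 2 does not.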
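The grouping behaviour can be imitated sequentially, without forking. The following is only a sketch: it assumes elements are dealt out to cores round-robin (which is consistent with the c(1L, 3L) in the error message, but is not taken from the mclapply source), and the names X, cores, chunks and chunk_results are made up for the illustration.

```r
# Same failing function as in the question.
test <- function(x) if (x == 3) stop() else x

X <- 1:3
cores <- 2

# ASSUMPTION: round-robin assignment of elements to cores.
# With 3 elements and 2 cores this gives group "1" = 1, 3 and group "2" = 2.
chunks <- split(X, rep_len(seq_len(cores), length(X)))

# Each group is processed as one unit, so a single failure loses the whole group.
chunk_results <- lapply(chunks, function(chunk) {
  try(lapply(chunk, test), silent = TRUE)
})

# Group "1" is a try-error as a whole; group "2" holds its normal result.
str(chunk_results, max.level = 1)
```

Because the error is caught at the level of the group rather than the element, the result for x = 1 is lost even though test(1) itself succeeds.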

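Setting mc.preschedule = FALSE has a downside too: it may make it impossible to reproduce a sequence of pseudo-random numbers in which the same job always receives the same numbers from the sequence. See the "Random numbers" section of ?mcparallel.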
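If keeping the default prescheduling matters, a possible alternative is to catch the error inside FUN, so that a failing element never aborts its group. This is a sketch rather than something the answer above proposes: safely is a hypothetical helper name, and the code assumes a Unix-alike system, since mclapply relies on forking.

```r
library(parallel)

# Same failing function as in the question.
test <- function(x) if (x == 3) stop() else x

# Hypothetical helper: wrap FUN so that an error is returned as a
# condition object instead of propagating out of the worker's loop.
safely <- function(FUN) {
  function(x, ...) tryCatch(FUN(x, ...), error = function(e) e)
}

# mc.preschedule keeps its default (TRUE); only the wrapped function differs.
res <- mclapply(1:3, safely(test), mc.cores = 2)

# Flag the elements whose result is an error condition.
failed <- vapply(res, inherits, logical(1), what = "error")
failed
```

Here res[[1]] and res[[2]] hold the ordinary values and only res[[3]] holds an error object, so the failures can be filtered or retried afterwards by indexing with failed.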

An error in one job makes mclapply contaminate other jobs
