%dopar% error after several iterations

Time: 2016-02-08 16:03:17

Tags: r domc

I have been trying to run a parallelized foreach loop in R. It runs fine for about ten iterations and then crashes with this error:

Error in { : task 7 failed - "missing value where TRUE/FALSE needed"
Calls: %dopar% -> <Anonymous>
Execution halted

I append the results of each loop to a file, and the file does show the expected output. My script is below; it uses the combn_sub function from this post:

library(data.table)  # provides fread()

LBRA <- fread(
 input      = "LBRA.012",
 data.table = FALSE)
str_bra <- nrow(LBRA)

br1sums <- colSums(LBRA)
b1non <- which(br1sums == 0)

LBRA_trim <- LBRA[,-b1non]

library(foreach)
library(doMC)
registerDoMC(28)

foreach(X = seq(2, (nrow(LBRA)-1))) %dopar% {
  com <- combn_sub(
   x    = nrow(LBRA),
   m    = X,
   nset = 1000)

  out_in <- matrix(
   ncol = 2,
   nrow = 1)
  colnames(out_in) <- c("SNPs", "k")

    for (A in seq(1, ncol(com))){
      rowselect <- com[, A]

      sub <- LBRA_trim[rowselect, ]
      subsum <- colSums(sub)

      length <- length(which(subsum != 0)) - 1
      out_in <- rbind(out_in, c(length, X))
    }

  write.table(
   file   = "plateau.csv",
   sep    = "\t",
   x      = out_in,
   append = TRUE)
}

2 Answers:

Answer 0 (score: 2)

I had a similar problem with my foreach call:

tmpcol <- foreach(j = idxs:idxe, .combine=cbind) %dopar% { imp(j) }

Error in { : task 18 failed - "missing value where TRUE/FALSE needed"

Changing the .errorhandling argument only makes foreach ignore the error:

tmpcol <- foreach(j = idxs:idxe, .combine=cbind, .errorhandling="pass") %dopar% { imp(j) }

Warning message:
In fun(accum, result.18) :
  number of rows of result is not a multiple of vector length (arg 2)

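I suggest running the function from your foreach call on its own for X = 7. In my case the problem was that my function imp(j) was throwing an error (for j = 18 it was computing a correlation on NA values), and foreach reported it only through this vague message.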
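One way to apply this advice is to run the tasks sequentially and catch each error together with its index. The sketch below uses only base R; `risky_step` and its data are hypothetical stand-ins for the real loop body, not code from the question. As far as I can tell, foreach numbers tasks by their position in the iterated sequence, so with `X = seq(2, ...)` task 7 would correspond to X = 8 rather than X = 7, which is worth checking before rerunning a single value.

```r
# Hypothetical stand-in for the loop body: the comparison yields NA for the
# third input, so if() stops with the same message as in the question.
risky_step <- function(j) {
  values <- c(3, 1, NA, 4)
  if (values[j] > 2) "large" else "small"
}

# Run every task sequentially; record the error message instead of aborting.
results <- lapply(seq_len(4), function(j) {
  tryCatch(
    risky_step(j),
    error = function(e) paste("task", j, "failed:", conditionMessage(e))
  )
})

# Indices of the tasks that raised an error.
failed <- which(vapply(results, function(r) grepl("failed", r), logical(1)))
print(results[failed])
```

Once the failing index is known, calling the loop body directly with that value gives a normal traceback instead of the condensed foreach message.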

Answer 1 (score: 1)

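As @Roland pointed out, writing to a file inside a foreach loop is a very bad idea. Even in append mode, the individual cores will try to write to the file at the same time and may clobber each other's output. Instead, capture the results of the foreach statement with the .combine="rbind" option, then write to the file after the loop: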

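# registerDoMC() takes a core count, not a cluster object, so a cluster
# created with makeCluster() is registered through doParallel instead.
library(doParallel)

cluster <- makeCluster(28, outfile = "MulticoreLogging.txt")
registerDoParallel(cluster)

foreach_outcome_table <- foreach(X = seq(2, (nrow(LBRA)-1)), .combine = "rbind") %dopar% {

  # Log which worker handles which iteration (written to the outfile above)
  cat(paste(Sys.info()[['nodename']], Sys.getpid(), sep = '-'), "now performing loop", X, "\n")

  # If combn_sub needs other packages, list them in foreach's .packages argument
  com <- combn_sub(x = nrow(LBRA), m = X, nset = 1000)

  # Start with zero rows so that no all-NA row ends up in the output
  out_in <- matrix(ncol = 2, nrow = 0)
  colnames(out_in) <- c("SNPs", "k")

  for (A in seq(1, ncol(com))){
    rowselect <- com[, A]

    sub <- LBRA_trim[rowselect, ]
    subsum <- colSums(sub)

    length <- length(which(subsum != 0)) - 1
    out_in <- rbind(out_in, c(length, X))
  }
  out_in  # the last value is what foreach collects for this task
}
stopCluster(cluster)

# Write once, from the master process, after all tasks have finished
write.table(file = "plateau.csv", sep = "\t", x = foreach_outcome_table, append = TRUE)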

In addition, you could replace the inner for loop with a nested foreach loop, which may be more efficient.
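A minimal sketch of what such a nested loop might look like, using foreach's `%:%` nesting operator. The toy matrix `geno`, the two inner repetitions, and the use of `sample()` in place of combn_sub are assumptions made for illustration; the count is also simplified and omits the `- 1` from the original code. It runs sequentially with `%do%`, so no backend is needed to try it.

```r
library(foreach)

# Toy stand-in for LBRA_trim: 4 samples (rows) by 3 SNP columns.
geno <- matrix(c(0, 1, 0,
                 1, 0, 0,
                 0, 0, 1,
                 1, 1, 0), nrow = 4, byrow = TRUE)

# Outer loop over subset sizes, inner loop over repetitions; %:% merges the
# two loops into a single stream of tasks, each returning one result row.
res <- foreach(X = 2:3, .combine = rbind) %:%
  foreach(A = 1:2, .combine = rbind) %do% {
    rows <- sample(nrow(geno), X)  # hypothetical replacement for combn_sub()
    sub_sums <- colSums(geno[rows, , drop = FALSE])
    c(SNPs = sum(sub_sums != 0), k = X)
  }

print(res)
```

Replacing `%do%` with `%dopar%` after registering a backend should distribute the merged tasks across workers. Whether that is actually faster depends on how expensive each inner iteration is compared with the scheduling overhead.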