Nested loops with doParallel and foreach in R, writing output continuously

Time: 2017-09-14 00:27:07

Tags: r foreach nested doparallel

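I am trying to compute the AUC for many combinations of gene expression. I have ~2000 upregulated genes and ~1500 downregulated genes. Because the computation takes a long time, I need to run my code in parallel. I tried modifying my code slightly (below), but the output is not what I want.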
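For a rough sense of scale, here is a small sketch of how many rows the loops would produce. It assumes exactly 2000 and 1500 genes (the counts above are approximate) and assumes the intended iteration is every consecutive pair of upregulated genes combined with every unordered pair of downregulated genes; the variable names are made up for this illustration.

```r
# Assumed sizes, rounded from the approximate counts in the question
n_pos <- 2000
n_neg <- 1500

# Consecutive upregulated pairs (i, i + 1): one fewer than the number of genes
pos_steps <- n_pos - 1

# Unordered downregulated pairs (j, k) with j < k
neg_pairs <- choose(n_neg, 2)

# Total number of score evaluations / output rows
total <- pos_steps * neg_pairs
print(format(total, big.mark = ","))
```

Under these assumptions that is a little over 2.2 billion rows, which is why I want to parallelize and write results out as I go.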

Desired output:

{pos.Gene1} {neg.gene1},{pos.gene2} {neg.gene2} x1 x2 x3 x4

{pos.Gene1} {neg.gene1},{pos.gene2} {neg.gene3} x1 x2 x3 x4

...

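Output I get:

{pos.Gene1} {neg.gene1},{pos.gene2} {neg.gene2} x1 x2 x3 x4 {pos.Gene2} {neg.gene4},{pos.gene3} {neg.gene4} x1 x2 x3 x4

Basically, it is not in the order I am looking for, and the format is incorrect.

I am new to parallel computing in R. Could you help me fix my code?

Thanks!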

library(foreach)
library(doParallel)

#setup parallel backend to use many processors
cores=detectCores()
cl <- makeCluster(cores[1]-1) #not to overload your computer
registerDoParallel(cl)

validation = readRDS("file with 5 lists each having sublists")

path = "XXX"
source(paste0(path,"Functions.R")) # importing functions to use in the script


genes = readRDS("list with 2 sublists")
pos.genes = genes$pos # list of upregulated genes
neg.genes = genes$neg # list of downregulated genes

mat = matrix(nrow = 1, ncol = 5) #creating empty matrix
fileConn = "validation_parallel.tsv" 

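# With %:% the foreach() calls are chained directly; only the innermost body goes in braces.
# i and j stop one short of the end because the body uses pos.genes[i+1] and k starts at j+1.
foreach(i = 1:(length(pos.genes) - 1)) %:%
  foreach(j = 1:(length(neg.genes) - 1)) %:%
  foreach(k = (j + 1):length(neg.genes)) %dopar% {

    # function that returns a matrix of 1 row and 4 columns having numbers
    score <- function(data) {
      ######
    }
    # assumption: the post never shows score() being called or on what data;
    # `validation` is used here only as a placeholder argument
    s <- score(validation)

    pair1 = paste0("{",c(pos.genes[i], pos.genes[i+1]),"}","{",c(neg.genes[j],neg.genes[k]),"}",collapse = ',')

    # for each iteration in k, calculating score and continuously writing output to a file
    mat[1,1] = pair1
    mat[1,2] = s[1,1]
    mat[1,3] = s[1,2]
    mat[1,4] = s[1,3]
    mat[1,5] = s[1,4]

    cat(mat,file = fileConn, sep = "\t", append = TRUE)
    cat(file = fileConn, sep = "\n", append = TRUE)
  }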
#stop cluster
stopCluster(cl)
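
One possible direction, offered as a minimal sketch and not a tested solution for the real data: when several workers append to the same file, rows land in whatever order the workers finish, and the second cat() call is given no objects to print, so it likely writes no line break at all. The sketch below instead lets foreach collect the rows (its default .inorder = TRUE returns results in loop order) and writes the file once at the end. The gene names and the score() body here are hypothetical stand-ins so the example runs on its own.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-ins for the real gene lists
pos.genes <- c("P1", "P2", "P3")
neg.genes <- c("N1", "N2", "N3")

# Hypothetical placeholder for score(): returns a 1 x 4 numeric matrix
score <- function(i, j, k) matrix(c(i, j, k, i + j + k), nrow = 1)

cl <- makeCluster(2)  # small fixed worker count, just for the sketch
registerDoParallel(cl)

# Each iteration returns one row; .combine = rbind stacks the rows in loop order
res <- foreach(i = 1:(length(pos.genes) - 1), .combine = rbind) %:%
  foreach(j = 1:(length(neg.genes) - 1), .combine = rbind) %:%
  foreach(k = (j + 1):length(neg.genes), .combine = rbind) %dopar% {
    pair <- paste0("{", c(pos.genes[i], pos.genes[i + 1]), "} ",
                   "{", c(neg.genes[j], neg.genes[k]), "}", collapse = ",")
    s <- score(i, j, k)
    data.frame(pair = pair, x1 = s[1, 1], x2 = s[1, 2],
               x3 = s[1, 3], x4 = s[1, 4], stringsAsFactors = FALSE)
  }

stopCluster(cl)

# Write all rows once, from the main process, one row per line
out_file <- "validation_parallel.tsv"
write.table(res, file = out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

With the full gene lists the combined result may be too large to hold in memory. In that case a variation would be to parallelize over chunks and write each chunk from the main process as it comes back, but that is beyond this sketch.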

0 answers:

No answers