Extending a for loop over multiple columns in R

Time: 2015-06-26 04:57:48

Tags: r

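I have a table (mydf), shown below. The for loop in my code works for a single column only (ALT1 in this example). I want it to loop over all columns from ALT1 to ALTn and store the output in separate variables, final1 to finaln. The aim is to take each of ALT1 to ALTn, match its value against the nucleotide columns (A, C, G, T, N), and retrieve the corresponding count, as shown in the result below. Thanks for your help!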

  

mycode

result <- merge(coverage.bam, rows.concat.alt, by="start")
final1 <- character(nrow(result))  # one output string per row

for(i in seq_len(nrow(result))){
  # "chr:start:end-REF(count),ALT1(count)"; the ALT1 count is left empty when it is NA
  final1[i] <- paste(paste(result$chr[i], result$start[i], result$end[i], sep=":"), "-",
                     result$REF[i], "(", result[,(as.character(result$REF[i]))][i], ")", ",", result$ALT1[i],
                     "(", result[,(as.character(result$ALT1[i]))][i][!is.na(result[,(as.character(result$ALT1[i]))][i])], ")", sep="")
}

final1

I tried to extend this ALT1 code to cover the columns through ALTn, but it doesn't work. Can you help me fix it?

final <- list()
setValue <- function(element){
  print(element)
  for(i in 1:nrow(result)){
    # note: this assignment changes a copy of `final` that is local to the function,
    # so the global `final` is still empty after the call
    final[[i]] = paste(paste(result$chr[i], result$start[i], result$end[i], sep=":"), "-",
                       result$REF[i], "(", result[,(as.character(result$REF[i]))][i], ")", ",", result[,element][i],
                       "(", result[,(as.character(result[,element][i]))][i][!is.na(result[,(as.character(result[,element][i]))][i])], ")", sep="")
  }
}
for(i in colnames(result)){
  if(grepl('ALT', i)){
    setValue(i)
  }
}
  

mydf

    chr     start       end  A  C  G  T  N  =  - REF ALT ALT1 ALT2 ALT3 ALTn
1 chr10 102022031 102022031 NA 34 NA NA NA NA NA   C   G    G   NA   NA   NA
2 chr10 102220574 102220574  2 22  2  3 NA NA NA   C AGT    A    G    T   NA
3 chr10 115322228 115322228 NA 25 NA NA NA NA NA   C   A    A   NA   NA   NA
4 chr10 122222925 122222925 30 NA NA NA NA NA NA   A   C    C   NA   NA   NA
5 chr10 121111042 121111042 NA 48 NA NA NA NA NA   C   T    T   NA   NA   NA
6 chr10 124444484 124444484 NA 60 NA NA NA NA NA   C   T    T   NA   NA   NA
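
To run the code in this thread, the table above can be rebuilt by hand. This is only a sketch typed in from the printed values. The column classes (integer counts, character nucleotides) and the use of `check.names = FALSE` to keep the `=` and `-` column names are assumptions, because the original object came from a `merge` whose inputs are not shown.

```r
# Rebuild the example table from the printed values (assumed column classes).
# check.names = FALSE keeps the "=" and "-" column names unchanged.
mydf <- data.frame(
  chr   = rep("chr10", 6),
  start = c(102022031L, 102220574L, 115322228L, 122222925L, 121111042L, 124444484L),
  end   = c(102022031L, 102220574L, 115322228L, 122222925L, 121111042L, 124444484L),
  A     = c(NA, 2L, NA, 30L, NA, NA),
  C     = c(34L, 22L, 25L, NA, 48L, 60L),
  G     = c(NA, 2L, NA, NA, NA, NA),
  T     = c(NA, 3L, NA, NA, NA, NA),
  N     = NA_integer_,
  `=`   = NA_integer_,
  `-`   = NA_integer_,
  REF   = c("C", "C", "C", "A", "C", "C"),
  ALT   = c("G", "AGT", "A", "C", "T", "T"),
  ALT1  = c("G", "A", "A", "C", "T", "T"),
  ALT2  = c(NA, "G", NA, NA, NA, NA),
  ALT3  = c(NA, "T", NA, NA, NA, NA),
  ALTn  = NA_character_,
  check.names = FALSE,
  stringsAsFactors = FALSE
)
```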
  

Result

[1] "chr10:102022031:102022031-C(34),G()"            "chr10:102220574:102220574-C(22),A(2),G(2),T(3)" "chr10:115322228:115322228-C(25),A()"
[4] "chr10:122222925:122222925-A(30),C()"            "chr10:121111042:121111042-C(48),T()"            "chr10:124444484:124444484-C(60),T()"

1 Answer:

Answer 0 (score: 1)

Try

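 # "chr:start:end" built from the first three columns
 p1 <- do.call(paste, c(mydf[1:3], sep=":"))
 # row-wise over the count columns A, C, G, T, N (4:8) and REF, ALT, ALT1..ALTn (11:16)
 p2 <- apply(mydf[c(4:8, 11:16)], 1, function(x) {
             # positions among A, C, G, T of the alleles in ALT, ALT1..ALTn (0 = no match, dropped when indexing)
             Un1 <- unique(match(x[7:11], names(x)[1:4], nomatch=0))
             # position of the column named by REF
             i1 <- match(x[6], names(x))
             v1 <- paste0(names(x[i1]), '(', x[i1], ')')
             # apply() passes each row as character; as.numeric() drops any padding, NA becomes ''
             v2 <- as.numeric(x[Un1])
             v2[is.na(v2)] <- ''
             v3 <- paste(names(x[Un1]), '(', v2, ')', sep='', collapse=",")
             paste(v1, v3, sep=",") })

 paste(p1, p2, sep="-")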
 #[1] "chr10:102022031:102022031-C(34),G()"           
 #[2] "chr10:102220574:102220574-C(22),A(2),G(2),T(3)"
 #[3] "chr10:115322228:115322228-C(25),A()"           
 #[4] "chr10:122222925:122222925-A(30),C()"           
 #[5] "chr10:121111042:121111042-C(48),T()"           
 #[6] "chr10:124444484:124444484-C(60),T()"
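
The question asks for one output per ALT column (final1 to finaln), so a more literal variant keeps a loop over the ALT columns and collects the results in a named list. This is a minimal sketch under stated assumptions, not the approach used in the answer above. The helper names `lookup_count` and `format_alt`, the two-row toy data frame (values copied from rows 1 and 2 of the table), and the `^ALT.+` pattern that picks ALT1..ALTn while skipping the combined `ALT` column are all introduced here for illustration.

```r
# Toy data with the same layout as mydf (rows 1-2 of the table, fewer columns).
mydf <- data.frame(
  chr = "chr10",
  start = c(102022031L, 102220574L),
  end = c(102022031L, 102220574L),
  A = c(NA, 2L), C = c(34L, 22L), G = c(NA, 2L), T = c(NA, 3L),
  REF = c("C", "C"),
  ALT1 = c("G", "A"), ALT2 = c(NA, "G"),
  stringsAsFactors = FALSE
)

nuc <- c("A", "C", "G", "T")  # count columns to look values up in

# For each row, fetch the count from the column named in `bases`.
# Returns "" where the base is NA/unknown or its count is NA.
lookup_count <- function(df, bases) {
  counts <- as.matrix(df[nuc])                        # integer matrix of counts
  idx <- cbind(seq_len(nrow(df)), match(bases, nuc))  # (row, column) pairs
  val <- counts[idx]                                  # an NA in idx yields NA
  ifelse(is.na(val), "", as.character(val))
}

# Build "chr:start:end-REF(count),ALT(count)" for one ALT column.
format_alt <- function(df, alt_col) {
  alt <- df[[alt_col]]
  out <- paste0(df$chr, ":", df$start, ":", df$end, "-",
                df$REF, "(", lookup_count(df, df$REF), "),",
                alt, "(", lookup_count(df, alt), ")")
  out[is.na(alt)] <- NA  # this row has no allele in this column
  out
}

# One list element per ALT column, named after the column.
alt_cols <- grep("^ALT.+", names(mydf), value = TRUE)
final <- lapply(setNames(alt_cols, alt_cols), function(col) format_alt(mydf, col))

final$ALT1
# [1] "chr10:102022031:102022031-C(34),G()"  "chr10:102220574:102220574-C(22),A(2)"
```

Here `final$ALT1` plays the role of `final1`, `final$ALT2` that of `final2`, and so on. A named list is usually easier to handle than a set of separately numbered variables, and returning the value from the function avoids the problem of assigning to a global object from inside it.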