R shortest.paths in a for loop

Time: 2015-01-06 06:27:46

Tags: r performance igraph shortest-path

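I need to call shortest.paths repeatedly, but my input is very large, and I would like to know how to make this fast. As far as I know, igraph is the best choice for this. My core code is as follows:

First, I have one real network and 1000 random networks, stored as a list: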

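library(igraph)
# RANDM is not defined in the question; it is presumably the path prefix of the data directory
# real ppi network: an edge list read into an undirected graph
REAL.PPI <- paste0(RANDM, "real.ppi.txt")
real.ppi <- graph.data.frame(read.table(REAL.PPI, header = FALSE), directed = FALSE)
ppi.gs <- V(real.ppi)$name
# random networks: one edge-list file per network, read into a list of graphs
random_net_names <- dir(paste0(RANDM, "randomnetwork"))
random_nets <- lapply(random_net_names, function(x){
  path <- paste(RANDM, "randomnetwork/", x, sep="")
  rn <- read.table(path, header = FALSE)
  graph.data.frame(rn, directed = FALSE)
})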

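Next, I have to compare the shortest paths among a set of nodes in the real network with those in the random networks. For this I use a for loop rather than an apply function, because the latter would not be any faster.

The input format is: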

hsa-let-7a-2-3p hsa-let-7a-3p   GO:0001702  4040    10818   4089
hsa-let-7a-2-3p hsa-let-7a-3p   GO:0001764  27185   2625    5048    429 6695
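To make the layout concrete, here is a minimal sketch of how one such record splits into fields. It assumes the fields are tab-separated (as the `strsplit(..., '\t')` call in the loop suggests) and that the first three fields are labels (apparently two miRNA names and a GO term) while the remaining fields are node names in the network. The variable names `line`, `fields`, `labels`, and `node_ids` are illustrative only.

```r
# Parse one record of the input format (fields assumed to be tab-separated).
line <- "hsa-let-7a-2-3p\thsa-let-7a-3p\tGO:0001702\t4040\t10818\t4089"

fields <- strsplit(line, "\t", fixed = TRUE)[[1]]

labels   <- fields[1:3]     # the three leading label fields
node_ids <- fields[-(1:3)]  # everything after them: node names to look up

print(labels)
print(node_ids)
```

Note that the node names stay as character strings, which is what igraph expects when vertices are selected by name.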

My core code is as follows:

# input
ovr <- strsplit(readLines("ovr.txt"), '\t')
# for-loop
# file() opens a writable connection; base R has no outfile()
OUT.final <- file("out.txt", "w")
for(i in 1 : length(ovr)){
  hyper.ppi.ovr <- ovr[[i]][- c(1, 2, 3)]
  # Sum the distance matrix so the real network yields one value comparable to each random sum
  D1 <- sum(shortest.paths(real.ppi, hyper.ppi.ovr, hyper.ppi.ovr))
  CPLs <- sapply(random_nets, function(r_net){
    sum(shortest.paths(r_net, hyper.ppi.ovr, hyper.ppi.ovr)) 
  }
  )
  D2.p <- sum(CPLs < D1) / 1000
  if(D2.p > .01)next
  # output
  out.value <- ovr[[i]]
  cat(out.value, sep = "\t", file = OUT.final)
  cat("\n", file = OUT.final)
}
close(OUT.final)
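One direction that might reduce the repeated work, offered only as a sketch: the loop above calls shortest.paths on all 1001 networks once per input line, but the networks never change. Assuming every node name in the input exists in every network, the distances among the union of all input nodes could be computed once per network and then subset by name for each line. The example below uses tiny toy graphs in place of the real files, and `name_graph`, `dist_among`, `set_sum`, `node_sets`, and the other variable names are hypothetical. It uses `distances()`, the name igraph 1.0 and later give to `shortest.paths()`.

```r
library(igraph)

# Toy stand-ins (hypothetical data, not the files from the question)
name_graph <- function(g) {
  V(g)$name <- as.character(seq_len(vcount(g)))
  g
}
real_net  <- name_graph(make_ring(10))
rand_nets <- list(name_graph(make_star(10, mode = "undirected")),
                  name_graph(make_full_graph(10)))
node_sets <- list(c("1", "2", "3"), c("4", "6", "9"))

# Every node that appears in at least one set
all_nodes <- unique(unlist(node_sets))

# One distance matrix per network, restricted to those nodes
dist_among <- function(g) distances(g, v = all_nodes, to = all_nodes)
real_dist  <- dist_among(real_net)
rand_dists <- lapply(rand_nets, dist_among)

# Per set, only subset the cached matrices by vertex name
set_sum <- function(d, nodes) sum(d[nodes, nodes])
p_values <- sapply(node_sets, function(nodes) {
  observed  <- set_sum(real_dist, nodes)
  simulated <- sapply(rand_dists, set_sum, nodes = nodes)
  mean(simulated < observed)  # fraction of random sums below the real sum
})
print(p_values)
```

The trade-off is memory: this keeps one matrix of size (number of distinct input nodes) squared for each network, so with 1000 random networks it is only practical when the union of input nodes is modest. If it is not, processing the random networks one at a time and accumulating the per-set counts would keep a single matrix in memory at once.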

0 answers:

No answers