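I have a task that calls the function shortest.paths repeatedly, but my input is too large, and I would like to know how to make it run fast. As far as I know, igraph is the best choice. My core code is as follows:
First, I have one real network and 1000 random networks, stored as a list: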
library(igraph)
# real PPI network (RANDM is assumed to be a path prefix defined elsewhere)
REAL.PPI <- paste0(RANDM, "real.ppi.txt")
real.ppi <- graph.data.frame(read.table(REAL.PPI, header = FALSE), directed = FALSE)
ppi.gs <- V(real.ppi)$name
# random networks: one edge-list file per network
random_net_names <- dir(paste0(RANDM, "randomnetwork"))
random_nets <- lapply(random_net_names, function(x) {
  path <- paste0(RANDM, "randomnetwork/", x)
  rn <- read.table(path, header = FALSE)
  graph.data.frame(rn, directed = FALSE)
})
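Then I need to compare the shortest paths within each node set in the real network against the same node set in the random networks. For this I chose a for loop rather than an apply function, because the latter would not be any faster.
The input format is: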
hsa-let-7a-2-3p hsa-let-7a-3p GO:0001702 4040 10818 4089
hsa-let-7a-2-3p hsa-let-7a-3p GO:0001764 27185 2625 5048 429 6695
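To make the record layout concrete, here is a minimal sketch of how one line splits into fields. The variable names (`mirna_pair`, `go_term`, `node_ids`) and the interpretation of the fields are assumptions for illustration only; the tab separator matches the `strsplit` call in the code that follows.

```r
# One record in the format shown above (tab-separated in the real file).
line <- "hsa-let-7a-2-3p\thsa-let-7a-3p\tGO:0001702\t4040\t10818\t4089"

fields <- strsplit(line, "\t", fixed = TRUE)[[1]]

mirna_pair <- fields[1:2]     # first two fields (assumed: a pair of miRNA names)
go_term    <- fields[3]       # third field (assumed: a GO term ID)
node_ids   <- fields[-(1:3)]  # remaining fields: the node names looked up in the networks
```

Only the fields from the fourth onward are passed to shortest.paths; the first three are carried through unchanged to the output.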
My core code is as follows:
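# input: one node set per line, tab-separated
ovr <- strsplit(readLines("ovr.txt"), "\t")
# for loop over node sets
OUT.final <- file("out.txt", "w")
for (i in seq_along(ovr)) {
  # drop the first three fields; the rest are node names
  hyper.ppi.ovr <- ovr[[i]][-c(1, 2, 3)]
  # total pairwise shortest-path length in the real network
  D1 <- sum(shortest.paths(real.ppi, hyper.ppi.ovr, hyper.ppi.ovr))
  # the same statistic in each random network
  CPLs <- sapply(random_nets, function(r_net) {
    sum(shortest.paths(r_net, hyper.ppi.ovr, hyper.ppi.ovr))
  })
  # fraction of the 1000 random networks with a smaller total than the real one
  D2.p <- sum(CPLs < D1) / 1000
  if (D2.p > .01) next
  # output: write only the lines that pass the threshold
  out.value <- ovr[[i]]
  cat(out.value, sep = "\t", file = OUT.final)
  cat("\n", file = OUT.final)
}
close(OUT.final)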
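For anyone who wants to reproduce the computation without my data files, here is a small self-contained sketch of the same statistic on a toy graph. Everything in it is an assumption for illustration: the ring graph, the degree-preserving rewiring used as a null model, the helper `path_sum`, and the use of `distances()` (the newer igraph name for shortest.paths). It is not my real setup, only the same comparison in miniature.

```r
library(igraph)
set.seed(1)

# Toy "real" network: a ring of 10 vertices named "1".."10" (illustrative only).
real_g <- make_ring(10)
V(real_g)$name <- as.character(1:10)

# Toy null model: 100 degree-preserving rewirings of the real network.
random_gs <- lapply(1:100, function(i) {
  rewire(real_g, keeping_degseq(niter = 50))
})

# Stands in for the node names taken from one input line.
node_set <- c("1", "2", "3")

# Sum of all pairwise shortest-path lengths within the node set.
# A disconnected pair contributes Inf, which makes the whole sum Inf.
path_sum <- function(g, nodes) sum(distances(g, v = nodes, to = nodes))

real_sum    <- path_sum(real_g, node_set)
random_sums <- vapply(random_gs, path_sum, numeric(1), nodes = node_set)

# Fraction of random networks whose sum is smaller than the real one.
p_value <- mean(random_sums < real_sum)
```

The cost in my real run comes from repeating the `path_sum` step once per random network for every input line, so that is the part I am hoping to speed up.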