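For every edge in my graph, I want to add a numeric attribute (weight) that is the product of an attribute (probability) of its incident vertices. I can do this by looping over the edges, that is: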
for (i in E(G)) {
  ind <- V(G)[inc(i)]
  p <- get.vertex.attribute(G, name = "prob", index = ind)
  E(G)[i]$weight <- prod(p)
}
However, this is slow for my graph (|V| ≈ 20,000 and |E| ≈ 200,000). Is there a faster way to do this?
Answer 0 (score: 5)
This is probably the fastest solution. The key is vectorization.
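To make the idea concrete, here is a minimal base-R sketch, independent of igraph and using made-up numbers. When the endpoints of all edges sit in a two-column matrix, indexing the probability vector by each column and multiplying gives every edge weight in one step. The names `prob`, `el`, `w` and `w_loop` are illustrative only.

```r
# Made-up vertex probabilities for a 3-vertex example.
prob <- c(0.5, 0.2, 0.1)

# Edge list as a two-column matrix of vertex ids: edges 1-2, 1-3, 2-3.
el <- rbind(c(1, 2), c(1, 3), c(2, 3))

# Vectorized: look up both endpoints of every edge at once and multiply.
w <- prob[el[, 1]] * prob[el[, 2]]

# Equivalent explicit loop, shown only for comparison.
w_loop <- numeric(nrow(el))
for (i in seq_len(nrow(el))) w_loop[i] <- prod(prob[el[i, ]])

# Both approaches should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(w, w_loop)))
```

The loop does one small lookup per edge, while the vectorized form does two large lookups in total, which is where the speedup comes from.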
library(igraph)
G <- graph.full(45)
set.seed(1)
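## NOTE: pnorm(vcount(G)) is a single number (it evaluates to 1) that is recycled
## to every vertex; use pnorm(rnorm(vcount(G))) for distinct per-vertex values.
V(G)$prob <- pnorm(vcount(G))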
## Original solution
system.time(
  for (i in E(G)) {
    ind <- V(G)[inc(i)]
    p <- get.vertex.attribute(G, name = "prob", index = ind)
    E(G)[i]$wt.1 <- prod(p)
  }
)
#>    user  system elapsed
#>   1.776   0.011   1.787
## sapply solution
system.time(
  E(G)$wt.2 <- sapply(E(G), function(e) prod(V(G)[inc(e)]$prob))
)
#>    user  system elapsed
#>   1.275   0.003   1.279
## vectorized solution
system.time({
  el <- get.edgelist(G)
  E(G)$wt.3 <- V(G)[el[, 1]]$prob * V(G)[el[, 2]]$prob
})
#>    user  system elapsed
#>   0.003   0.000   0.003
## are they the same?
identical(E(G)$wt.1, E(G)$wt.2)
#> [1] TRUE
identical(E(G)$wt.1, E(G)$wt.3)
#> [1] TRUE
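The vectorized solution appears to be roughly 500 times faster, although more and better measurements would be needed to pin that figure down.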
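One way to get a steadier measurement is to repeat the vectorized assignment several times on a graph closer to the size in the question. The following is a rough sketch, not a rigorous benchmark. It assumes igraph 1.0 or later (for `sample_gnm()`, `ends()` and `vertex_attr()`), uses a random graph as a stand-in for the real one, and the helper name `edge_prob_weights` is made up for illustration. Passing `names = FALSE` keeps the endpoint matrix numeric even if the graph has a `name` vertex attribute.

```r
library(igraph)

# Hypothetical helper: product of the "prob" attribute at each edge's two endpoints.
edge_prob_weights <- function(g, attr = "prob") {
  p  <- vertex_attr(g, attr)            # one value per vertex, in vertex-id order
  el <- ends(g, E(g), names = FALSE)    # |E| x 2 matrix of numeric vertex ids
  p[el[, 1]] * p[el[, 2]]               # element-wise product, one value per edge
}

set.seed(1)
# Random graph at roughly the size mentioned in the question
# (an assumption; the real graph's structure is unknown).
G <- sample_gnm(20000, 200000)
V(G)$prob <- pnorm(rnorm(vcount(G)))

# Repeat the assignment so the timing is less noisy than a single run.
timing <- system.time(
  for (k in 1:10) E(G)$weight <- edge_prob_weights(G)
)
print(timing[["elapsed"]] / 10)  # average seconds per run on this machine
```

The absolute numbers will vary with hardware and igraph version, so only the relative comparison against the loop is meaningful.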
Answer 1 (score: 3)
Converting my comment into an answer.
library(igraph)
# sample data - you should have provided this!!!
G <- graph.full(10)
set.seed(1)
V(G)$prob <- pnorm(rnorm(10))
length(E(G))
# for-loop
for (i in E(G)) {
  ind <- V(G)[inc(i)]
  p <- get.vertex.attribute(G, name = "prob", index = ind)
  E(G)[i]$wt.1 <- prod(p)
}
# sapply
E(G)$wt.2 <- sapply(E(G),function(e) prod(V(G)[inc(e)]$prob))
# are they the same?
identical(E(G)$wt.1, E(G)$wt.2)
With only 10 vertices and 45 edges, sapply(...) is about 4 times faster; with 100 vertices and ~5,000 edges, it is about 6 times faster.
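These ratios depend on the machine and the igraph version, so it may be worth re-measuring them. Below is a small sketch of how the two approaches could be timed at different graph sizes. The helper `time_both` is hypothetical, and `make_full_graph()` is assumed to be available (igraph 1.0 or later); with older versions `graph.full()` plays the same role.

```r
library(igraph)

# Hypothetical helper: elapsed seconds for the for-loop and the sapply version
# on a full graph with n vertices and random vertex probabilities.
time_both <- function(n) {
  g <- make_full_graph(n)
  V(g)$prob <- pnorm(rnorm(n))

  t_loop <- system.time(
    for (i in E(g)) {
      E(g)[i]$wt.1 <- prod(V(g)[inc(i)]$prob)
    }
  )[["elapsed"]]

  t_sapply <- system.time(
    E(g)$wt.2 <- sapply(E(g), function(e) prod(V(g)[inc(e)]$prob))
  )[["elapsed"]]

  c(n = n, loop = t_loop, sapply = t_sapply)
}

set.seed(1)
# One row per graph size; columns are n, loop time and sapply time in seconds.
print(t(sapply(c(10, 30), time_both)))
```

Timings for very small graphs are close to the resolution of `system.time()`, so larger sizes or repeated runs give more trustworthy ratios.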