Question

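I am working through an online example in which a virus spreads through a graph. The example is fine for small graphs, i.e. graphs with few edges and nodes. But I tried it on a much larger graph, with 10,000 nodes and 20,000 edges, and the function below is too slow at that size.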

My question is: how can I convert the following function into vectorized code so that it is optimized for large graphs?

spreadVirus <- function(G, Vinitial, Activation_probability) {
  # Precompute all outgoing graph adjacencies
  G$AdjList <- get.adjlist(G, mode = "out")

  # Initialize various graph attributes
  V(G)$color <- "blue"
  E(G)$color <- "black"
  V(G)[Vinitial]$color <- "yellow"

  # List to store the incremental graphs (for plotting later)
  Glist <- list(G)
  count <- 1

  # Spread the infection
  active <- Vinitial

  while (length(active) > 0) {
    new_infected <- NULL
    E(G)$color <- "black"

    for (v in active) {
      # Spread through the daily contacts of vertex v
      daily_contacts <- G$AdjList[[v]]
      E(G)[v %->% daily_contacts]$color <- "red"

      for (v1 in daily_contacts) {
        # new_color is not defined in the posted snippet; assumed here to be a
        # random draw that is "red" with probability Activation_probability
        new_color <- sample(c("red", "blue"), 1,
                            prob = c(Activation_probability, 1 - Activation_probability))
        if (V(G)[v1]$color == "blue" && new_color == "red") {
          V(G)[v1]$color <- "red"
          new_infected <- c(new_infected, v1)
        }
      }
    }

    # The next active set (needed for the update)
    active <- new_infected

    # Add graph to list (optional, depending on whether I want to plot)
    count <- count + 1
    Glist[[count]] <- G
  }
  return(Glist)
}

So, how can I optimize the function above for large graphs?

Thanks, Muna

Answer 1

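I don't know much about R's memory management, but I think the main reason for the slowness on larger graphs is that the graph object is copied in every loop iteration, which allocates a large chunk of memory each time. These are not even particularly large graphs: igraph happily handles graphs with millions of nodes/edges. You could consider keeping only a single graph object and creating a new vertex attribute at each step: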

spreadVirus <- function(G, Vinitial, Activation_probability) {
    # Precompute all outgoing graph adjacencies
    G$AdjList <- get.adjlist(G, mode = "out")

    # Initialize various graph attributes
    V(G)$step0 <- "blue"
    E(G)$color <- "black"
    V(G)[Vinitial]$step0 <- "yellow"

    # Spread the infection
    active <- Vinitial

    step <- 0
    while (length(active) > 0) {
        step <- step + 1
        new_infected <- NULL
        E(G)$color <- "black"
        # Start this step's attribute as a copy of the previous step's
        cur <- sprintf('step%d', step)
        vertex.attributes(G)[[cur]] <-
            vertex.attributes(G)[[sprintf('step%d', step - 1)]]

        for (v in active) {
            # spread through the daily contacts of vertex v
            daily_contacts <- G$AdjList[[v]]
            E(G)[v %->% daily_contacts]$color <- "red"

            for (v1 in daily_contacts) {
                # Only susceptible ("blue") vertices can be infected. The random
                # draw is an assumption: the question's snippet does not show
                # how Activation_probability is applied.
                if (vertex.attributes(G)[[cur]][v1] == "blue" &&
                    runif(1) < Activation_probability) {
                    vertex.attributes(G)[[cur]][v1] <- "red"
                    new_infected <- c(new_infected, v1)
                }
            }
        }

        # the next active set (needed for the update); this must happen
        # inside the while loop, otherwise the loop never terminates
        active <- new_infected
    }
    return(G)
}

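Afterwards, you can refer to the V(G)$step# attributes (step0, step1, ...) to get the infected nodes at each time point. For plotting, you can pass vertex.color = V(G)$step#. You can get the infected part of the graph with induced.subgraph(G, which(V(G)$step#=='red')).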
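
Since step# stands for a concrete attribute name such as step1, the lookup can also be done by name. The following is only a small sketch on a hand-made toy graph: the ring graph and the colors assigned to it are made up for illustration and are not output of the function above. It assumes a recent igraph where make_ring, vertex_attr and induced_subgraph are available, and it treats every non-"blue" vertex as infected so that the initial "yellow" vertex is included.

```r
library(igraph)

# Toy directed ring 1 -> 2 -> 3 -> 4 -> 1 with hand-written per-step colors
G <- make_ring(4, directed = TRUE)
V(G)$step0 <- c("yellow", "blue", "blue", "blue")
V(G)$step1 <- c("yellow", "red",  "blue", "blue")

# Look up the attribute of a given step by name (same as V(G)$step1 here)
step <- 1
state <- vertex_attr(G, sprintf("step%d", step))

# Vertices that are no longer susceptible at this step
infected_ids <- which(state != "blue")

# Subgraph induced by the infected vertices
infected_part <- induced_subgraph(G, infected_ids)

# For plotting, the per-step colors can be passed directly:
# plot(G, vertex.color = state)
```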

You can do something similar with the edge color attribute.
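
To get closer to the vectorized code the question asks for, the same idea can be pushed one step further: keep the infection state in plain vectors and process the whole active set at once instead of looping over vertices. What follows is a sketch under stated assumptions, not the original author's method. The function name spread_virus_vec and its arguments are hypothetical, the graph is passed as two integer vectors of edge endpoints, and each contact between an active and a susceptible vertex is assumed to transmit independently with probability p.

```r
# Vectorized spread on a directed edge list (base R only).
# from, to : integer vectors, edge i goes from[i] -> to[i]
# n        : number of vertices
# initial  : indices of the initially infected vertices
# p        : assumed per-contact transmission probability
# Returns, for each vertex, the step at which it was infected (NA if never).
spread_virus_vec <- function(from, to, n, initial, p) {
  infected <- logical(n)
  infected[initial] <- TRUE
  infected_at <- rep(NA_integer_, n)
  infected_at[initial] <- 0L
  active <- infected
  step <- 0L

  while (any(active)) {
    step <- step + 1L
    # Edges that leave an active vertex and point at a susceptible vertex
    hit <- active[from] & !infected[to]
    # One random draw per such edge decides whether it transmits
    hit[hit] <- runif(sum(hit)) < p
    new_infected <- unique(to[hit])

    infected[new_infected] <- TRUE
    infected_at[new_infected] <- step

    # Only the newly infected vertices are active in the next step
    active <- logical(n)
    active[new_infected] <- TRUE
  }
  infected_at
}

# Example: directed path 1 -> 2 -> 3 -> 4, certain transmission
path_steps <- spread_virus_vec(c(1L, 2L, 3L), c(2L, 3L, 4L), 4L, 1L, p = 1)
```

With igraph, the two endpoint vectors can be taken from the columns of as_edgelist(G, names = FALSE), and the colors for step k can be rebuilt afterwards as ifelse(!is.na(infected_at) & infected_at <= k, "red", "blue"), so no graph object has to be copied inside the loop.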
