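I am looking for an efficient way to find the neighborhoods of order exactly 2 for all nodes in a large graph. Even though it stores graphs as sparse matrices, igraph::ego
blows up: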
require(Matrix)
require(igraph)
require(ggplot2)
N <- 10^(1:5)
runtimes <- function(N) {
  g <- erdos.renyi.game(N, 1/N)
  system.time(ego(g, 2, mindist = 2))[3]
}
runtime <- sapply(N, runtimes)
qplot(log10(N), runtime, geom = "line")
Is there a more efficient way?
Answer 0 (score: 0)
Working directly with the adjacency matrix gives a significant improvement.
# sparse adjacency-matrix calculation of indirect neighbors -------------------
library(Matrix)
library(data.table)  # needed for as.data.table(), setkeyv() and the anti-join
diff_sparse_mat <- function(A, B) {
  # Difference between sparse matrices.
  # Input: sparse matrices A and B
  # Output: C = (A & !B), using element-wise diffing, treating B as logical
  stopifnot(identical(dim(A), dim(B)))
  # Triplet (i, j, x) representation of A, keyed by position
  A <- as(A, "generalMatrix")
  AT <- as.data.table(summary(as(A, "TsparseMatrix")))
  setkeyv(AT, c("i", "j"))
  # Same for B, after dropping explicitly stored zeros
  B <- drop0(B)
  B <- as(B, "generalMatrix")
  BT <- as.data.table(summary(as(B, "TsparseMatrix")))
  setkeyv(BT, c("i", "j"))
  # Anti-join: keep the entries of A whose (i, j) position is not set in B
  C <- AT[!BT]
  if (length(C) == 2) {
    # Only columns i and j: A was a pattern matrix without values
    return(sparseMatrix(i = C$i, j = C$j, dims = dim(A)))
  } else {
    return(sparseMatrix(i = C$i, j = C$j, x = C$x, dims = dim(A)))
  }
}
distance2_peers <- function(adj_mat) {
  # Returns a matrix of indirect neighbors, excluding the diagonal
  # Input: adjacency matrix A (assumed symmetric)
  # Output: (A %*% A & !A) with zero diagonal
  indirect <- forceSymmetric(adj_mat %*% adj_mat)
  indirect <- diff_sparse_mat(indirect, adj_mat)  # excl. direct neighbors
  indirect <- diff_sparse_mat(indirect, Diagonal(n = dim(indirect)[1]))  # excl. diag.
  return(indirect)
}
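One way to sanity-check the idea behind distance2_peers is to compare it with igraph::ego on a graph small enough to handle densely. The sketch below is only an illustration: the names reach2, from_matrix and from_ego are made up for this example, and it deliberately uses a dense base-R matrix, so it is not meant for large graphs.

```r
library(igraph)

set.seed(1)
g <- sample_gnp(50, 0.05)                 # small undirected random graph

# Dense adjacency matrix (fine for 50 nodes, not for large graphs)
A <- as.matrix(as_adjacency_matrix(g))
A2 <- A %*% A                             # A2[i, j] = number of length-2 walks i -> j

# Nodes at distance exactly 2: reachable in two steps, not adjacent, not i itself
reach2 <- (A2 > 0) & (A == 0)
diag(reach2) <- FALSE
from_matrix <- lapply(seq_len(nrow(reach2)), function(i) unname(which(reach2[i, ])))

# The same neighborhoods according to igraph
from_ego <- lapply(ego(g, order = 2, mindist = 2), function(v) sort(as.integer(v)))

identical(from_matrix, from_ego)
```

If both approaches agree, the final line returns TRUE for this graph.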
For the Erdős-Rényi example, networks of size 10^7 rather than 10^5 can now be analyzed within half a minute:
N <- 10 ^ (1:7)
runtimes <- function(N) {
  g <- erdos.renyi.game(N, 1 / N, directed = FALSE)
  system.time(distance2_peers(as_adjacency_matrix(g)))[3]
}
runtime <- sapply(N, runtimes)
qplot(log10(N), runtime, geom = "line")
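Entry (i, j) of the resulting matrix is the number of paths of length 2 from i to j, for pairs that are not directly connected; the diagonal (paths from i back to itself) is excluded.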
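As a small worked example of what those entries mean, consider a 4-cycle 1-2-3-4-1. The sketch below uses plain base-R dense matrices purely for illustration (the variable names are not part of the answer's code); the sparse functions above compute the same quantity at scale.

```r
# Adjacency matrix of the 4-cycle 1 - 2 - 3 - 4 - 1
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 1))
A <- matrix(0, 4, 4)
A[edges] <- 1            # edge i -> j
A[edges[, 2:1]] <- 1     # and j -> i, so A is symmetric

A2 <- A %*% A            # A2[i, j] = number of length-2 walks from i to j

indirect <- A2
indirect[A != 0] <- 0    # drop pairs that are direct neighbors
diag(indirect) <- 0      # drop walks that return to the start node

indirect
```

Nodes 1 and 3 are opposite corners of the cycle, joined by two length-2 paths (via 2 and via 4), so indirect[1, 3] is 2; the same holds for nodes 2 and 4, and every other entry is 0.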