Question

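I am working on graph analysis. I want to compute an N×N similarity matrix that contains the Adamic-Adar similarity between every pair of vertices. To give an overview of Adamic-Adar, let me start with this introduction: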

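Given the adjacency matrix A of an undirected graph G. CN is the set of all common neighbors of two vertices x and y. A common neighbor of two vertices is a vertex to which both of them have an edge/link, i.e. both vertices have a 1 in A in the column of that common neighbor. k_n is the degree of node n.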

Adamic-Adar is defined as follows:

$$AA(x, y) = \sum_{n \in CN(x, y)} \frac{1}{\log k_n}$$
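As a small worked illustration of the definition (the score of x and y is the sum of 1/log(k_n) over their common neighbors n), here is a sketch on a hypothetical 4-node graph. The graph, the helper name `adamic_adar_pair`, and the choice of the natural logarithm are assumptions made for the example only.

```python
import numpy as np

# Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 1-3, 2-3
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
])

def adamic_adar_pair(A, x, y):
    """Adamic-Adar score of vertices x and y, straight from the definition."""
    degrees = A.sum(axis=1)
    # common neighbors: vertices adjacent to both x and y
    common = np.flatnonzero((A[x] == 1) & (A[y] == 1))
    return sum(1.0 / np.log(degrees[n]) for n in common)

# Vertices 0 and 3 share the neighbors 1 and 2, each of degree 3
score_03 = adamic_adar_pair(A, 0, 3)
```

Here `score_03` equals 2/log(3), because both common neighbors have degree 3.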

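My attempt takes the two rows of A that belong to nodes x and y and sums them. It then looks for the elements whose value is 2, gets their degrees, and applies the equation. However, the computation takes a very long time. I tried it on a graph with 1032 vertices and it ran for more than 7 minutes before I cancelled it. So my question is: is there a better algorithm to compute this?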

Here is my code in Python:

def aa(graph):
    """
    Calculates the Adamic-Adar index.
    """
    N = graph.num_vertices()
    A = gts.adjacency(graph)
    S = np.zeros((N, N))
    degrees = get_degrees_dic(graph)
    for i in range(N):
        A_i = A[i]
        for j in range(N):
            if j != i:
                A_j = A[j]
                # entries equal to 2 mark the common neighbors of i and j
                intersection = A_i + A_j
                common_ns_degs = list()
                for index in range(N):
                    if intersection[index] == 2:
                        cn_deg = degrees[index]
                        common_ns_degs.append(1.0 / np.log10(cn_deg))
                S[i, j] = np.sum(common_ns_degs)
    return S

Answer 1

I believe you are using a rather slow approach. It is better to invert it:
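   - Initialize the AA (Adamic-Adar) matrix with zeros
   - For each node k, get its degree k_deg
   - Calculate d = 1.0/log(k_deg) (why log10, does it matter or not?)
   - Add d to every AA_ij where i, j range over all pairs of 1s in the k-th row of the adjacency matrix
   - For sparse graphs, extract the positions of all 1s in the k-th row into a list, to reach O(V * (V + E)) complexity instead of O(V^3)

AA = np.zeros((N, N))
for k in range(N):
    # positions of the 1s in row k, i.e. the neighbors of k
    adj_list = [j for j in range(N) if A[k, j] == 1]
    k_deg = len(adj_list)
    if k_deg < 2:
        continue  # k cannot be a common neighbor of two distinct nodes
    d = 1.0 / np.log(k_deg)
    for j in range(k_deg - 1):
        for i in range(j + 1, k_deg):
            AA[adj_list[i], adj_list[j]] += d
# only half of the matrix is filled; it is symmetric for an undirected graph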
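For a sparse graph, the neighbor-list idea above might look like the following sketch. It assumes the graph is available as an edge list (the function name `adamic_adar_from_edges` and that input format are illustrative assumptions, not part of the answer), uses the 1/log(k_n) weight from the question's definition, and fills both halves of the matrix.

```python
from itertools import combinations

import numpy as np

def adamic_adar_from_edges(n, edges):
    """Adamic-Adar matrix of an undirected simple graph given as an edge list."""
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    AA = np.zeros((n, n))
    for k in range(n):
        k_deg = len(neighbors[k])
        if k_deg < 2:
            continue  # fewer than 2 neighbors: never a common neighbor
        d = 1.0 / np.log(k_deg)
        # k is a common neighbor of every pair of its own neighbors
        for i, j in combinations(neighbors[k], 2):
            AA[i, j] += d
            AA[j, i] += d
    return AA

# Hypothetical example graph
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
AA = adamic_adar_from_edges(4, edges)
```

The work per node is proportional to the square of its degree, so the full adjacency row is never scanned.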

Answer 2

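Since you are using numpy, you can really cut down on the need to iterate for every operation in the algorithm. My numpy- and vectorized-fu are not the best, but the code below runs in about 2.5 s on a graph with ~13,000 nodes: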

def adar_adamic(adj_mat):
    """Computes Adar-Adamic similarity matrix for an adjacency matrix"""

    Adar_Adamic = np.zeros(adj_mat.shape)
    for row in adj_mat:
        AdjList = row.nonzero()[0]  # column indices with nonzero values
        k_deg = len(AdjList)
        if k_deg < 2:
            continue  # too few neighbors to be a common neighbor of any pair
        d = 1.0 / np.log(k_deg)  # this node's Adamic-Adar contribution

        # add the node's contribution to every pair of its neighbors
        for i in range(k_deg):
            for j in range(k_deg):
                if AdjList[i] != AdjList[j]:
                    cell = (AdjList[i], AdjList[j])
                    Adar_Adamic[cell] = Adar_Adamic[cell] + d

    return Adar_Adamic

Unlike MBo's answer, this does build the full symmetric matrix, but given the execution time the inefficiency is tolerable (for me).
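One possible way to drop the two inner Python loops is a single block update per node. This is only a sketch: it assumes `adj_mat` is a dense 0/1 NumPy array, it uses the 1/log(k_n) weight from the question's definition, and the name `adar_adamic_ix` is made up for illustration.

```python
import numpy as np

def adar_adamic_ix(adj_mat):
    """Adamic-Adar matrix using one block update per node."""
    n = adj_mat.shape[0]
    result = np.zeros((n, n))
    for row in adj_mat:
        nbrs = np.flatnonzero(row)  # neighbors of the current node
        if len(nbrs) < 2:
            continue
        d = 1.0 / np.log(len(nbrs))
        # add d to every (i, j) combination of this node's neighbors at once
        result[np.ix_(nbrs, nbrs)] += d
    # the block update also touched the (i, i) entries; clear them
    np.fill_diagonal(result, 0.0)
    return result
```

The outer loop over nodes remains, so this trades the per-pair Python overhead for one NumPy call per node rather than changing the asymptotic cost.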

Answer 3

I don't see a way to reduce the time complexity, but it can be vectorized:

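degrees = A.sum(axis=0)
weights = np.zeros(degrees.shape)
mask = degrees > 1  # nodes of degree < 2 are never a common neighbor
weights[mask] = 1.0 / np.log10(degrees[mask])
adamic_adar = (A * weights).dot(A.T)

This is with A being a regular NumPy array. You seem to be using graph_tool.spectral.adjacency, so A would be a sparse matrix. In that case the code would be:

from scipy.sparse import csr_matrix

degrees = np.asarray(A.sum(axis=0)).ravel()
weights = np.zeros(degrees.shape)
mask = degrees > 1
weights[mask] = 1.0 / np.log10(degrees[mask])
adamic_adar = A.multiply(csr_matrix(weights)) * A.T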

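This is much faster than using Python loops. There is one small caveat, though: with this approach you really need to make sure that the values on the main diagonals (of A and of adamic_adar) are what you expect. Also, A must not contain weights, only zeros and ones.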
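To illustrate the caveat about the diagonal, here is a sketch on a small hypothetical graph. It assumes a dense 0/1 array, the 1/log10(k_n) weighting from the question's code, and a zero weight for nodes of degree below 2 (an assumption made here to avoid dividing by log(1) = 0).

```python
import numpy as np

# Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 1-3, 2-3
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

degrees = A.sum(axis=0)
weights = np.zeros_like(degrees)
mask = degrees > 1
weights[mask] = 1.0 / np.log10(degrees[mask])

# Column n of A is scaled by weights[n], then multiplied with A transposed
adamic_adar = (A * weights).dot(A.T)

# Entry (x, x) sums the weights of all neighbors of x, which is not a
# pairwise similarity; zero it out if self-similarity is not wanted.
diagonal_before = adamic_adar.diagonal().copy()
np.fill_diagonal(adamic_adar, 0.0)
```

In this example `diagonal_before` is nonzero for every node, which shows why the diagonal needs an explicit decision.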

Answer 4

I believe most of the functions defined in igraph for R have a counterpart in python_igraph, including the node similarity measure (Adamic_Adar).

Fast algorithm to compute Adamic-Adar
