Calculating the clustering coefficient

Time: 2015-11-13 02:12:34

Tags: python network-programming networkx

I want to calculate the clustering coefficient of a network without using NetworkX's built-in method.

This is the method I have written so far:

import networkx as nx
import matplotlib.pyplot as plt
%matplotlib inline

#create graph
G=nx.Graph()
G.add_edges_from([(0,1),(0,2),(0,4),(0,3),(0,5),(1,7),(1,10),(1,11),(1,12),(2,4),(2,5),(2,3),(3,4),(5,8),(5,6),(6,8),(6,9),(6,7),(7,9),(7,10),(10,11),(10,12),(11,13),(12,13)])

# print 1 test value
print nx.clustering(G,1)

def clustering_coefficient(G):
    # this will store the mapping of node/coefficient
    clusteringDict = {}
    for node in G:

        neighboursOfNode = []
        nodesWithMutualFriends = []

        # store all neighbors of the node in an array so we can compare
        for neighbour in G.neighbors(node):
            neighboursOfNode.append(neighbour)

        for neighbour in G.neighbors(node):
            for second_layer_neighbour in G.neighbors(neighbour):
                # compare if any second degree neighbour is also a first degree neighbour (this makes a triangle)
                # if so, append it to the mutual friends list
                if second_layer_neighbour in neighboursOfNode:
                    nodesWithMutualFriends.append(second_layer_neighbour)

        # filter duplicates from the mutual friend array
        nodesWithMutualFriends = list(set(nodesWithMutualFriends))

        clusteringCoefficientOfNode = 0
        # apply coefficient formula to calculate
        if len(nodesWithMutualFriends):
            clusteringCoefficientOfNode =  (2 * float(len(nodesWithMutualFriends)))/((float(len(G.neighbors(node))) * (float(len(G.neighbors(node))) - 1)))

        clusteringDict[node] = clusteringCoefficientOfNode

clustering_coefficient(G)

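However, when I run this script, NetworkX returns values that differ from my own script's in most cases. My script can also somehow produce values as high as 2.0, when the maximum should be 1.0.

What is wrong with my logic?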

1 answer:

Answer 0: (score: 0)

At least one problem comes from this line:

 clusteringCoefficientOfNode =  (2 * float(len(nodesWithMutualFriends)))/((float(len(G.neighbors(node))) * (float(len(G.neighbors(node))) - 1)))

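Suppose node 1 has N neighbours and all of them are neighbours of each other. Each neighbour then appears in nodesWithMutualFriends only once, because you used set, even though it takes part in N-1 triangles. You then multiply by 2, so you get 2N / (N * (N-1)) = 2 / (N-1), when you should get 1. So you are not really counting triangles: you are counting the number of nodes that are in triangles, and then dividing by the number of possible triangles.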
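To make the miscount concrete, here is a minimal sketch that reproduces the question's calculation on complete graphs. It uses plain adjacency sets rather than NetworkX, and both `complete_graph_adjacency` and `original_formula` are hypothetical helper names introduced only for this illustration.

```python
def complete_graph_adjacency(n):
    """Adjacency sets for a complete graph on nodes 0..n-1 (hypothetical helper)."""
    return {u: {v for v in range(n) if v != u} for u in range(n)}


def original_formula(adj, node):
    """Mimic the question's logic: distinct neighbours found in a triangle, times 2."""
    neighbours = adj[node]
    # The set comprehension plays the role of list(set(...)) in the question.
    in_triangle = {w for v in neighbours for w in adj[v] if w in neighbours}
    k = len(neighbours)
    return 2.0 * len(in_triangle) / (k * (k - 1)) if in_triangle else 0.0


# Node 0 has N neighbours that are all connected to each other,
# so the true clustering coefficient is 1 in every case.
for n_neighbours in (2, 3, 4):
    adj = complete_graph_adjacency(n_neighbours + 1)
    print(n_neighbours, original_formula(adj, 0))
```

With N = 2 this gives 2.0, which matches the out-of-range value reported in the question, and the result only coincides with the correct value of 1 when N = 3.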

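So you can fix this by removing the call to set and dropping the 2*. Without set, every edge between two neighbours is appended twice (once from each endpoint), so the length of the list already equals twice the number of triangles.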
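A sketch of what the corrected calculation might look like is shown below. It is not the asker's final code: it assumes an undirected graph without self-loops, takes a plain dictionary of adjacency sets instead of a NetworkX graph, and returns the result dictionary (the function in the question builds the dictionary but never returns it).

```python
def clustering_coefficient(adj):
    """Local clustering coefficient per node; adj maps node -> set of neighbours."""
    coefficients = {}
    for node, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            # Fewer than two neighbours means no triangle is possible.
            coefficients[node] = 0.0
            continue
        # Each edge between two neighbours is seen twice, once from each end,
        # so this count already equals 2 * (number of triangles through node).
        links_twice = sum(1 for v in neighbours for w in adj[v] if w in neighbours)
        coefficients[node] = links_twice / float(k * (k - 1))
    return coefficients


# Same edge list as in the question, turned into adjacency sets.
edges = [(0,1),(0,2),(0,4),(0,3),(0,5),(1,7),(1,10),(1,11),(1,12),(2,4),(2,5),
         (2,3),(3,4),(5,8),(5,6),(6,8),(6,9),(6,7),(7,9),(7,10),(10,11),(10,12),
         (11,13),(12,13)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

coefficients = clustering_coefficient(adj)
print(coefficients[1])
```

For node 1, three edges connect its five neighbours ((7,10), (10,11) and (10,12)), so the expected value is 2 * 3 / (5 * 4) = 0.3. For a NetworkX graph G, the adjacency sets could be built with `{n: set(G[n]) for n in G}`, and the result compared against `nx.clustering(G)`.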