I want to compute the clustering coefficient of a network without using NetworkX's built-in method.
This is the method I have written so far:
import networkx as nx
import matplotlib.pyplot as plt
%matplotlib inline
#create graph
G=nx.Graph()
G.add_edges_from([(0,1),(0,2),(0,4),(0,3),(0,5),(1,7),(1,10),(1,11),(1,12),(2,4),(2,5),(2,3),(3,4),(5,8),(5,6),(6,8),(6,9),(6,7),(7,9),(7,10),(10,11),(10,12),(11,13),(12,13)])
# print 1 test value
print nx.clustering(G,1)
def clustering_coefficient(G):
    # this will store the mapping of node/coefficient
    clusteringDict = {}
    for node in G:
        neighboursOfNode = []
        nodesWithMutualFriends = []
        # store all neighbors of the node in an array so we can compare
        for neighbour in G.neighbors(node):
            neighboursOfNode.append(neighbour)
        for neighbour in G.neighbors(node):
            for second_layer_neighbour in G.neighbors(neighbour):
                # compare if any second degree neighbour is also a first degree neighbour (this makes a triangle)
                # if so, append it to the mutual friends list
                if second_layer_neighbour in neighboursOfNode:
                    nodesWithMutualFriends.append(second_layer_neighbour)
        # filter duplicates from the mutual friend array
        nodesWithMutualFriends = list(set(nodesWithMutualFriends))
        clusteringCoefficientOfNode = 0
        # apply coefficient formula to calculate
        if len(nodesWithMutualFriends):
            clusteringCoefficientOfNode = (2 * float(len(nodesWithMutualFriends)))/((float(len(G.neighbors(node))) * (float(len(G.neighbors(node))) - 1)))
        clusteringDict[node] = clusteringCoefficientOfNode
clustering_coefficient(G)
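However, when I run this script, NetworkX returns values that differ from my own script's in most cases. Somehow my script can also produce values as high as 2.0 instead of being capped at 1.0.
What is wrong with my logic?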
Answer 0 (score: 0)
At least one problem comes from this line:
clusteringCoefficientOfNode = (2 * float(len(nodesWithMutualFriends)))/((float(len(G.neighbors(node))) * (float(len(G.neighbors(node))) - 1)))
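Suppose node 1 has N neighbours, all of which are neighbours of each other. Each neighbour then appears only once in nodesWithMutualFriends, because you used set, even though it takes part in N-1 triangles with node 1. You then multiply by 2, so you get 2N/(N*(N-1)) = 2/(N-1), when you should get 1. So you are not really counting triangles: you are counting the nodes that are in triangles, and then dividing by the number of possible triangles. With N = 2 this gives exactly the 2.0 you observed.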
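The miscount can be checked by hand on small complete graphs. The sketch below is only an illustration: it uses a plain adjacency dict instead of a NetworkX graph (an assumption made to keep it self-contained), and the helper name buggy_coefficient is hypothetical. It reproduces the set-based count from the question.

```python
def buggy_coefficient(adj, node):
    """Reproduce the question's formula: count distinct neighbours that lie in a triangle."""
    neighbours = adj[node]
    k = len(neighbours)
    # Set comprehension mirrors list(set(...)): each neighbour is kept at most once.
    in_triangle = {w for v in neighbours for w in adj[v] if w in neighbours}
    if not in_triangle:
        return 0.0
    return 2.0 * len(in_triangle) / (k * (k - 1))


def complete_graph(n):
    """Adjacency dict of the complete graph on nodes 0..n-1 (hypothetical helper)."""
    return {u: {v for v in range(n) if v != u} for u in range(n)}


k3 = complete_graph(3)  # node 0 has N = 2 neighbours
k5 = complete_graph(5)  # node 0 has N = 4 neighbours

# Both graphs are fully connected, so the true coefficient is 1 in each case,
# but the set-based count yields 2/(N-1) instead.
print(buggy_coefficient(k3, 0))
print(buggy_coefficient(k5, 0))
```

For the single triangle the result is 2.0, and for the five-node complete graph it drops to about 0.667, matching the 2/(N-1) expression above.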
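So you can fix this by removing the set call and removing the 2*. Without set, every link between two neighbours is recorded twice (once from each end), which already supplies the factor of 2.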
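A minimal sketch of the corrected calculation, under a few assumptions: the graph is undirected with no self-loops, it is passed as a plain adjacency dict rather than a NetworkX graph, and nodes with fewer than two neighbours are given a coefficient of 0. The edge list is the one from the question; this is one possible rewrite, not the only way to structure it.

```python
def clustering_coefficient(adj):
    """Local clustering coefficient of every node in an undirected simple graph."""
    result = {}
    for node, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            # No pair of neighbours exists, so no triangle is possible.
            result[node] = 0.0
            continue
        # Count ordered pairs (v, w) of adjacent neighbours. Each link between
        # two neighbours is seen twice, so no extra factor of 2 is needed.
        links = sum(1 for v in neighbours for w in adj[v] if w in neighbours)
        result[node] = links / (k * (k - 1))
    return result


edges = [(0, 1), (0, 2), (0, 4), (0, 3), (0, 5), (1, 7), (1, 10), (1, 11),
         (1, 12), (2, 4), (2, 5), (2, 3), (3, 4), (5, 8), (5, 6), (6, 8),
         (6, 9), (6, 7), (7, 9), (7, 10), (10, 11), (10, 12), (11, 13), (12, 13)]

# Build the adjacency dict: node -> set of neighbours.
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

coefficients = clustering_coefficient(adj)
print(coefficients[1])
```

With a NetworkX graph G, the same adjacency dict could be built as {n: set(G[n]) for n in G}, and the result compared node by node against nx.clustering(G).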