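I have noticed that if I set every edge weight in a graph to the same value, community.best_partition does not always return the same communities.
I used the same random state in all cases, and the graph is exactly the same; the only difference is that the edge weights are not all equal to 1 (for instance, they might all be equal to 5). The definition of modularity cancels out this multiplication of the adjacency matrix by a constant, and in the articles I have read about the algorithm I cannot find a step that should change the result. Is there a reason for this difference?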
import networkx as nx
import community
from sklearn.metrics import adjusted_rand_score


def main():
    g = nx.davis_southern_women_graph()
    nodes = g.nodes()

    # Baseline partition on the graph as loaded (no explicit edge weights set).
    clusters_init = community.best_partition(g, random_state=10)
    print("modularity with initial clusters = %.15f" % community.modularity(clusters_init, g))
    labels_init = [clusters_init[n] for n in nodes]

    for num in range(1, 9):
        # Give every edge the same weight, then recompute the partition.
        for u, v in g.edges():
            g[u][v]["weight"] = num
        clusters = community.best_partition(g, random_state=10)
        labels = [clusters[n] for n in nodes]
        print("value of edge weight = %d," % num, "modularity = %.15f," % community.modularity(clusters, g),
              "modularity with initial clusters = %.15f," % community.modularity(clusters_init, g),
              "adjusted rand score = %.3f" % adjusted_rand_score(labels_pred=labels, labels_true=labels_init))


if __name__ == "__main__":
    main()
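modularity with initial clusters = 0.334869334679965
value of edge weight = 1, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000
value of edge weight = 2, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000
value of edge weight = 3, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000
value of edge weight = 4, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000
value of edge weight = 5, modularity = 0.332470647645499, modularity with initial clusters = 0.334869334679965, adjusted rand score = 0.676
value of edge weight = 6, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000
value of edge weight = 7, modularity = 0.332470647645499, modularity with initial clusters = 0.334869334679965, adjusted rand score = 0.676
value of edge weight = 8, modularity = 0.334869334679965, modularity with initial clusters = 0.334869334679965, adjusted rand score = 1.000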
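
To back up the claim that the modularity definition itself cancels a uniform scaling of the weights, here is a minimal sketch that evaluates the weighted modularity of a fixed partition with exact rational arithmetic. It is not the python-louvain implementation: the `modularity` and `scaled_edges` helpers and the toy graph (two triangles joined by one bridge edge) are my own assumptions, chosen only so the check needs nothing beyond the standard library.

```python
from fractions import Fraction

# Toy graph (an assumption for illustration): two triangles joined by one bridge.
BASE_EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
PARTITION = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}


def scaled_edges(weight):
    """Return the toy graph with every edge given the same integer weight."""
    return [(u, v, weight) for u, v in BASE_EDGES]


def modularity(edges, partition):
    """Weighted modularity Q = sum_c [ L_c / m - (d_c / (2m))^2 ], computed exactly.

    L_c is the total weight of edges inside community c, d_c is the sum of
    weighted degrees of its nodes, and m is the total edge weight.
    """
    m = sum(w for _, _, w in edges)
    internal = {}
    degree = {}
    for u, v, w in edges:
        cu, cv = partition[u], partition[v]
        degree[cu] = degree.get(cu, 0) + w
        degree[cv] = degree.get(cv, 0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + w
    return sum(
        Fraction(internal.get(c, 0), m) - Fraction(degree[c], 2 * m) ** 2
        for c in degree
    )


# Same partition, weights 1 through 8: the exact value never changes.
values = [modularity(scaled_edges(k), PARTITION) for k in range(1, 9)]
print(values[0], len(set(values)))  # 5/14 1
```

With exact fractions the factor k cancels in both terms, so the modularity of a given partition is identical for every uniform weight. This only confirms that the objective is scale-invariant; it says nothing about how best_partition evaluates or compares modularity gains internally.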