NetworkX average shortest path length and diameter take forever

Time: 2021-05-12 00:57:26

Tags: python-3.x graph networkx

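I have a graph (A) built from unweighted edges, and I want to compute the average shortest path length of the largest connected component (giantC) in the main graph (A). However, the script has been running for more than 3 hours so far (I tried both Colab and my local machine), and neither diameter nor average_shortest_path_length has produced any output.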

I am using networkx==2.5 and python==3.6.9.

Here is my script:

import json  # needed for json.load below
import logging
import networkx as nx
from networkx.algorithms.distance_measures import diameter
from networkx.algorithms.shortest_paths.generic import average_shortest_path_length


# graph is built from a json file as follows 
with open('graph.json') as f:
     graph_dict = json.load(f)

_indices = graph_dict['indices']
s_lst, rs_lst= _indices[0], _indices[1]    

graph_ = nx.Graph()
for i in range(len(s_lst)):
     graph_.add_edge(s_lst[i], rs_lst[i])


# fetch the hugest graph of all graphs
connected_subgraphs = [graph_.subgraph(cc) for cc in 
nx.connected_components(graph_)]
logging.info('connected subgraphs fetched.')
Gcc = max(nx.connected_components(graph_), key=len)
giantC = graph_.subgraph(Gcc)
logging.info('Fetched Giant Subgraph')

n_nodes = giantC.number_of_nodes()
print(f'Number of nodes: {n_nodes}') # output is 106088

avg_shortest_path = average_shortest_path_length(giantC)
print(f'Avg Shortest path len: {avg_shortest_path}')

dia = diameter(giantC)
print(f'Diameter: {dia}')

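Is there any way to make this faster? Or is there an alternative way to compute the diameter and the average shortest path length of the giantC graph?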

1 answer:

Answer 0: (score: 0)

For future readers: if you want to fetch the largest connected subgraph from your NetworkX Graph:

import networkx as nx
import logging


def fetch_hugest_subgraph(graph_):
    Gcc = max(nx.connected_components(graph_), key=len)
    giantC = graph_.subgraph(Gcc)
    logging.info('Fetched Giant Subgraph')
    return giantC

If you want to compute the average shortest path length of a graph, you can estimate it by sampling:

import random  # needed for random.choices below
from statistics import mean
import networkx as nx


def write_nodes_number_and_shortest_paths(graph_, n_samples=10_000,
                                          output_path='graph_info_output.txt'):
    with open(output_path, encoding='utf-8', mode='w+') as f:
        for component in nx.connected_components(graph_):
            component_ = graph_.subgraph(component)
            # Materialize the node list once instead of on every sample.
            nodes = list(component_.nodes())
            lengths = []
            for _ in range(n_samples):
                # Sampling is with replacement, so n1 may equal n2 (length 0).
                n1, n2 = random.choices(nodes, k=2)
                length = nx.shortest_path_length(component_, source=n1, target=n2)
                lengths.append(length)
            f.write(f'Nodes num: {len(nodes)}, shortest path mean: {mean(lengths)} \n')
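A possible variation on the pair sampling above is to sample source nodes instead: one breadth-first search from a source yields its distances to every other node, so each sample covers many pairs. The sketch below assumes a connected, unweighted graph; the function name `estimate_avg_shortest_path_length` and its `n_sources` and `seed` parameters are hypothetical and chosen only for illustration.

```python
import random

import networkx as nx


def estimate_avg_shortest_path_length(graph, n_sources=100, seed=None):
    """Estimate the average shortest path length of a connected, unweighted graph.

    Hypothetical helper: runs one BFS from each of `n_sources` randomly chosen
    nodes and averages the distances from those sources to all other nodes.
    """
    rng = random.Random(seed)
    nodes = list(graph.nodes())
    if len(nodes) < 2:
        return 0.0
    # Sample distinct source nodes (all nodes if the graph is small).
    sources = rng.sample(nodes, min(n_sources, len(nodes)))
    total = 0
    for source in sources:
        # BFS distances from `source` to every reachable node (source itself is 0).
        distances = nx.single_source_shortest_path_length(graph, source)
        total += sum(distances.values())
    # Each source contributes len(nodes) - 1 pairs, since the source is excluded.
    return total / (len(sources) * (len(nodes) - 1))
```

When `n_sources` is at least the number of nodes, every node is used as a source and the result matches the exact value, which makes it easy to sanity-check on a small graph before running it on the giant component.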

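I learned from Joris Kinable (in the comments) that computing avg_shortest_path_length has a complexity of O(V^3), where V = number of nodes. For an unweighted graph, NetworkX runs one breadth-first search per node, which is roughly O(V·(V+E)) and reaches O(V^3) on dense graphs. The same applies to computing the diameter of your graph.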
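Because the exact diameter needs a search from every node, a cheap alternative is a lower bound from a "double sweep": run a BFS from any node, then a second BFS from the farthest node found. This is a swapped-in heuristic, not what `nx.diameter` does. It returns a lower bound in general and the exact diameter on trees. The sketch assumes a connected, unweighted graph, and the name `diameter_lower_bound` is hypothetical.

```python
import networkx as nx


def diameter_lower_bound(graph, start=None):
    """Double-sweep lower bound on the diameter of a connected, unweighted graph.

    Hypothetical helper: uses two BFS runs instead of one per node.
    """
    if start is None:
        start = next(iter(graph.nodes()))
    # First sweep: find a node that is farthest from the start node.
    first = nx.single_source_shortest_path_length(graph, start)
    far_node = max(first, key=first.get)
    # Second sweep: the eccentricity of that node is a lower bound on the diameter.
    second = nx.single_source_shortest_path_length(graph, far_node)
    return max(second.values())
```

Calling `diameter_lower_bound(giantC)` costs two searches regardless of graph size. Repeating it from a few different start nodes and keeping the maximum can tighten the bound.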
