Looping through connected components in networkx and extracting the components that contain certain nodes

Time: 2014-09-04 01:45:40

Tags: python for-loop networkx

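I have a very large undirected network loaded into a NetworkX Graph(), made up of many disconnected components. I also have a collection of nodes of interest loaded into a set. I would like to go through all of the components and extract every one that contains at least one of the nodes of interest.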

import networkx as nx

# create empty graph
g = nx.Graph()

# add edges to the graph
g.add_edges_from([['a','b'],['a','c'],['b','c'],['d','e'],['e','f'],['d','f'],['g','h'],['g','i'],['h','i']])

# load nodes of interest into a set
interest_nodes = set(['a', 'b', 'f'])

# number of connected components
nx.number_connected_components(g)

# loop through each connected component and add all of the edges for that component to a list if a node in that component is found in the interest_nodes
interest_edges = []
for i in nx.connected_component_subgraphs(g):
    for u in i.edges():
        if u in interest_nodes:
            interest_edges.append(u)

However, I get an empty list.

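Ideally, I would like to get back a list of all the edges in any connected component that contains at least one node from the interest_nodes set. I should get the result below, but instead nothing is returned.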

interest_edges = [('a', 'c'),
                  ('a', 'b'),
                  ('c', 'b'),
                  ('e', 'd'),
                  ('e', 'f'),
                  ('d', 'f')]

2 answers:

Answer 0 (score: 3)

You're close. The easiest way is to check each component for overlap with your node set by testing the length of the set intersection.

import networkx as nx

g = nx.Graph([['a','b'],['a','c'],['b','c'],['d','e'],['e','f'],['d','f'],['g','h'],['g','i'],['h','i']])

interest_nodes = set(['a', 'b', 'f'])

interest_edges = []
for component in nx.connected_component_subgraphs(g):
    if len(set(component) & interest_nodes) > 0:
        interest_edges.extend(component.edges())

print(interest_edges)
# output (edge order may vary)
# [('a', 'c'), ('a', 'b'), ('c', 'b'), ('e', 'd'), ('e', 'f'), ('d', 'f')]
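
As far as I know, connected_component_subgraphs() was deprecated and then removed during the NetworkX 2.x series, so the loop above may fail with an AttributeError on a current install. A minimal sketch of the same idea, assuming a recent NetworkX and using only nx.connected_components() and Graph.subgraph(), could look like this:

```python
import networkx as nx

g = nx.Graph([('a', 'b'), ('a', 'c'), ('b', 'c'),
              ('d', 'e'), ('e', 'f'), ('d', 'f'),
              ('g', 'h'), ('g', 'i'), ('h', 'i')])

interest_nodes = {'a', 'b', 'f'}

interest_edges = []
# connected_components() yields the nodes of one component at a time
for component in nx.connected_components(g):
    # keep the component only if it shares a node with interest_nodes
    if not interest_nodes.isdisjoint(component):
        # subgraph() gives the induced subgraph, so edges() is limited to this component
        interest_edges.extend(g.subgraph(component).edges())

print(interest_edges)
```

The order of the edges, and the order of the two nodes within each edge, is not guaranteed, so compare results as sets of node pairs rather than as literal lists.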

Answer 1 (score: 0)

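The connected_component_subgraphs() function did not work the way I expected. As a workaround, you can loop over all the connected components and add every node of interest, together with the nodes connected to it, to a new collection.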

Then loop over your edges.

# assumes nx and g are defined as in the question
interest_nodes = set(['a', 'b', 'f'])

# collect every node of each component that contains a node of interest
interest_nodes_plus_connected = set()
for c in nx.connected_components(g):
    for n in interest_nodes:
        if n in c:
            interest_nodes_plus_connected.update(c)
            break

# keep the edges whose endpoints fall in one of those components
interest_edges = []
for e in g.edges():
    if e[0] in interest_nodes_plus_connected or e[1] in interest_nodes_plus_connected:
        interest_edges.append(e)
for ie in interest_edges:
    print(ie)
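
It may be worth noting that the empty list in the question seems to come from the membership test rather than from the subgraph function itself: edges() yields pairs of nodes, and a pair is never an element of a set of single node names. A small plain-Python sketch of that difference (the variable names here are just for illustration):

```python
interest_nodes = {'a', 'b', 'f'}
edge = ('a', 'b')  # edges() yields pairs of nodes like this one

# The question's test asks whether the whole tuple is an element of the set.
tuple_in_set = edge in interest_nodes

# Testing the endpoints individually is what was probably intended.
endpoint_in_set = any(node in interest_nodes for node in edge)

print(tuple_in_set, endpoint_in_set)  # False True
```

Checking endpoints alone would still miss edges such as ('d', 'e'), which touch no node of interest but belong to a component that does, so the test has to be made per component rather than per edge.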