How can we find topological overlap in a complex gene network?
The input data is as follows:
code code weight
3423 3455 3453
2344 2353 45
3432 3453 456
3235 4566 34532
2345 8687 356
2466 6467 3567
3423 2344 564
3455 2353 4564
3432 3423 456
The node columns are col[0] and col[1], and the time the connection takes is col[2].
Code:
import networkx as nx
import numpy as np
data = np.loadtxt("USC_Test.txt")
col = []
edge_list = zip[col[0],col[1]]
G = nx.Graph()
G.add_edges_from(edge_list)
components = nx.connected_components(G)
print components
Error:
edge_list = zip[col[0],col[1]]
IndexError: list index out of range
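Answer 0 (score: 3)
I must admit I was not familiar with the term topological overlap, so I had to look it up:
A pair of nodes in a network is said to have high topological overlap if both are strongly connected to the same group of nodes. (Source)
NetworkX does not appear to have a built-in method for finding node pairs with topological overlap, but it can easily find strongly connected components. For example: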
In [1]: import networkx as nx
In [2]: edge_list = [(1, 2), (2, 1), (3, 1), (1, 3), (2, 4), (1, 4), (5, 6)]
In [3]: G = nx.DiGraph()
In [4]: G.add_edges_from(edge_list)
In [5]: components = nx.strongly_connected_components(G)
In [6]: components
Out[6]: [[1, 3, 2], [4], [6], [5]]
If you have an undirected graph, you can use nx.connected_components instead.
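For the undirected data in the question, a minimal sketch might look like the following. The IndexError there comes from indexing the empty list col (and zip is also called with square brackets instead of parentheses), so this version parses the rows directly. The helper name load_weighted_edges and the inline SAMPLE string are illustrative assumptions; in practice you would pass an open handle to USC_Test.txt. In recent NetworkX releases connected_components yields sets lazily, so the sketch materializes them into a list.

```python
import io
import networkx as nx

# Rows copied from the question; a real run would open "USC_Test.txt" instead.
SAMPLE = """code code weight
3423 3455 3453
2344 2353 45
3432 3453 456
3235 4566 34532
2345 8687 356
2466 6467 3567
3423 2344 564
3455 2353 4564
3432 3423 456
"""

def load_weighted_edges(handle):
    """Parse 'node node weight' rows (hypothetical helper), skipping header rows."""
    edges = []
    for line in handle:
        parts = line.split()
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # header line or malformed row
        edges.append((int(parts[0]), int(parts[1]), float(parts[2])))
    return edges

G = nx.Graph()
G.add_weighted_edges_from(load_weighted_edges(io.StringIO(SAMPLE)))

# Materialize the components, largest first, with nodes sorted for readability.
components = sorted(
    (sorted(c) for c in nx.connected_components(G)), key=len, reverse=True
)
print(components)
```

The third column is stored as the weight attribute on each edge, so it stays available (for example as G[3423][3455]["weight"]) even though connected components ignore it.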
Now that you have the components, it is easy to list every pair with topological overlap. For example, from components:
In [7]: from itertools import combinations
In [8]: top_overlap = [list(combinations(c, 2)) for c in components if len(c) > 1]
In [9]: top_overlap = [item for sublist in top_overlap for item in sublist]
In [10]: top_overlap
Out[10]: [(1, 3), (1, 2), (3, 2)]
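Pairing nodes by component only says that two nodes are mutually reachable. If you want a per-pair score closer to the quoted definition, one possible sketch counts shared neighbors instead. It assumes the unweighted formula (shared neighbors + a_uv) / (min(k_u, k_v) + 1 - a_uv), where a_uv is 1 if u and v are directly linked and k is the node degree. That formula is an assumption brought in here rather than something NetworkX provides, the function name topological_overlap is hypothetical, and the sketch treats the graph as undirected and ignores edge weights.

```python
from itertools import combinations
import networkx as nx

def topological_overlap(G, u, v):
    """Unweighted topological overlap of u and v (assumed formula, see text)."""
    nu = set(G[u]) - {u}          # neighbors of u, ignoring self-loops
    nv = set(G[v]) - {v}          # neighbors of v, ignoring self-loops
    a_uv = 1 if v in nu else 0    # 1 when u and v are directly linked
    shared = len(nu & nv)         # neighbors common to both nodes
    return (shared + a_uv) / (min(len(nu), len(nv)) + 1 - a_uv)

# Undirected version of the edge list used in the answer above.
G = nx.Graph()
G.add_edges_from([(1, 2), (2, 1), (3, 1), (1, 3), (2, 4), (1, 4), (5, 6)])

# Score every pair of nodes and keep those with a non-zero overlap.
scores = {
    (u, v): topological_overlap(G, u, v)
    for u, v in combinations(sorted(G), 2)
}
overlapping = {pair: s for pair, s in scores.items() if s > 0}
print(overlapping)
```

A threshold on the score (rather than s > 0) would be the natural knob for deciding which pairs count as having high overlap.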