Python NetworkX MatplotLib DiGraph: find all trees with a path longer than 2

Asked: 2012-12-31 06:22:46

Tags: python tree matplotlib networkx

I have data like the following in a csv file:

a,b,50
b,c,60
b,e,25
e,f,20
z,n,10
x,m,25
v,p,15

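I am trying to plot this data with NetworkX and Matplotlib, but my csv has so many rows/nodes that the resulting graph is meaningless.

This is the important part of the code I use for plotting: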

import networkx as nx
import matplotlib.pyplot as plt

G = nx.DiGraph()

# Each row is "source,target,weight"; strip the trailing newline before splitting.
with open("test_data.csv", "r") as f:
    for line in f:
        node1, node2, weight1 = line.strip().split(",")
        G.add_edge(node1, node2)

nx.draw(G)
plt.show()

For this sample data, I end up with the following graph:

[image: sample_plot]

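From this small sample it is easy to see that some nodes ([z,n], [x,m], [v,p]) form trees of only two nodes. I want to detect and remove these, because I only care about trees with more than two nodes. I'm sure there are many ways to do this; can anyone suggest one or give an example?

3 Answers:

Answer 0: (score: 1)

For your specific case, try iterating over the edge list and asking the graph whether the target node itself has neighbors (that is, whether the target is the source of another edge). If the target has no neighbors of its own, the edge matches your criterion and can be removed.

Code: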

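# Collect the edges first, so the graph is not modified while iterating over it.
# In NetworkX 2.x G.neighbors() returns an iterator, so comparing it to []
# never matches; for a DiGraph, out_degree(trg) == 0 is the same test.
leaf_edges = [(src, trg) for src, trg in G.edges() if G.out_degree(trg) == 0]
G.remove_edges_from(leaf_edges)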

G should then contain only the edges (a,b) and (b,e) (see the IPython output below):

In [35]: list(G.edges())
Out[35]: [('a', 'b'), ('b', 'e')]
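After the edges are removed, the nodes that no longer touch any edge are still in G and would still be drawn. Here is a minimal sketch of the whole step on the sample data, assuming NetworkX 2.0 or later; the names leaf_edges, remaining_nodes and remaining_edges are only illustrative.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([("a", "b"), ("b", "c"), ("b", "e"), ("e", "f"),
                  ("z", "n"), ("x", "m"), ("v", "p")])

# Edges whose target node has no outgoing edge of its own.
leaf_edges = [(src, trg) for src, trg in G.edges() if G.out_degree(trg) == 0]
G.remove_edges_from(leaf_edges)

# Nodes left without any edge would still show up in the plot, so drop them.
G.remove_nodes_from(list(nx.isolates(G)))

remaining_nodes = sorted(G.nodes())
remaining_edges = sorted(G.edges())
print(remaining_nodes, remaining_edges)
```

Note that this also trims the leaf edges (b,c) and (e,f) of the large tree, not just the two-node trees.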

Good luck!

Answer 1: (score: 1)

I don't know the nx API, so I won't solve this with the G digraph object; I'll use a plain dict instead.

import networkx as nx
import matplotlib.pyplot as plt

G = nx.DiGraph()

f = open("test_data.csv", "r")

blocs_by_node = {}
for line in f:
    node1, node2, weight1 = line.split(",")
    if node1 not in blocs_by_node and node2 not in blocs_by_node :
        bloc = [node1, node2]
        blocs_by_node[node1] = bloc
        blocs_by_node[node2] = bloc
    elif node1 not in blocs_by_node and node2 in blocs_by_node :
        bloc = blocs_by_node[node2]
        bloc.append(node1)
        blocs_by_node[node1] = bloc
    elif node1 in blocs_by_node and node2 not in blocs_by_node :
        bloc = blocs_by_node[node1]
        bloc.append(node2)
        blocs_by_node[node2] = bloc
    elif blocs_by_node[node1] is not blocs_by_node[node2] :
        bloc = blocs_by_node[node1]
        for node in blocs_by_node[node2] :
            bloc.append(node)
            blocs_by_node[node] = bloc

f.close()

f = open("test_data.csv", "r")

for line in f:
    node1, node2, weight1 = line.split(",")
    if len(blocs_by_node[node1]) > 2 :
        G.add_edge(node1, node2)

f.close()

nx.draw(G)
plt.show()

I read the file twice; you could refactor the code to read it only once by storing the values in a list.
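One way that refactoring might look, as a rough sketch: the rows are parsed once into a list and reused for both passes. The io.StringIO object stands in for open("test_data.csv") so the snippet is self-contained, and the helper name read_edges and the condensed setdefault-based merge are my own assumptions, not part of the code above.

```python
import csv
import io

# Stand-in for open("test_data.csv", "r"), so the sketch runs without the file.
sample = io.StringIO("a,b,50\nb,c,60\nb,e,25\ne,f,20\nz,n,10\nx,m,25\nv,p,15\n")

def read_edges(handle):
    """Parse the rows once and keep them as (node1, node2, weight) tuples."""
    return [(row[0], row[1], float(row[2])) for row in csv.reader(handle) if row]

edges = read_edges(sample)

# First pass over the list: group connected nodes into shared "bloc" lists.
blocs_by_node = {}
for node1, node2, _ in edges:
    bloc1 = blocs_by_node.setdefault(node1, [node1])
    bloc2 = blocs_by_node.setdefault(node2, [node2])
    if bloc1 is not bloc2:
        bloc1.extend(bloc2)          # merge the two blocs
        for node in bloc2:
            blocs_by_node[node] = bloc1

# Second pass over the same list: keep edges whose bloc has more than 2 nodes.
kept = [(n1, n2) for n1, n2, _ in edges if len(blocs_by_node[n1]) > 2]
print(kept)
```

The kept edges can then be passed to G.add_edges_from(kept) before drawing.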

By the way, I would expect the solution for this example to contain:

[('a', 'b'), ('b', 'c'), ('b', 'e'), ('e', 'f')]
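For comparison, the node groups built by hand above correspond to what NetworkX calls weakly connected components, so the same filtering can probably be done through the library. This is a sketch of that different technique rather than the approach of this answer; it assumes NetworkX 2.0 or later, and the function name keep_large_components and the min_nodes parameter are hypothetical.

```python
import networkx as nx

def keep_large_components(G, min_nodes=3):
    """Return a copy of G restricted to weakly connected components
    that contain at least min_nodes nodes."""
    keep = set()
    for component in nx.weakly_connected_components(G):
        if len(component) >= min_nodes:
            keep |= component
    return G.subgraph(keep).copy()

G = nx.DiGraph()
G.add_edges_from([("a", "b"), ("b", "c"), ("b", "e"), ("e", "f"),
                  ("z", "n"), ("x", "m"), ("v", "p")])

filtered = keep_large_components(G)
filtered_edges = sorted(filtered.edges())
print(filtered_edges)
```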

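Answer 2: (score: 1)

You can use NetworkX's Bellman-Ford routine (nx.bellman_ford in NetworkX 1.x, nx.bellman_ford_predecessor_and_distance in 2.0 and later) to find paths longer than a given minimum. For this you need a directed graph (or graph) G whose weights are all set to -1.

The following code is based on this thread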

import networkx as nx
import matplotlib.pyplot as plt

data = (('a','b',50), ('b','c',60), ('b','e',25),
        ('e','f',20), ('z','n',10), ('x','m',25),
        ('v','p',15))

G = nx.DiGraph()
for node1, node2, weight1 in data:
    G.add_edge(node1, node2, weight=-1)

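min_length = 2
F = nx.DiGraph()   # filtered graph

# Run Bellman-Ford from every node. nx.bellman_ford (NetworkX 1.x) is called
# nx.bellman_ford_predecessor_and_distance in NetworkX 2.0 and later.
for source in G.nodes():
    preds, distances = nx.bellman_ford_predecessor_and_distance(G, source)
    # With all weights at -1, the smallest distance is minus the longest path length.
    if min(distances.values()) < -min_length:
        for node, parents in preds.items():
            for parent in parents:
                if parent is not None:  # the source has no real predecessor
                    F.add_edge(parent, node)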

nx.draw(F)
plt.show()

This produces the only graph that meets the requirement: the tree with edges (a,b), (b,c), (b,e) and (e,f).

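Note that this is a simplification of a general method for finding the graphs with the longest paths (in terms of distance). So if you build the graph with the actual weights, you can apply Bellman-Ford after flipping the sign of the weights: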

G = nx.DiGraph()
for node1, node2, weight1 in data:
    G.add_edge(node1, node2, weight=weight1)

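min_length = 100

H = nx.DiGraph(G)  # intermediate graph (a copy of G)
# change sign of weights
for u, v in H.edges():
    H[u][v]['weight'] *= -1

# Run Bellman-Ford on H from every node. nx.bellman_ford (NetworkX 1.x) is
# called nx.bellman_ford_predecessor_and_distance in NetworkX 2.0 and later.
for source in G.nodes():
    preds, distances = nx.bellman_ford_predecessor_and_distance(H, source)
    if min(distances.values()) < -min_length:
        pass  #--- whatever ----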
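To make the sign flip concrete, here is a small sketch that reports, for each start node of the sample data, the total weight of its heaviest outgoing path. It assumes NetworkX 2.0 or later; the helper name longest_path_weight and the variable heavy_sources are illustrative only, and the sketch does not try to fill in the step left open above.

```python
import networkx as nx

data = (("a", "b", 50), ("b", "c", 60), ("b", "e", 25),
        ("e", "f", 20), ("z", "n", 10), ("x", "m", 25),
        ("v", "p", 15))

# Graph with negated weights: a shortest path here is a heaviest path in the
# original data. This is safe because the sample graph has no cycles.
H = nx.DiGraph()
H.add_weighted_edges_from((u, v, -w) for u, v, w in data)

def longest_path_weight(H, source):
    """Total weight of the heaviest path starting at source (0 if none)."""
    _, dist = nx.bellman_ford_predecessor_and_distance(H, source)
    return -min(dist.values())

min_length = 100
heavy_sources = sorted(n for n in H.nodes()
                       if longest_path_weight(H, n) > min_length)
print(heavy_sources)
```

With the sample weights, only the path a, b, c (50 + 60) exceeds the threshold of 100, so a is the only start node that qualifies.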