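Python / NetworkX: Add weights to edges by frequency of edge occurrence

Asked: 2017-04-26 20:45:43

Tags: python graph networkx

I have created a MultiDiGraph in networkx. I am trying to add weights to its edges and then assign each edge a new weight based on how often it occurs (its frequency/count). I used the following code to create the graph and add the weights, but I'm not sure how to reassign the weights based on count: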

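import ast
import csv

import networkx as nx
import pandas as pd

g = nx.MultiDiGraph()

# Raw strings keep the backslashes in the Windows paths literal.
df = pd.read_csv(r'G:\cluster_centroids.csv', delimiter=',')
df['pos'] = list(zip(df.longitude, df.latitude))
dict_pos = dict(zip(df.cluster_label, df.pos))

with open(r'G:\edges.csv', 'r') as f:
    for row in csv.reader(f):
        if '[' in row[1]:  # skip the header row, which has no edge list
            # literal_eval parses the list literal without executing code
            g.add_edges_from(ast.literal_eval(row[1]))

# Give every edge an initial weight of 1, then print the edges.
for u, v, d in g.edges(data=True):
    d['weight'] = 1
for u, v, d in g.edges(data=True):
    print(u, v, d)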

Edit

I was able to assign a weight to each edge, which was the first part of my original question, with the following:

for u, v, d in g.edges(data=True):
    d['weight'] = 1
for u, v, d in g.edges(data=True):
    print(u, v, d)

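However, I am still unable to reassign the weights based on the number of times an edge occurs (a single edge in my graph can occur multiple times). I need this so that edges with a higher count are drawn differently from edges with a lower count, using edge color or width. I'm not sure how to reassign the weights based on count; please advise. Below are sample data and links to my full data set.

Data

Sample centroids (nodes):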

cluster_label,latitude,longitude
0,39.18193382,-77.51885109
1,39.18,-77.27
2,39.17917928,-76.6688633
3,39.1782,-77.2617
4,39.1765,-77.1927
5,39.1762375,-76.8675441
6,39.17468,-76.8204499
7,39.17457332,-77.2807235
8,39.17406072,-77.274685
9,39.1731621,-77.2716502
10,39.17,-77.27

Sample edges:

user_id,edges
11011,"[[340, 269], [269, 340]]"
80973,"[[398, 279]]"
608473,"[[69, 28]]"
2139671,"[[382, 27], [27, 285]]"
3945641,"[[120, 422], [422, 217], [217, 340], [340, 340]]"
5820642,"[[458, 442]]"
6060732,"[[291, 431]]"
6912362,"[[68, 27]]"
7362602,"[[112, 269]]"
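
For rows shaped like the sample above, counting how often each directed edge occurs can be sketched with the standard library alone. This is only an illustration: `sample_csv` and `edge_counts` are names chosen here, and the last row (user_id 0) is made up so that one edge repeats.

```python
import ast
import csv
import io
from collections import Counter

# Rows in the same format as the sample edges file. The last row
# (user_id 0) is hypothetical, added so that edge (27, 285) repeats.
sample_csv = '''user_id,edges
11011,"[[340, 269], [269, 340]]"
2139671,"[[382, 27], [27, 285]]"
0,"[[27, 285]]"
'''

edge_counts = Counter()
for row in csv.reader(io.StringIO(sample_csv)):
    if '[' in row[1]:  # skip the header row
        # literal_eval parses the list literal without executing code;
        # each [u, v] list becomes a hashable (u, v) tuple for counting
        edge_counts.update(tuple(edge) for edge in ast.literal_eval(row[1]))

print(edge_counts[(27, 285)])   # 2
print(edge_counts[(340, 269)])  # 1
```

The counts are keyed by (source, target), so [340, 269] and [269, 340] are counted as different directed edges.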

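Full data

Centroids (nodes): https://drive.google.com/open?id=0B1lvsCnLWydEdldYc3FQTmdQMmc

Edges: https://drive.google.com/open?id=0B1lvsCnLWydEdEtfM2E3eXViYkk

Update

By setting a minLineWidth and multiplying it by the weight, I was able to fix, at least temporarily, the badly disproportionate edge widths caused by very large edge weights: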

minLineWidth = 0.25

# c is the Counter of edge frequencies from the answer below
for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v] * minLineWidth
edges,weights = zip(*nx.get_edge_attributes(g,'weight').items())

and by passing width=[d['weight'] for u,v, d in g.edges(data=True)] to nx.draw_networkx_edges(), as shown in the solution below.
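
If multiplying by minLineWidth still leaves the most frequent edges too thick, one alternative is to map the counts linearly onto a fixed width range. The sketch below uses min-max scaling, which is a different technique from the multiplication above; `scale_widths` and `max_width` are hypothetical names and defaults chosen for illustration.

```python
def scale_widths(counts, min_width=0.25, max_width=5.0):
    """Map raw edge counts linearly onto [min_width, max_width]."""
    lo, hi = min(counts), max(counts)
    if hi == lo:  # every edge is equally frequent: use the minimum width
        return [min_width for _ in counts]
    span = max_width - min_width
    return [min_width + span * (c - lo) / (hi - lo) for c in counts]


# The least frequent edge gets min_width, the most frequent max_width.
widths = scale_widths([1, 1, 2, 5])
print(widths)
```

The resulting list can be passed as the width argument in the same way as the weights above, provided the counts are listed in the same order as g.edges().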

In addition, I was able to scale the colors using the following:

import matplotlib.cm as cmx
import matplotlib.colors as colors
import matplotlib.pyplot as plt

# Set edge colors: one color per edge, spread across the YlOrRd colormap.
# 7958 is the number of edges in the graph; use print(len(g.edges())) to determine this.
values = range(7958)
cmap = plt.get_cmap('YlOrRd')
cNorm = colors.Normalize(vmin=0, vmax=values[-1])
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=cmap)
colorList = []

for i in range(7958):
    colorVal = scalarMap.to_rgba(values[i])
    colorList.append(colorVal)

and then passing edge_color=colorList to nx.draw_networkx_edges().


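1 Answer:

Answer 0 (score: 4)

Try this on for size.

Note: I added a duplicate of an existing edge, just to show the behavior when there are repeats in your multigraph.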

from collections import Counter
c = Counter(g.edges())  # Contains frequencies of each directed edge.

for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v]

print(list(g.edges(data=True)))
#[(340, 269, {'weight': 1}),
# (340, 340, {'weight': 1}),
# (269, 340, {'weight': 1}),
# (398, 279, {'weight': 1}),
# (69, 28, {'weight': 1}),
# (382, 27, {'weight': 1}),
# (27, 285, {'weight': 2}),
# (27, 285, {'weight': 2}),
# (120, 422, {'weight': 1}),
# (422, 217, {'weight': 1}),
# (217, 340, {'weight': 1}),
# (458, 442, {'weight': 1}),
# (291, 431, {'weight': 1}),
# (68, 27, {'weight': 1}),
# (112, 269, {'weight': 1})]

Edit: To display the graph with the edge weights as line thickness, use this:

nx.draw_networkx(g, width=[d['weight'] for _, _, d in g.edges(data=True)])
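
In a multigraph every parallel edge is drawn separately, so repeated edges are drawn on top of each other. If one line per distinct edge is preferred, a possible variation (not part of the answer above) is to collapse the MultiDiGraph into a DiGraph whose weight holds the count. The toy edges and the name `h` below are illustrative assumptions.

```python
from collections import Counter

import networkx as nx

# Toy multigraph with one repeated edge, mirroring the duplicate above.
g = nx.MultiDiGraph()
g.add_edges_from([(382, 27), (27, 285), (27, 285), (340, 340)])

# g.edges() yields one (u, v) pair per parallel edge, so Counter
# gives the number of times each directed edge occurs.
counts = Counter(g.edges())

# Build a simple directed graph with one edge per distinct (u, v).
h = nx.DiGraph()
h.add_nodes_from(g.nodes(data=True))  # keep any node attributes
for (u, v), n in counts.items():
    h.add_edge(u, v, weight=n)

# One width per distinct edge, in the order h.edges() reports them.
widths = [d['weight'] for _, _, d in h.edges(data=True)]
print(h[27][285]['weight'])  # 2
```

The `widths` list can then be passed to nx.draw_networkx(h, width=widths) in the same way as in the answer.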