I have edge-list data in a file; sample rows are shown further down.
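The first column is node_from, the second is node_to, the third is the weight (1 by default), and the last is a timestamp.
My question is how to calculate the weight from the number of links between two nodes. For example, take these rows: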
9 22 1 1082418256
5 21 1 1082434689
26 7 1 1082448725
27 28 1 1082457840
29 25 1 1082471683
30 31 1 1082485106
30 31 1 1082485111
30 31 1 1082485113
30 31 1 1082485116
32 33 1 1082485623
34 35 1 1082493130
Here the edge between nodes 30 and 31 should have weight 4, because these two nodes are linked four times.
Thanks in advance! The data is a list of network links stored in a file in the format above.
Answer 0 (score: 1)
You can build the graph incrementally and simply add each row's weight onto the edge, for example:
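In []:
import networkx as nx

G = nx.Graph()
# <file> is a placeholder for the path to the edge-list file
with open(<file>) as file:
    for line in file:
        e1, e2, weight, timestamp = line.strip().split()
        G.add_edge(e1, e2)  # creates the edge if missing; existing attributes are kept
        # accumulate the weight every time the same pair appears
        G[e1][e2]['weight'] = G[e1][e2].get('weight', 0) + int(weight)

nx.to_dict_of_dicts(G)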
Out[]:
{'9': {'22': {'weight': 1}},
'22': {'9': {'weight': 1}},
'5': {'21': {'weight': 1}},
'21': {'5': {'weight': 1}},
'26': {'7': {'weight': 1}},
'7': {'26': {'weight': 1}},
'27': {'28': {'weight': 1}},
'28': {'27': {'weight': 1}},
'29': {'25': {'weight': 1}},
'25': {'29': {'weight': 1}},
'30': {'31': {'weight': 4}},
'31': {'30': {'weight': 4}},
'32': {'33': {'weight': 1}},
'33': {'32': {'weight': 1}},
'34': {'35': {'weight': 1}},
'35': {'34': {'weight': 1}}}
If you are willing to use another library, you can build the edge list in pandas and convert it to a graph:
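import networkx as nx
import pandas as pd

cols = ['source', 'target', 'weight', 'timestamp']
# <file> is a placeholder for the path to the edge-list file
with open(<file>) as file:
    df = pd.read_csv(file, sep=' ', header=None, names=cols).drop('timestamp', axis=1)

# sum the weights of repeated (source, target) rows, then build the graph
edges = df.groupby(['source', 'target'], as_index=False)['weight'].sum()
G = nx.from_pandas_edgelist(edges, edge_attr=True)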
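If you only need the counts themselves, the same aggregation can be sketched with the standard library alone. This is a minimal sketch, not part of the original answer: the helper name `count_edge_weights`, its `directed` flag, and the inline `SAMPLE` string (a few rows copied from the question) are assumptions made for illustration. For the undirected case it treats `30 31` and `31 30` as the same edge by sorting each pair.

```python
from collections import Counter
import io

# A few sample rows copied from the question; in practice, pass the opened file instead.
SAMPLE = """\
9 22 1 1082418256
5 21 1 1082434689
30 31 1 1082485106
30 31 1 1082485111
30 31 1 1082485113
30 31 1 1082485116
32 33 1 1082485623
"""


def count_edge_weights(lines, directed=False):
    """Sum the weight column for every node pair (hypothetical helper)."""
    weights = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        src, dst, weight, _timestamp = parts
        # Undirected: (a, b) and (b, a) are one edge, so use a canonical order.
        # Node labels are strings, so the sort is lexicographic; it only
        # serves to make the key independent of the row's direction.
        key = (src, dst) if directed else tuple(sorted((src, dst)))
        weights[key] += int(weight)
    return weights


weights = count_edge_weights(io.StringIO(SAMPLE))
print(weights[("30", "31")])  # 4
```

The resulting counter can then be loaded into a graph, for instance with `G.add_weighted_edges_from((u, v, w) for (u, v), w in weights.items())`.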