Question

I have data in the following format:

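The first column is node_from, the second is node_to, the third is the weight (1 by default), and the last is a timestamp.

My question is how to compute the weight from the number of links between two nodes. For example, in the rows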

  9 22 1 1082418256   
  5 21 1 1082434689  
  26 7 1 1082448725  
  27 28 1 1082457840  
  29 25 1 1082471683  
  30 31 1 1082485106  
  30 31 1 1082485111  
  30 31 1 1082485113  
  30 31 1 1082485116  
  32 33 1 1082485623  
  34 35 1 1082493130

the edge between nodes 30 and 31 should have a weight of 4, because those two nodes have been connected 4 times.

Thanks in advance! The data comes from a file of network links.

Answer 1

You can build the graph incrementally and simply add each row's weight to the corresponding edge, for example:

In []
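import networkx as nx

path = 'edges.txt'  # placeholder filename; point this at your edge-list file

G = nx.Graph()
with open(path) as file:
    for line in file:
        if not line.strip():
            continue  # skip blank lines
        e1, e2, weight, timestamp = line.split()
        # Create the edge if needed, then add this row's weight to the running total
        G.add_edge(e1, e2)
        G[e1][e2]['weight'] = G[e1][e2].get('weight', 0) + int(weight)

nx.to_dict_of_dicts(G)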

Out[]:
{'9': {'22': {'weight': 1}},
 '22': {'9': {'weight': 1}},
 '5': {'21': {'weight': 1}},
 '21': {'5': {'weight': 1}},
 '26': {'7': {'weight': 1}},
 '7': {'26': {'weight': 1}},
 '27': {'28': {'weight': 1}},
 '28': {'27': {'weight': 1}},
 '29': {'25': {'weight': 1}},
 '25': {'29': {'weight': 1}},
 '30': {'31': {'weight': 4}},
 '31': {'30': {'weight': 4}},
 '32': {'33': {'weight': 1}},
 '33': {'32': {'weight': 1}},
 '34': {'35': {'weight': 1}},
 '35': {'34': {'weight': 1}}}
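
As a variation on the loop above, the weight can also be taken as the plain number of rows per node pair, ignoring the weight column altogether. The following is only a minimal sketch under some assumptions: the rows are a subset of the question's data held in a string, and the names `sample` and `counts` are illustrative, not part of the original answer. The endpoints are sorted so that a link written in either direction counts toward the same undirected edge.

```python
from collections import Counter

import networkx as nx

# Sample rows copied from the question: node_from node_to weight timestamp
sample = """\
29 25 1 1082471683
30 31 1 1082485106
30 31 1 1082485111
30 31 1 1082485113
30 31 1 1082485116
32 33 1 1082485623
"""

# Count how many rows link each pair of nodes, ignoring the weight column
counts = Counter()
for line in sample.splitlines():
    node_from, node_to, _weight, _timestamp = line.split()
    # Sort the endpoints so (a, b) and (b, a) share one key
    counts[tuple(sorted((node_from, node_to)))] += 1

G = nx.Graph()
# Each (u, v, w) triple becomes an edge with attribute weight=w
G.add_weighted_edges_from((u, v, w) for (u, v), w in counts.items())

print(G["30"]["31"]["weight"])  # 4
```

Counting rows and summing the weight column give the same result here only because every row carries the default weight of 1.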

If you are willing to use another library, you can build the edge list in pandas and convert it to a graph:

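import networkx as nx
import pandas as pd

cols = ['source', 'target', 'weight', 'timestamp']
# 'edges.txt' is a placeholder filename; sep=r'\s+' tolerates repeated or padding spaces
df = pd.read_csv('edges.txt', sep=r'\s+', header=None, names=cols).drop('timestamp', axis=1)
# Sum the weights of repeated (source, target) rows, then build the graph
edges = df.groupby(['source', 'target'], as_index=False)['weight'].sum()
G = nx.from_pandas_edgelist(edges, edge_attr=True)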
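
One caveat with the pandas route is that grouping by (source, target) treats a link written as 31 30 as a different group from 30 31, even though an undirected `nx.Graph` stores them as one edge. The sketch below shows one possible way to handle that, assuming integer node ids; the sample rows are the question's 30-31 links with one row deliberately reversed for illustration, and the names `sample`, `lo`, and `hi` are hypothetical.

```python
import io

import networkx as nx
import pandas as pd

# Question's 30-31 rows, with the second row reversed on purpose (illustrative only)
sample = """\
30 31 1 1082485106
31 30 1 1082485111
30 31 1 1082485113
30 31 1 1082485116
32 33 1 1082485623
"""

cols = ["source", "target", "weight", "timestamp"]
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, names=cols)

# Put the smaller node id first so (31, 30) and (30, 31) fall in the same group
lo = df[["source", "target"]].min(axis=1)
hi = df[["source", "target"]].max(axis=1)
df = df.assign(source=lo, target=hi)

# Sum the weights per undirected pair, then build the graph
edges = df.groupby(["source", "target"], as_index=False)["weight"].sum()
G = nx.from_pandas_edgelist(edges, edge_attr="weight")

print(G[30][31]["weight"])  # 4
```

If the links are meant to be directed, this normalization step should be skipped and `create_using=nx.DiGraph` passed to `nx.from_pandas_edgelist` instead.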

Networkx Python: calculating weights