I want to parse a large amount of DiGraph data and run some logical tests on it. The data looks like this:
Source,Target,DateTime
a,b,201212100401
a,d,201212100403
b,e,201212100511
b,c,201212100518
e,f,201212100610
c,a,201212100720
DateTime is in the format YYYYMMDDhhmm ...
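As a side note, stamps in this layout can be turned into real date objects with the standard library. This is only a minimal sketch: `parse_stamp` is a hypothetical helper name, not something from the original post.

```python
from datetime import datetime


def parse_stamp(stamp):
    """Parse a YYYYMMDDhhmm string into a datetime (hypothetical helper)."""
    return datetime.strptime(stamp.strip(), '%Y%m%d%H%M')


first = parse_stamp('201212100401')
second = parse_stamp('201212100518')
print(second - first)  # 1:17:00
```

Because the stamps are fixed-width and ordered from year down to minute, the raw strings also sort chronologically, so parsing is only needed when actual time differences matter.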
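There is some logic I am looking for, such as: find instances where A talks to C, but not before (A talks to B) and (B talks to C). The printout would then look like: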
Time 1| Time 2| Time 3
a,b,201212100401| b,c,201212100518| c,a,201212100720
I think I can treat these as network objects with something like the following:
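import networkx as nx
import sys

G = nx.DiGraph()
with open(sys.argv[1]) as f:
    next(f)  # skip the header row, assuming the file starts with one as shown above
    for line in f:
        # strip() drops the trailing newline that would otherwise end up in t1
        n1, n2, t1 = line.strip().split(',')
        G.add_edge(n1, n2, time=t1)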
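Now that the data is stored in G, I am not sure how to query for the A,B then B,C then C,A relationship.
Does anyone have any suggestions?
Answer 0 (score: 1)
Here is one approach: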
import networkx as nx

data = '''a,b,201212100401
a,d,201212100403
b,e,201212100511
b,c,201212100518
e,f,201212100610
c,a,201212100720'''.split('\n')

G = nx.DiGraph()
for line in data:
    n1, n2, t1 = line.split(',')
    G.add_edge(n1, n2, time=t1)

def check_sequence(list_of_edges):
    times = []
    # First check if all the edges are in the graph
    # and collect their times in a list
    for e in list_of_edges:
        if G.has_edge(*e):
            times.append(G[e[0]][e[1]]['time'])
        else:
            return "Edge {} not in the graph.".format(str(e))
    # Next check if each successive time in the list
    # is greater than the previous time
    # (fixed-width YYYYMMDDhhmm strings compare in chronological order)
    start = times[0]
    for time in times[1:]:
        if time > start:
            start = time
        else:
            return 'Edges not in sequence: {}'.format(str(times))
    # If we have not returned up to now, then we are in sequence
    return 'Edges are in sequence: {}'.format(str(times))

print(check_sequence([('a', 'e'), ('e', 'f'), ('a', 'f')]))
# Edge ('a', 'e') not in the graph.
print(check_sequence([('a', 'b'), ('b', 'c'), ('c', 'a')]))
# Edges are in sequence: ['201212100401', '201212100518', '201212100720']
print(check_sequence([('c', 'a'), ('a', 'b'), ('b', 'c')]))
# Edges not in sequence: ['201212100720', '201212100401', '201212100518']
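The check above needs the candidate edges to be supplied by hand. To discover every A,B then B,C then C,A pattern automatically, one possible sketch is below. The name `find_timed_triangles` and the plain dict-of-dicts `adj` are assumptions made for illustration. The function only relies on `graph[u][v]['time']` mapping access, which a networkx DiGraph also supports, so the G built above should work as the argument too.

```python
rows = '''a,b,201212100401
a,d,201212100403
b,e,201212100511
b,c,201212100518
e,f,201212100610
c,a,201212100720'''.split('\n')

# Plain adjacency mapping: adj[source][target] = {'time': stamp}
adj = {}
for row in rows:
    src, dst, stamp = row.strip().split(',')
    adj.setdefault(src, {})[dst] = {'time': stamp}
    adj.setdefault(dst, {})  # make sure sink nodes such as 'f' have an entry


def find_timed_triangles(graph):
    """Return (a->b, b->c, c->a) edge triples whose times strictly increase."""
    found = []
    for a in graph:
        for b in graph[a]:
            for c in graph[b]:
                # Require three distinct nodes and an edge closing back to a
                if len({a, b, c}) < 3 or a not in graph[c]:
                    continue
                t_ab = graph[a][b]['time']
                t_bc = graph[b][c]['time']
                t_ca = graph[c][a]['time']
                # Fixed-width YYYYMMDDhhmm strings sort chronologically
                if t_ab < t_bc < t_ca:
                    found.append(((a, b, t_ab), (b, c, t_bc), (c, a, t_ca)))
    return found


print('Time 1| Time 2| Time 3')
for triple in find_timed_triangles(adj):
    print('| '.join(','.join(edge) for edge in triple))
# a,b,201212100401| b,c,201212100518| c,a,201212100720
```

Note that a DiGraph (like the dict here) keeps only one edge per ordered pair, so if the same two nodes talk more than once, only the last time read survives. Handling repeated contacts would need a different structure, such as networkx's MultiDiGraph.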