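I am trying to iterate over a list in Excel (see below) that contains source ID and target ID values. When the list below is drawn as a graph, it gives a tree-like structure in which each spot is connected to another spot. There is one split event, after which one spot has two target spots. I would like to find a way to collect the connected source and target IDs from this list.
Does anyone know how to approach this problem, or can you give me some hints about a possible solution?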
SPOT_SOURCE_ID SPOT_TARGET_ID
127466 127460
127460 127450
127450 127474
127450 127442
127474 127481
127442 127432
127481 127487
127432 127426
127426 127420
127487 127498
127420 127410
127498 127510
127510 127516
127410 127402
127516 127530
127530 127542
127402 127390
127542 127554
127390 127383
127554 127560
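Answer 0 (score: 1)
Your list of pairs can be treated as the edges of a rooted tree. The standard way to find all paths in such a tree is a pre-order depth-first search, and with a small adjustment the same search recovers the tracks you want. The code below contains a recursive generator `tree_paths`, which yields the complete root-to-leaf paths, and a recursive generator `tree_tracks`, which yields the tracks. `tree_tracks` yields an integer `depth` along with each track list; `depth` is used to print the tracks in a structured, indented way.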
data = '''\
127466 127460
127460 127450
127450 127474
127450 127442
127474 127481
127442 127432
127481 127487
127432 127426
127426 127420
127487 127498
127420 127410
127498 127510
127510 127516
127410 127402
127516 127530
127530 127542
127402 127390
127542 127554
127390 127383
127554 127560
'''.splitlines()
# Convert multiline `data` string to a list of (parent, child) tuples
edges = [tuple(int(u) for u in row.split()) for row in data]

# Each node in a tree can only have one parent. `v` is the parent of `k`
parents = {k: v for v, k in edges}

# Each value in `children` is a list containing the children of the key
children = {}
for u, v in edges:
    children.setdefault(u, []).append(v)

# Recursively generate every path in the tree starting at `node`
# by performing a depth-first search
def tree_paths(node, head):
    newhead = head + [node]
    if node not in children:
        yield newhead
        return
    descendants = children[node]
    for n in descendants:
        yield from tree_paths(n, newhead)

# Recursively generate every track in the tree starting at `node`
# by performing a depth-first search
def tree_tracks(node, head, depth=0):
    newhead = head + [node]
    if node not in children:
        yield newhead, depth
        return
    descendants = children[node]
    if len(descendants) > 1:
        yield newhead, depth
        newhead = []
        depth += 1
    for n in descendants:
        yield from tree_tracks(n, newhead, depth)

# Find the root node.
# Start at any node. If the edges are sorted, `edges[0][0]` will be the root.
k = edges[0][0]
# Loop until we find a node without a parent.
# That node must be the root of the tree
while k in parents:
    k = parents[k]
root = k

print('Paths')
for seq in tree_paths(root, []):
    print(seq)

print('\nTracks')
for seq, depth in tree_tracks(root, []):
    print('{}{}'.format(' ' * 4 * depth, seq))
Output
Paths
[127466, 127460, 127450, 127474, 127481, 127487, 127498, 127510, 127516, 127530, 127542, 127554, 127560]
[127466, 127460, 127450, 127442, 127432, 127426, 127420, 127410, 127402, 127390, 127383]
Tracks
[127466, 127460, 127450]
    [127474, 127481, 127487, 127498, 127510, 127516, 127530, 127542, 127554, 127560]
    [127442, 127432, 127426, 127420, 127410, 127402, 127390, 127383]
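If you already know the root node, you can of course omit the construction of the `parents` dict and the loop that searches for the root node.
If you do not need indented output of the tracks, you can use a simpler version of `tree_tracks` that neither uses nor yields `depth`.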
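As an alternative to walking up through the parents, the root can also be found as the one source ID that never appears as a target. The sketch below illustrates that idea; `find_root` is a hypothetical helper name, and the check for exactly one root is an added assumption that the edge list describes a single tree.

```python
def find_root(edges):
    """Return the single node that is a source but never a target.

    `find_root` is a hypothetical helper, not part of the answer's code.
    """
    sources = {u for u, v in edges}
    targets = {v for u, v in edges}
    roots = sources - targets
    if len(roots) != 1:
        # Zero roots means a cycle; several roots means a forest
        raise ValueError('expected exactly one root, found {}'.format(len(roots)))
    return roots.pop()

# The first few pairs from the question's data
sample_edges = [(127466, 127460), (127460, 127450),
                (127450, 127474), (127450, 127442)]
print(find_root(sample_edges))
```

Unlike the loop over the parents, this does not depend on where the search starts, and it fails loudly if the data contains more than one tree.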
def tree_tracks(node, head):
    newhead = head + [node]
    if node not in children:
        yield newhead
        return
    descendants = children[node]
    if len(descendants) > 1:
        yield newhead
        newhead = []
    for n in descendants:
        yield from tree_tracks(n, newhead)
This code can handle more complex trees. Here is a sample run that adds some extra branches to the tree data.
data = '''\
127466 127460
127460 127450
127450 127474
127450 127442
127474 127481
127442 127432
127481 127487
127432 127426
127426 127420
127487 127498
127420 127410
127498 127510
127510 127516
127410 127402
127516 127530
127530 127542
127402 127390
127542 127554
127390 127383
127554 127560
127510 1
1 2
2 3
3 4
127516 11
11 12
12 13
'''.splitlines()
Output
Paths
[127466, 127460, 127450, 127474, 127481, 127487, 127498, 127510, 127516, 127530, 127542, 127554, 127560]
[127466, 127460, 127450, 127474, 127481, 127487, 127498, 127510, 127516, 11, 12, 13]
[127466, 127460, 127450, 127474, 127481, 127487, 127498, 127510, 1, 2, 3, 4]
[127466, 127460, 127450, 127442, 127432, 127426, 127420, 127410, 127402, 127390, 127383]
Tracks
[127466, 127460, 127450]
    [127474, 127481, 127487, 127498, 127510]
        [127516]
            [127530, 127542, 127554, 127560]
            [11, 12, 13]
        [1, 2, 3, 4]
    [127442, 127432, 127426, 127420, 127410, 127402, 127390, 127383]
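One caveat: the recursive generators recurse once per node, so a very long unbranched track can exceed CPython's default recursion limit (1000 unless changed with `sys.setrecursionlimit`). A minimal sketch of an iterative variant that uses an explicit stack is shown here. The name `iter_tracks` and the small sample tree are made up for illustration; the function takes the `children` dict as an argument and is meant to yield the same `(track, depth)` pairs in the same order.

```python
def iter_tracks(root, children):
    """Yield (track, depth) pairs using an explicit stack instead of recursion."""
    stack = [(root, 0)]  # (first node of a track, depth of that track)
    while stack:
        node, depth = stack.pop()
        track = [node]
        # Follow the chain while the current node has exactly one child
        while len(children.get(node, ())) == 1:
            node = children[node][0]
            track.append(node)
        yield track, depth
        # At a split, each child starts a new track one level deeper.
        # Push in reverse so the first child is processed first.
        for child in reversed(children.get(node, ())):
            stack.append((child, depth + 1))

# A small made-up tree: 1 -> 2 -> 3, then 3 splits into 4 -> 6 and 5 -> 7
sample_edges = [(1, 2), (2, 3), (3, 4), (3, 5), (4, 6), (5, 7)]
sample_children = {}
for u, v in sample_edges:
    sample_children.setdefault(u, []).append(v)

for seq, depth in iter_tracks(1, sample_children):
    print('{}{}'.format(' ' * 4 * depth, seq))
```

The stack only holds the pending branch points, so the length of an individual track no longer matters.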
FWIW, here is the Graphviz DOT file I used to generate the tree diagram of this data.
strict digraph test {
    127466 -> 127460;
    127460 -> 127450;
    127450 -> 127474;
    127450 -> 127442;
    127474 -> 127481;
    127442 -> 127432;
    127481 -> 127487;
    127432 -> 127426;
    127426 -> 127420;
    127487 -> 127498;
    127420 -> 127410;
    127498 -> 127510;
    127510 -> 127516;
    127410 -> 127402;
    127516 -> 127530;
    127530 -> 127542;
    127402 -> 127390;
    127542 -> 127554;
    127390 -> 127383;
    127554 -> 127560;
}
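A DOT file like this does not have to be typed by hand; it can be generated from the list of pairs. The following is a small sketch of one way to do that. `edges_to_dot` is a hypothetical helper, and the graph name `test` is simply copied from the file above.

```python
def edges_to_dot(edges, name='test'):
    """Return Graphviz DOT text for a list of (source, target) pairs."""
    lines = ['strict digraph {} {{'.format(name)]
    lines.extend('    {} -> {};'.format(u, v) for u, v in edges)
    lines.append('}')
    return '\n'.join(lines)

# The first few pairs from the question's data
sample_edges = [(127466, 127460), (127460, 127450),
                (127450, 127474), (127450, 127442)]
print(edges_to_dot(sample_edges))
```

If the text is saved to a file such as `tree.dot`, Graphviz can render it with `dot -Tpng tree.dot -o tree.png`.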