I have an unsorted parent-child hierarchy file (tab-separated) in the following format:
City1 Area1
City1 Area2
Continent1 Country1
Continent2 Country2
Continent3 Country3
Continent4 Country4
Continents Continent1
Continents Continent2
Continents Continent3
Continents Continent4
Country1 State1
Country2 State2
Country3 State3
Earth Continents
State1 City1
State1 City1.1
State2 City2
My goal is to find all the "descendants" and "ancestors" of every member.
Here is what I have written so far:
import sys, re

with open("input.txt", "r") as my_in:
    collections={}
    for line in my_in:
        parent, child=line.rstrip('\r\n').split('\t')
        collections.setdefault(parent, []).append(child)

print (collections)
'''
{'Continent4': ['Country4'], 'Continent2': ['Country2'],
'Continents': ['Continent1', 'Continent2', 'Continent3', 'Continent4'],
'Continent1': ['Country1'], 'Country2': ['State2'],
'Country3': ['State3'], 'State1': ['City1', 'City1.1'],
'Country1': ['State1'], 'State2': ['City2'],
'Earth': ['Continents'], 'City1': ['Area1', 'Area2'], 'Continent3': ['Country3']}
'''
def find_descendants(parent, collections):
    descendants = []
    for descendant in collections[parent]:
        if descendant in collections:
            descendants = descendants + find_descendants(descendant, collections)
        else:
            descendants.append(descendant)
    return descendants
# Get descendants of "Continent1":
lis=find_descendants("Continent1", collections)
print (lis) # It shows ['Area1', 'Area2', 'City1.1']
# Actually it should show ['Country1', 'State1', 'City1', 'Area1', 'Area2', 'City1.1']
def find_ancestors(child, collections):
    # pseudo code
    # link child to its parent and parent to its parent until no more parents are found
    pass
# lis=find_ancestors("City1.1", collections)
# should show ['Earth', 'Continents', 'Continent1', 'Country1', 'State1']
The function find_descendants does not work as expected. As for find_ancestors, I know the pseudocode but cannot express it in Python.
Please help.
Answer 0 (score: 1)
As I said in my comment, you forgot to append the descendant itself before looking deeper into the collection. This works:
def find_descendants(parent, collections):
    descendants = []
    for descendant in collections[parent]:
        descendants.append(descendant)
        if descendant in collections:
            descendants = descendants + find_descendants(descendant, collections)
    return descendants
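For the ancestors, just build another dictionary like collections, say ancestors_collection, that stores the reverse relation (child to parent). The function that finds ancestors is then exactly the same as find_descendants, and you can rename it accordingly.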
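As a minimal sketch of that reverse mapping, the snippet below inverts an existing parent-to-children dictionary. The sample data is a small hand-copied slice of the question's hierarchy, used here only for illustration; ancestors_collection is the name suggested above.

```python
# Illustrative slice of the parent -> children mapping from the question.
collections = {
    "Country1": ["State1"],
    "State1": ["City1", "City1.1"],
    "City1": ["Area1", "Area2"],
}

# Invert it: each child maps to the list of its direct parents.
ancestors_collection = {}
for parent, children in collections.items():
    for child in children:
        ancestors_collection.setdefault(child, []).append(parent)

print(ancestors_collection["City1.1"])  # ['State1']
```

Elements that have no parent in the data (Country1 in this slice) simply get no key in ancestors_collection.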
Edit:
Here is complete working code; I use relative to refer to either an ancestor or a descendant:
import sys, re

with open("input.txt", "r") as my_in:
    descendants={}
    ancestors={}
    for line in my_in:
        parent, child=line.rstrip('\r\n').split('\t')
        descendants.setdefault(parent, []).append(child)
        ancestors.setdefault(child, []).append(parent)

def get_relatives(element, collection):
    relatives = []
    for relative in collection[element]:
        relatives.append(relative)
        if relative in collection:
            relatives = relatives + get_relatives(relative, collection)
    return relatives
# Get descendants of "Continent1":
lis=get_relatives("Continent1", descendants)
print (lis)
# shows ['Country1', 'State1', 'City1', 'Area1', 'Area2', 'City1.1']
lis=get_relatives("City1.1", ancestors)
print (lis)
# shows ['State1', 'Country1', 'Continent1', 'Continents', 'Earth']
# (nearest ancestor first; use lis[::-1] for the root-first order asked for in the question)
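One caveat with get_relatives as written: collection[element] raises KeyError when the starting element has no entry, for example a leaf such as Area1 looked up in descendants. It also walks upward from the child, so ancestors come out nearest-first. The sketch below is one possible variant, not part of the original answer. The name get_relatives_safe and the inline edge list (a subset of the question's data, so no input file is needed) are assumptions made for illustration, and it assumes the hierarchy contains no cycles.

```python
def get_relatives_safe(element, collection):
    """Return all relatives of element, depth-first, nearest first."""
    relatives = []
    # .get() returns an empty list for leaves/roots instead of raising KeyError
    for relative in collection.get(element, []):
        relatives.append(relative)
        relatives.extend(get_relatives_safe(relative, collection))
    return relatives

# Subset of the question's parent/child pairs (illustrative only).
edges = [
    ("Earth", "Continents"), ("Continents", "Continent1"),
    ("Continent1", "Country1"), ("Country1", "State1"),
    ("State1", "City1"), ("State1", "City1.1"),
    ("City1", "Area1"), ("City1", "Area2"),
]
descendants, ancestors = {}, {}
for parent, child in edges:
    descendants.setdefault(parent, []).append(child)
    ancestors.setdefault(child, []).append(parent)

print(get_relatives_safe("Continent1", descendants))
# ['Country1', 'State1', 'City1', 'Area1', 'Area2', 'City1.1']
print(get_relatives_safe("City1.1", ancestors)[::-1])  # reversed: root first
# ['Earth', 'Continents', 'Continent1', 'Country1', 'State1']
print(get_relatives_safe("Area1", descendants))  # leaf: []
```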
Answer 1 (score: 0)
Here's a simpler solution that uses networkx:
import networkx as nx
coll = nx.DiGraph()
with open("input.txt") as f:
    for line in map(str.strip, f):
        ancestor, descendant = line.split("\t")
        coll.add_edge(ancestor, descendant)
print(nx.descendants(coll, "Continent1"))
# {'Area2', 'City1.1', 'Area1', 'City1', 'State1', 'Country1'}
print(nx.ancestors(coll, "City1.1"))
# {'Earth', 'Continent1', 'State1', 'Continents', 'Country1'}
Both functions return a set, so the ancestors and descendants are not ordered.