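I have two files containing lists of organisms. The first file contains a list of 'Family Genus' pairs, so it has two columns. The second file contains 'Genus species' pairs, also in two columns. Every genus of the listed species appears in both files. I want to merge the two lists on the genus so that I can add the family name to each 'Genus species' entry. The output should therefore contain 'Family Genus species'. Since the names are separated by a single space, I split each line into columns on that space. Here is my code so far: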
with open('FAMILY_GENUS.TXT') as f1, open('GENUS_SPECIES.TXT') as f2:
    for line1 in f1:
        line1 = line1.strip()
        c1 = line1.split(' ')
        print(line1, end=' ')
        for line2 in f2:
            line2 = line2.strip()
            c2 = line2.split(' ')
            if line1[1] == line2[0]:
                print(line2[1], end=' ')
        print()
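The resulting output consists of only two lines instead of the full set of records. What am I missing?
Also, how can I save the result to a file instead of just printing it to the screen?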
Answer 0 (score: 3)
Here is another solution.
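genera = {}
# Map each genus to its family ('fg' holds the 'Family Genus' lines).
with open('fg', 'r') as f1:
    for line in f1:
        family, genus = line.strip().split(" ")
        genera[genus] = family
# Look up the family for each 'Genus species' line in 'gs' and print all three names.
with open('gs', 'r') as f2:
    for line in f2:
        genus, species = line.strip().split(" ")
        print(genera[genus], genus, species)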
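One likely reason the nested loops in the question stop producing matches early: a file object is an iterator, so once the inner loop has read `f2` to the end, later passes over it yield nothing unless the file is rewound with `seek(0)`. The question's code also compares `line1[1]` with `line2[0]`, which are single characters of the stripped strings rather than the split columns `c1[1]` and `c2[0]`. The sketch below only illustrates the exhaustion behaviour; it uses `io.StringIO` as a stand-in for the real file, and the sample names are made up.

```python
import io

# Stand-in for GENUS_SPECIES.TXT; the contents are illustrative only.
f2 = io.StringIO("Panthera leo\nCanis lupus\n")

first_pass = [line.strip() for line in f2]   # reads the stream to its end
second_pass = [line.strip() for line in f2]  # iterator is exhausted, so this is empty

f2.seek(0)  # rewind to the beginning of the stream
third_pass = [line.strip() for line in f2]   # all lines are available again

print(first_pass)
print(second_pass)
print(third_pass)
```

Building the dictionary once, as above, avoids re-reading the second file for every line of the first.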
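Answer 1 (score: 0)
I would process the files first to build a mapping from each genus to its family and to the species it may contain. Then I would use that mapping to match them up and print them.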
genuses = {}
# Map every genus to its family.
with open('FAMILY_GENUS.TXT') as f1:
    for line in f1:
        family, genus = line.strip().split()
        genuses.setdefault(genus, {})['family'] = family
# Map every species to its genus.
with open('GENUS_SPECIES.TXT') as f2:
    for line in f2:
        genus, species = line.strip().split()
        genuses.setdefault(genus, {}).setdefault('species', []).append(species)
# Go through each genus and build one 'Family Genus species' string
# for each species it contains.
species_strings = []
for genus, d in genuses.items():
    family = d.get('family')
    species = d.get('species')
    if family and species:
        for specie in species:
            s = '{0} {1} {2}'.format(family, genus, specie)
            species_strings.append(s)
# Sort the strings to make the output pretty and print them out.
species_strings.sort()
for s in species_strings:
    print(s)
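To save the result rather than print it to the screen, the strings can be written to a file opened in write mode. This is a minimal sketch; the helper name `write_lines` is hypothetical and does not come from the question.

```python
def write_lines(lines, path):
    """Write each string in `lines` to `path`, one per line, and return the count."""
    count = 0
    # Mode 'w' creates the file, or overwrites it if it already exists.
    with open(path, 'w') as out:
        for s in lines:
            # print() appends the newline; file=out redirects it away from the screen.
            print(s, file=out)
            count += 1
    return count
```

With the list built above, a call such as `write_lines(species_strings, 'FAMILY_GENUS_SPECIES.TXT')` would write one 'Family Genus species' record per line; the output file name here is only an example.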