common_ancestor function for a nested tree dictionary in Python

Time: 2017-12-10 06:18:38

Tags: python list dictionary

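I am trying to create a function called "common_ancestor()" that takes two inputs: the first is a list of taxon names as strings, and the second is a phylogenetic tree dictionary. It should return a string giving the name of the taxon that is the closest common ancestor of all the species in the input list. I have already written a separate function called "list_ancestors" that gives me the ancestors of a single element of the list. Below are the dictionary I am working with and my code so far.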

tax_dict = {
    'Pan troglodytes': 'Hominoidea',       'Pongo abelii': 'Hominoidea',
    'Hominoidea': 'Simiiformes',           'Simiiformes': 'Haplorrhini',
    'Tarsius tarsier': 'Tarsiiformes',     'Haplorrhini': 'Primates',
    'Tarsiiformes': 'Haplorrhini',         'Loris tardigradus': 'Lorisidae',
    'Lorisidae': 'Strepsirrhini',          'Strepsirrhini': 'Primates',
    'Allocebus trichotis': 'Lemuriformes', 'Lemuriformes': 'Strepsirrhini',
    'Galago alleni': 'Lorisiformes',       'Lorisiformes': 'Strepsirrhini',
    'Galago moholi': 'Lorisiformes'
}

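import random


def halfroot(tree):
    # Start from a random taxon and follow parent links; once the root is
    # passed, tree.get() returns None, so the list is padded with None.
    taxon = random.choice(list(tree))
    result = [taxon]
    for i in range(0, len(tree)):
        result.append(tree.get(taxon))
        taxon = tree.get(taxon)
    return result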


def root(tree):
    rootlist = halfroot(tree)
    rootlist2 = rootlist[::-1]
    newlist = []
    for e in range(0, len(rootlist)):
        if rootlist2[e] is not None:
            newlist.append(rootlist2[e])
    return newlist[0]


def list_ancestors(taxon, tree):
    result = [taxon]
    while taxon != root(tree):
        result.append(tree.get(taxon))
        taxon = tree.get(taxon)
    return result

def common_ancestor(inputlist, tree):
    biglist1 = []
    for i in range(0, len(inputlist)):
        biglist1.append(list_ancestors(inputlist[i], tree))
    # TODO: continue so that I get three separate lists where I can
    # cross-reference all elements from the first list against every
    # other list to find a common ancestor

The result should look something like this:

  print(common_ancestor(['Hominoidea', 'Pan troglodytes', 'Lorisiformes'], tax_dict))
  Output: 'Primates'

1 answer:

Answer 0 (score: 0)

One approach is to collect all the ancestors of each species, put them in a set, and then intersect those sets to find what they have in common:

def common_ancestor(species_list, tree):
    result = None  # initiate a `None` result
    for species in species_list:  # loop through each species in the species_list
        ancestors = {species}  # initiate the ancestors set with the species itself
        while True:  # keep climbing until a taxon has no parent entry (the root)
            try:
                species = tree[species]  # get the species' ancestor
                ancestors.add(species)  # store it in the ancestors set
            except KeyError:
                break
        # initiate the result or intersect it with ancestors from the previous species
        result = ancestors if result is None else result & ancestors
    # finally, return the ancestor if there is only one in the result, or None
    return result.pop() if result and len(result) == 1 else None

print(common_ancestor(["Hominoidea", "Pan troglodytes", "Lorisiformes"], tax_dict))
# Primates

You can also use the 'middle' part of this function as list_ancestors() - there is no need to complicate things by trying to find the root of the tree:

def list_ancestors(species, tree, include_self=True):
    ancestors = [species] if include_self else []
    while True:
        try:
            species = tree[species]
            ancestors.append(species)
        except KeyError:
            break
    return ancestors
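
Because list_ancestors() returns each lineage ordered from the taxon itself up to the root, it can also be used to pick the nearest shared taxon when several ancestors are shared (the set-based version above returns None in that case, e.g. for two species that are both in Hominoidea). The following is only a sketch under that assumption; nearest_common_ancestor is a hypothetical name, not something from the question:

```python
def list_ancestors(species, tree, include_self=True):
    """Walk parent links from `species` up to the root, nearest first."""
    ancestors = [species] if include_self else []
    while species in tree:
        species = tree[species]
        ancestors.append(species)
    return ancestors


def nearest_common_ancestor(species_list, tree):
    """Hypothetical helper: first taxon shared by every lineage, or None."""
    if not species_list:
        return None
    lineages = [list_ancestors(s, tree) for s in species_list]
    shared = set(lineages[0]).intersection(*lineages[1:])
    # lineages[0] is ordered nearest-first, so the first entry that is
    # shared by all lineages is the closest common ancestor.
    return next((taxon for taxon in lineages[0] if taxon in shared), None)


tax_dict = {
    'Pan troglodytes': 'Hominoidea',       'Pongo abelii': 'Hominoidea',
    'Hominoidea': 'Simiiformes',           'Simiiformes': 'Haplorrhini',
    'Tarsius tarsier': 'Tarsiiformes',     'Haplorrhini': 'Primates',
    'Tarsiiformes': 'Haplorrhini',         'Loris tardigradus': 'Lorisidae',
    'Lorisidae': 'Strepsirrhini',          'Strepsirrhini': 'Primates',
    'Allocebus trichotis': 'Lemuriformes', 'Lemuriformes': 'Strepsirrhini',
    'Galago alleni': 'Lorisiformes',       'Lorisiformes': 'Strepsirrhini',
    'Galago moholi': 'Lorisiformes'
}

print(nearest_common_ancestor(["Pan troglodytes", "Pongo abelii"], tax_dict))
# Hominoidea
print(nearest_common_ancestor(["Hominoidea", "Pan troglodytes", "Lorisiformes"], tax_dict))
# Primates
```

Like the other functions here, this assumes the dictionary has no cycles; a taxon counts as its own ancestor, so passing a single name returns that name.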

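Of course, this all relies on a valid ancestor-tree dictionary - if some ancestor eventually points back to itself, the loops never terminate, and if there is a break in the chain, the results will be wrong. Also, if you are going to do a lot of these operations, it might be worthwhile to turn the flat dictionary into a proper tree.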
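
To make those caveats concrete, here is a rough sketch of how one might guard against cycles, spot a broken chain, and invert the flat child-to-parent dictionary into a parent-to-children mapping. The names safe_ancestors, find_roots and build_children are made up for illustration, and small_tree is just a cut-down sample of the dictionary from the question:

```python
from collections import defaultdict


def safe_ancestors(species, tree):
    """Like list_ancestors, but raises ValueError on a cycle instead of looping forever."""
    ancestors = [species]
    seen = {species}
    while species in tree:
        species = tree[species]
        if species in seen:
            raise ValueError(f"cycle detected at {species!r}")
        seen.add(species)
        ancestors.append(species)
    return ancestors


def find_roots(tree):
    """Taxa that appear as a parent but have no parent entry of their own."""
    return sorted(set(tree.values()) - set(tree))


def build_children(tree):
    """Invert the child -> parent dict into parent -> sorted list of children."""
    children = defaultdict(list)
    for child, parent in tree.items():
        children[parent].append(child)
    return {parent: sorted(kids) for parent, kids in children.items()}


small_tree = {
    "Pan troglodytes": "Hominoidea",
    "Pongo abelii": "Hominoidea",
    "Hominoidea": "Simiiformes",
}

print(find_roots(small_tree))
# ['Simiiformes']
print(build_children(small_tree))
# {'Hominoidea': ['Pan troglodytes', 'Pongo abelii'], 'Simiiformes': ['Hominoidea']}
```

If find_roots() returns more than one name, the chain is probably broken somewhere, since a single tree should have exactly one root. The children mapping makes top-down traversals (for example, listing every species under a taxon) possible without scanning the whole dictionary each time.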