Using the Python module dendropy to calculate pairwise distances between nodes on a phylogenetic tree, with tip labels printed?

Date: 2017-03-28 09:33:34

Tags: python tree phylogeny dendropy

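I am trying to create an array in Python that contains the pairwise distances between every pair of nodes on a phylogenetic tree. I am currently using dendropy to do this. (I initially looked at biopython, but could not find an option for it.) My code so far looks like this: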

import dendropy

tree_data = []
tree = dendropy.Tree.get(path="gonno_microreact_tree.nwk",schema="newick")
pdc = tree.phylogenetic_distance_matrix()
for i, t1 in enumerate(tree.taxon_namespace[:-1]):
    for t2 in tree.taxon_namespace[i+1:]:
        tip_pair = {}
        tip_dist_list = []
        tip_pair[t1] = t2
        distance = pdc(t1, t2)
        tip_dist_list.append(tip_pair)
        tip_dist_list.append(distance)
        tree_data.append(tip_dist_list)
print(tree_data)

This works well, except for the way the tip labels are written. For example, an entry in the tree_data list looks like this:

[{<Taxon 0x7fc4c160b090 'ERS135651'>: <Taxon 0x7fc4c160b150 'ERS135335'>}, 0.0001294946558138355]

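But the tips in the newick file are labelled ERS135651 and ERS135335 respectively. How can I get dendropy to write the array using only the original tip labels, so that this entry looks like this: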

 [{ERS135651:ERS135335}, 0.0001294946558138355]

I have also read the dendropy documentation, and I know it says to use treecalc for this, like so:

pdc = treecalc.PatristicDistanceMatrix(tree)

But I get an error saying that the command does not exist:

AttributeError: 'module' object has no attribute 'PairisticDistanceMatrix'

Any suggestions on how to get this working?

2 Answers:

Answer 0: (score: 0)

Converting the tip label to a string turns it into the name surrounded by quotation marks, e.g.:

t1 = str(t1)
print(t1)

gives:

"'ERS135651'"

So string slicing, dropping the first and last character, can be used to remove the extra quotation marks and turn the tip label back into its proper name.
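The slicing step described in this answer can be sketched as follows. This is a minimal illustration that assumes the string form of a label is exactly the name wrapped in one pair of single quotes, as in the output shown here. The names `quoted`, `label` and `label_stripped` are made up for the example, and a plain string stands in for `str(t1)` so the snippet runs without dendropy.

```python
# Stand-in for str(t1); assumed to be the label wrapped in single quotes.
quoted = "'ERS135651'"

# Drop the first and last character (the surrounding quotes).
label = quoted[1:-1]

# Alternative: strip quote characters from both ends, which also leaves
# an already-unquoted string unchanged.
label_stripped = quoted.strip("'")

print(label)
```

The other answer in this thread reads `taxon.label` directly, which avoids the round trip through `str()` altogether.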

Answer 1: (score: 0)

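This is quite late, but since I ran into a similar problem (trying to determine the distances between taxa/samples based on their labels) and the solution here did not work for my case, I thought I would share this: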

import dendropy as ddp

# read tree
tree = ddp.Tree.get(path="pythonidae.mle.nex",
                    schema="nexus")
# produce tree distance matrix
pdm = tree.phylogenetic_distance_matrix()

distances = {}  # {(sample_1, sample_2): distance, ...}
# extract the weighted patristic distance for each sample pair, keyed by
# the labels of the corresponding taxa
for taxon1 in tree.taxon_namespace:
    for taxon2 in tree.taxon_namespace:
        weighted_patristic_distance = pdm.patristic_distance(taxon1, taxon2)
        distances[(taxon1.label, 
                   taxon2.label)] = weighted_patristic_distance
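The nested loops above visit every pair twice and also pair each taxon with itself. If only the unique pairs are wanted, in the list-of-entries layout the question asks for, one possible sketch uses `itertools.combinations`. The helper name `pairwise_by_label`, the `FakeTaxon` stand-in and the toy distances are hypothetical, introduced only so the example runs without a tree file. The helper merely assumes that each item has a `.label` attribute and that `distance_fn(a, b)` returns a number.

```python
from collections import namedtuple
from itertools import combinations


def pairwise_by_label(taxa, distance_fn):
    """Return [[{label_a: label_b}, distance], ...] for each unique pair."""
    rows = []
    for a, b in combinations(list(taxa), 2):
        rows.append([{a.label: b.label}, distance_fn(a, b)])
    return rows


# Hypothetical stand-ins for dendropy Taxon objects and a distance matrix.
FakeTaxon = namedtuple("FakeTaxon", ["label"])
taxa = [FakeTaxon("A"), FakeTaxon("B"), FakeTaxon("C")]
toy_distances = {("A", "B"): 1.0, ("A", "C"): 2.5, ("B", "C"): 3.0}


def toy_distance(a, b):
    # Look up the made-up distance for a pair of stand-in taxa.
    return toy_distances[(a.label, b.label)]


tree_data = pairwise_by_label(taxa, toy_distance)
print(tree_data)
```

With a real tree, the call would presumably be `pairwise_by_label(tree.taxon_namespace, pdm.patristic_distance)`, reusing the `pdm.patristic_distance` method from the code above.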

I'm new to Stack Overflow, so please bear with me!