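Greetings,
I am trying to find the relative distances between species in the Open Tree of Life (OTOL). I am using the fastDist() function from the phytools R package to count the branches between tips of the tree. However, the function throws an error while looking up ancestors.
Debug info: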
Error in while (currnode != rt) { : argument is of length zero
4 getAncestors(tree, sp1)
3 fastHeight(tree, sp2, sp2)
2 phytools::fastDist(tree, resolved_names_proper[i], resolved_names_proper[j])
1 get_distance(tree, species = c("Abies grandis", "Abies concolor",
"Abies lasiocarpa"))
The offending code is:
phytools:::getAncestors = function (tree, node, type = c("all", "parent"))
{
if (!inherits(tree, "phylo"))
stop("tree should be an object of class \"phylo\".")
type <- type[1]
if (type == "all") {
aa <- vector()
rt <- length(tree$tip.label) + 1
currnode <- node
while (currnode != rt) { #### error here
currnode <- getAncestors(tree, currnode, "parent")
aa <- c(aa, currnode)
}
return(aa)
}
else if (type == "parent") {
aa <- tree$edge[which(tree$edge[, 2] == node), 1]
return(aa)
}
else stop("do not recognize type")
}
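For comparison, here is a minimal base-R sketch of the same upward walk with explicit guards. It is not part of phytools: `ancestors_safe` is a hypothetical helper, and it assumes the "argument is of length zero" message means `currnode` became a zero-length value, for example because a label lookup found no matching tip. It stops at the first node that has no parent instead of assuming the root is numbered `length(tree$tip.label) + 1`.

```r
# Hypothetical helper (not a phytools function): collect the ancestors of
# `node` using only the edge matrix, failing with a readable message when
# the starting node is empty or missing.
ancestors_safe <- function(edge, node) {
  if (length(node) != 1 || is.na(node))
    stop("node must be a single, non-missing node number")
  aa <- integer(0)
  currnode <- node
  repeat {
    parent <- edge[edge[, 2] == currnode, 1]
    if (length(parent) == 0) break   # no incoming edge: currnode is the root
    currnode <- parent[1]
    aa <- c(aa, currnode)
  }
  aa
}

# Toy tree ((t1,t2),t3): tips are 1..3, the root is 4, the inner node is 5.
edge <- matrix(c(4, 5,
                 5, 1,
                 5, 2,
                 4, 3), ncol = 2, byrow = TRUE)
anc <- ancestors_safe(edge, 1)
```

On the toy tree, tip 1 has the ancestors 5 and then 4, while passing `integer(0)` raises the explicit error rather than failing inside the loop condition.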
Tree info:
Phylogenetic tree with 304959 tips and 23328 internal nodes.
Tip labels:
Leucas_martinicensis_ott9739, Leucas_deflexa_var_deflexa_ott531221, Leonotis_ocymifolia_var_schinzii_ott480842, Leonotis_ocymifolia_var_raineriana_ott480829, Leonotis_nepetifolia_var_africana_ott480834, Leonotis_nepetifolia_var_nepetifolia_ott480833, ...
Node labels:
Chloroplastida_ott361838, Streptophyta_ott916750, , , , Embryophyta_ott5342313, ...
Unrooted; includes branch lengths.
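Is it possible that the tree does not have properly specified node labels (e.g., some node labels are empty)? For example, tnrs_match_names('Abies lasiocarpa') returns a value, but nothing matching it can be found in tree$node.label or tree$tip.label.
The specific example that triggers this error arises when I try to find the distance between branches within the same genus (Abies). For now, I use tryCatch() to keep the matrix construction going. However, it would be great to get actual values.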
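A minimal sketch of how the missing-label suspicion could be checked before calling fastDist(). The `toy_tree` list below is a made-up stand-in for the real tree (only its `tip.label` field is used), and leaving "Abies_lasiocarpa_ott85998" out of it is purely an assumption for illustration, not a statement about the downloaded file.

```r
# Made-up stand-in for the real tree; only tip.label matters for this check.
toy_tree <- list(tip.label = c("Abies_amabilis_ott876303",
                               "Abies_concolor_ott876315",
                               "Some_other_tip"))

# Labels built the same way as resolved_names_proper in the MWE.
wanted <- c("Abies_amabilis_ott876303",
            "Abies_concolor_ott876315",
            "Abies_lasiocarpa_ott85998")

# Labels that would not resolve to any tip number.
missing_tips <- setdiff(wanted, toy_tree$tip.label)

# Look for the same OTT id under a different name prefix.
hits <- grep("_ott85998$", toy_tree$tip.label, value = TRUE)
```

Running the same two lines against `MyTree$tip.label` (and `MyTree$node.label`) would show whether the taxon is absent from the tips or present under a different label.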
MWE:
## Initialize Data
# Any package that is required by the script below is given here:
inst_pkgs = load_pkgs = c("ape","phytools","R.utils","rotl")
inst_pkgs = inst_pkgs[!(inst_pkgs %in% installed.packages()[,"Package"])]
if(length(inst_pkgs)) install.packages(inst_pkgs)
# Dynamically load packages
pkgs_loaded = lapply(load_pkgs, require, character.only=T)
# Grab the Chloroplastida tree
input_tree = file.path(tempdir(), "chloroplastida.tre.gz")
download.file(url="http://files.opentreeoflife.org/trees/v3subtrees/chloroplastida.tre.gz",destfile=input_tree)
gunzip(input_tree)
input_tree_final = dir(tempdir(), pattern=glob2rx("*.tre"),full.names=T)
# Now read in tree as a phylo object (from ape)
MyTree = read.tree(input_tree_final)
# List of Species
species = c("Abies amabilis","Abies concolor","Abies lasiocarpa")
# Look up "proper" names of species used in tree:
resolved_names = tnrs_match_names(species) # Finds the matching names...
## Output of resolved_names
## search_string unique_name approximate_match ott_id is_synonym is_deprecated number_matches
##1 abies amabilis Abies amabilis FALSE 876303 FALSE FALSE 1
##2 abies concolor Abies concolor FALSE 876315 FALSE FALSE 1
##3 abies lasiocarpa Abies lasiocarpa FALSE 85998 FALSE FALSE 1
# Make taxa names for querying the tree:
resolved_names_proper = paste(gsub(" ","_",resolved_names$unique_name),"_ott",resolved_names$ott_id,sep="")
## Output of resolved_names_proper
## "Abies_amabilis_ott876303" "Abies_concolor_ott876315" "Abies_lasiocarpa_ott85998"
# Single tests between species (can be used so you don't need to pre-calculate all species):
test_distance_ok = fastDist(MyTree,resolved_names_proper[1],resolved_names_proper[2])
test_distance_bad = fastDist(MyTree,resolved_names_proper[1],resolved_names_proper[3])
Resulting distance matrix:
                 Abies amabilis Abies concolor Abies lasiocarpa
Abies amabilis                0              4               NA
Abies concolor                4              0               NA
Abies lasiocarpa             NA             NA                0
Edit
Using the rotl package to build the tree, I get:
resolved_names = tnrs_match_names(species)
tr = tol_induced_subtree(ott_ids=resolved_names$ott_id)
The tree that gets built:
# tr
##
## Phylogenetic tree with 3 tips and 2 internal nodes.
##
## Tip labels:
## [1] "Abies_lasiocarpa_ott85998" "Abies_amabilis_ott876303" "Abies_concolor_ott876315"
##
## Rooted; no branch lengths.
However, I lose the branch length information, so a new error appears:
Error in phytools::fastDist(tree, resolved_names_proper[i], resolved_names_proper[j]) :
tree should have edge lengths.
3 stop("tree should have edge lengths.")
2 phytools::fastDist(tree, resolved_names_proper[i], resolved_names_proper[j])
1 get_distance(tr, species)
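If only a count of branches is wanted, one possible workaround is to give every edge a length of 1 before measuring distances. The sketch below shows the idea on a plain list that mimics the `edge` and `edge.length` fields of a phylo object; `toy` is a hypothetical stand-in for `tr`. Counts taken on an induced subtree may differ from counts on the full tree, since intermediate nodes can be dropped when the subtree is built.

```r
# Hypothetical stand-in for `tr`: three tips, two internal nodes, no lengths.
toy <- list(
  edge = matrix(c(4, 5,
                  5, 1,
                  5, 2,
                  4, 3), ncol = 2, byrow = TRUE),
  tip.label = c("Abies_lasiocarpa_ott85998",
                "Abies_amabilis_ott876303",
                "Abies_concolor_ott876315"),
  Nnode = 2L
)

# Assign unit lengths only when the tree has none, so each edge counts as 1.
if (is.null(toy$edge.length)) {
  toy$edge.length <- rep(1, nrow(toy$edge))
}
```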
I tried to get the Chloroplastida tree directly, but the API will not return it:
m = tnrs_match_names("chloroplastida")
tree = tol_subtree(ott_id = m$ott_id[1])
with the error message:
Error in otl_check_error(req) :
Message: Requested tree is larger than currently allowed by this service (25000 tips). For larger trees, please download the full tree directly from: http://files.opentreeoflife.org/trees/
Hence the direct download of the subtree above.
Furthermore, if I try to download and load the full draft v3 or v4 tree:
# Grab the entire tree
input_tree = file.path(tempdir(), "draftversion3.tre.gz")
download.file(url="http://files.opentreeoflife.org/trees/draftversion3.tre.gz",destfile=input_tree)
gunzip(input_tree)
input_tree_final = dir(tempdir(), pattern=glob2rx("*.tre"),full.names=T)
# Now read in tree as a phylo object (from ape)
MyTree = read.tree(input_tree_final)
I get the error message:
Error in if (sum(obj[[i]]$edge[, 1] == ROOT) == 1 && dim(obj[[i]]$edge)[1] > :
missing value where TRUE/FALSE needed
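Answer 0 (score: 0)
I have written a parser for .tre files, in fact for draftversion3, in JS. I understand the problem, but this code is completely foreign to me, so I will only explain the parsing and add the theory for getting the distance...
Species and nodes are distinguished by certain markers, such as the OTT00000-style tags ("ott" plus a number) and the parentheses.
If I had to do what you ask in simple code, I could do it fairly easily: count forward through the parentheses to find the number of species between two brackets, and count backward to find the whole tree structure that contains them...
It is a simple task; roughly speaking, you just count "(" as +1 and ")" as -1.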
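A minimal sketch of that counting rule, written in R to match the question rather than in JS. `paren_depth` is a hypothetical helper, and it assumes labels never contain parentheses themselves (quoted Newick labels are not handled).

```r
# Hypothetical helper: nesting depth after each character of a Newick string,
# counting "(" as +1 and ")" as -1.
paren_depth <- function(newick) {
  chars <- strsplit(newick, "")[[1]]
  step <- ifelse(chars == "(", 1L, ifelse(chars == ")", -1L, 0L))
  cumsum(step)
}

nwk <- "((A,B),C);"
d <- paren_depth(nwk)
```

In this toy string, "A" sits at depth 2 and "C" at depth 1, and the depth returns to 0 at the end, which is what a balanced tree string should do.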
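Once you have the whole subtree that contains both species, it is relatively easy to backtrack over the nodes and add up all the distances from the two species up to their shared tree node.
As for the extra features the code provides, I cannot say what the functions do, but the basic logic is very simple. I hope this helps in finding the error.
I use notation like the OTT0000-type markers to tell species apart from nodes.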
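The backtracking idea can be sketched in base R on an edge matrix like the one ape stores in a phylo object. The helpers `path_to_root` and `tip_distance` are hypothetical names, not phytools functions, and the sketch assumes both nodes belong to the same rooted tree.

```r
# Hypothetical helper: the node itself followed by its ancestors up to the root.
path_to_root <- function(edge, node) {
  path <- node
  repeat {
    parent <- edge[edge[, 2] == node, 1]
    if (length(parent) == 0) break   # reached the root
    node <- parent[1]
    path <- c(path, node)
  }
  path
}

# Hypothetical helper: sum the edge lengths from a and b up to their
# first shared node.
tip_distance <- function(edge, edge_length, a, b) {
  pa <- path_to_root(edge, a)
  pb <- path_to_root(edge, b)
  shared <- pa[pa %in% pb][1]        # first common node on the way up
  climb <- function(path) {
    # nodes strictly below the shared node; each owns the edge above it
    nodes <- path[seq_len(match(shared, path) - 1)]
    sum(edge_length[match(nodes, edge[, 2])])
  }
  climb(pa) + climb(pb)
}

# Toy tree ((t1,t2),t3) with one length per row of the edge matrix.
edge <- matrix(c(4, 5,
                 5, 1,
                 5, 2,
                 4, 3), ncol = 2, byrow = TRUE)
len <- c(1, 2, 3, 4)
d12 <- tip_distance(edge, len, 1, 2)
d13 <- tip_distance(edge, len, 1, 3)
```

With every entry of `len` set to 1, the same function returns the number of branches between the two tips.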