I want to collapse the branches of a dendrogram given a tolerance cutoff value.
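I'm following dendextend's collapse_branch example:
require(dendextend)
dend <- iris[1:5,-5] %>% dist %>% hclust %>% as.dendrogram
dend %>% ladderize %>% plot(horiz = TRUE); abline(v = .2, col = 2, lty = 2)
Unlike the dendrogram in that example, I want to replace all collapsed branches (i.e. any branch up to the red vertical dashed line) with triangles, similar to the way branches are drawn in this figure (from example):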
If this is too much to ask, I'll settle for cutting the branches at the tolerance cutoff.
Answer 0 (score: 2)
Getting triangles is indeed a bit much, but you can color the branches, either by height or by number of clusters, using color_branches:
library(dendextend)
dend <- iris[1:5,-5] %>% dist %>% hclust %>% as.dendrogram
dend %>% color_branches(h=0.2) %>% ladderize %>% plot(horiz = TRUE); abline(v = .2, col = 2, lty = 2)
# OR
# dend %>% color_branches(k=4) %>% ladderize %>% plot(horiz = TRUE); abline(v = .2, col = 2, lty = 2)
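To see why h = 0.2 and k = 4 should describe the same grouping here, a minimal base R sketch can list the cluster membership at that height. It assumes only the stats package; the names hc, groups and n.clusters are introduced for illustration and are not part of the answer above.

```r
# Base R only; hc, groups and n.clusters are illustrative names.
hc <- hclust(dist(iris[1:5, -5]))

# Cut the tree at the tolerance height; returns one cluster id per observation
groups <- cutree(hc, h = 0.2)
print(groups)

# Number of clusters produced by cutting at this height
n.clusters <- length(unique(groups))
print(n.clusters)
```

Observations that share a cluster id sit under a node below the cutoff, so they are the ones color_branches would paint with the same color.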
You can also use find_k to choose the number of clusters based on the silhouette coefficient (2 in this case):
require(dendextend)
dend <- iris[1:5,-5] %>% dist %>% hclust %>% as.dendrogram
find_k(dend)$k
dend %>% color_branches(k=find_k(.)$k) %>% ladderize %>% plot(horiz = TRUE); abline(v = .2, col = 2, lty = 2)
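If simply truncating the tree at the tolerance is acceptable (the fallback mentioned in the question), base R's cut() method for dendrograms may be enough. The sketch below assumes only the stats and graphics packages; tol, parts and collapsed are illustrative names, and it draws plain stubs rather than triangles.

```r
# Base R only; tol, parts and collapsed are illustrative names.
dend <- as.dendrogram(hclust(dist(iris[1:5, -5])))
tol <- 0.2

# cut() splits the dendrogram at height tol into:
#   $upper : the tree above the cut, with each cut branch reduced to one leaf
#   $lower : a list of the subtrees that were cut off
parts <- cut(dend, h = tol)
collapsed <- parts$upper

# One entry in $lower per collapsed branch
print(length(parts$lower))

# Plot the truncated tree with the tolerance line for reference
plot(collapsed, horiz = TRUE); abline(v = tol, col = 2, lty = 2)
```

The subtrees in parts$lower keep their original leaves, so they can still be inspected or plotted separately.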
Answer 1 (score: 1)
This is possible using the ape package's drop.tip:
require(ape)
require(dendextend)
require(data.tree)
dend <- iris[1:5,-5] %>% dist %>% hclust %>% as.dendrogram
tol.level <- 0.28
dend %>% plot(horiz = TRUE); abline(v=tol.level,col="red",lty=2)
Our tolerance is 0.28, so we want to collapse leaves (1,5) and (3,4), since the depth (height) of their ancestor nodes is below tol.level:
#convert dendrogram to data.tree
dend.dt <- as.Node(dend)
#get vector of leaves per each internal node
node.list <- lapply(dend.dt$Get(function(node) node$leaves,filterFun = isNotLeaf),function(n) unname(sapply(unlist(n,recursive = T),function(l) l$name)))
#get the plot height of each internal node
node.depth.df <- data.frame(depth=c(t(sapply(Traverse(dend.dt,traversal="pre-order",pruneFun=isNotLeaf),function(x) c(x$plotHeight)))),stringsAsFactors=F)
to.drop.leave.names <- c(sapply(which(node.depth.df$depth < tol.level),function(i) node.list[[i]]))
#convert dendrogram to phylo
phylo.dend <- as.phylo(dend)
phylo.dend <- drop.tip(phylo.dend,tip=to.drop.leave.names,interactive=FALSE,trim.internal=FALSE)
plot(phylo.dend,use.edge.length=F)
Now we can convert it back to a dendrogram (chronogram):
new.dend <- chronos(phylo.dend)
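As a cross-check on the leaf selection step, the same set of leaves can be found with a short recursive walk in base R, without data.tree. This is only a sketch under the assumption that a node's "height" attribute is what tol.level should be compared against; leaves_below and to.collapse are hypothetical names introduced here.

```r
# Base R only; leaves_below and to.collapse are illustrative names.
dend <- as.dendrogram(hclust(dist(iris[1:5, -5])))
tol.level <- 0.28

# Collect the labels of all leaves that sit under an internal node
# whose height is below the tolerance.
leaves_below <- function(node, tol) {
  # A single leaf has no internal ancestor inside this subtree
  if (is.leaf(node)) return(character(0))
  # Internal node below the tolerance: every leaf under it is collapsed
  if (attr(node, "height") < tol) return(as.character(labels(node)))
  # Otherwise descend into each child and combine the results
  out <- character(0)
  for (i in seq_along(node)) out <- c(out, leaves_below(node[[i]], tol))
  out
}

to.collapse <- leaves_below(dend, tol.level)
print(to.collapse)
```

The result should contain the same leaves as to.drop.leave.names above, possibly in a different order.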