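When I compute a bootstrapped tree in R, I get different bootstrap values from those produced by PAST (http://folk.uio.no/ohammer/past/). How can I get matching output from the two programs?
Here is what I did in R (data below):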
library("ape")
library("phytools")
library("phangorn")
library("cluster")
# compute neighbour-joined tree
f <- function(xx) nj(daisy(xx))
nj_tree <- f(tab)
nj_tree_root <- root(nj_tree, 1, r = TRUE)
## bootstrap
# bootstrap values do not match PAST output - why is that?
nj_tree_root_boot <- boot.phylo(nj_tree, FUN = f, tab, rooted = TRUE)
# Are bootstrap values stable?
for (i in 1:10){
print(boot.phylo(nj_tree, FUN = f, tab, rooted = TRUE, quiet = TRUE))
}
# yes, they seem ok
# plot tree with bootstrap values
plot(nj_tree_root, use.edge.length = FALSE)
nodelabels(nj_tree_root_boot, adj = c(1.2, 1.2), frame = "none")
Typical output of the bootstrap is [1] 100 6 39 27 23 57 53 75 71
and here is the plot (the value at the far left should be 100; it has been clipped somehow):
I converted the data to send it to PAST as follows:
tab1 <- t(apply(tab, 1, as.numeric))
write.table(tab1, "tab.txt")
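In PAST I open the tab.txt file and run Multivariate -> Cluster -> Neighbour joining, using an outgroup, the Euclidean distance, and 100 bootstrap replicates. From PAST I get this plot:
The values are very different. What do I need to do to make the R output match PAST? Is PAST wrong?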
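A side check that may be worth doing (this is an assumption, not something the original post establishes as the cause): daisy() switches to Gower's coefficient when the columns are factors, while PAST was set to Euclidean distance. The sketch below uses a small hypothetical data set, toy, to show that the two distance matrices are not the same numbers. Whether that changes the neighbour-joining tree for the real data is not shown here.

```r
library(cluster)

# Hypothetical 4 x 3 binary data set stored as factors, like `tab` in the question
toy <- data.frame(
  X1 = factor(c("1", "1", "0", "1"), levels = c("0", "1")),
  X2 = factor(c("0", "1", "0", "0"), levels = c("0", "1")),
  X3 = factor(c("0", "1", "1", "0"), levels = c("0", "1")),
  row.names = c("a", "b", "c", "d")
)

# daisy() on factor columns: Gower's coefficient (share of mismatching columns)
d_gower <- daisy(toy)

# Same conversion as in the question, then plain Euclidean distance
toy_num <- t(apply(toy, 1, as.numeric))
d_euclid <- dist(toy_num)

# Rows a and b differ in two of the three columns; compare the two entries
as.matrix(d_gower)["a", "b"]
as.matrix(d_euclid)["a", "b"]
```

If the goal is to mirror the PAST settings as closely as possible, computing the distance on the numeric matrix with dist() is one option to try.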
Data:
tab <- structure(list(X1 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X2 = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"),
X3 = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L,
2L), .Label = c("0", "1"), class = "factor"), X4 = structure(c(2L,
2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L), .Label = c("0",
"1"), class = "factor"), X5 = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 1L, 2L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"),
X6 = structure(c(1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L,
2L), .Label = c("0", "1"), class = "factor"), X7 = structure(c(1L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L), .Label = c("0",
"1"), class = "factor"), X8 = structure(c(2L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"),
X9 = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L,
1L), .Label = c("0", "1"), class = "factor"), X10 = structure(c(1L,
1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L), .Label = c("0",
"1"), class = "factor"), X11 = structure(c(1L, 2L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 2L), .Label = c("0", "1"), class = "factor"),
X12 = structure(c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("0", "1"), class = "factor"), X13 = structure(c(2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0",
"1"), class = "factor"), X14 = structure(c(2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"),
X15 = structure(c(2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L), .Label = c("0", "1"), class = "factor"), X16 = structure(c(2L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("0",
"1"), class = "factor"), X17 = structure(c(2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 1L, 1L, 2L), .Label = c("0", "1"), class = "factor"),
X18 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L,
1L), .Label = c("0", "1"), class = "factor"), X19 = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L), .Label = c("0",
"1"), class = "factor"), X20 = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"),
X21 = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("0", "1"), class = "factor"), X22 = structure(c(2L,
2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L), .Label = c("0",
"1"), class = "factor"), X23 = structure(c(1L, 1L, 2L, 1L,
1L, 1L, 1L, 2L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"),
X24 = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L,
2L), .Label = c("0", "1"), class = "factor"), X25 = structure(c(1L,
1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L), .Label = c("0",
"1"), class = "factor"), X26 = structure(c(1L, 1L, 2L, 2L,
2L, 1L, 2L, 2L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor")), .Names = c("X1",
"X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10", "X11",
"X12", "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X20",
"X21", "X22", "X23", "X24", "X25", "X26"), row.names = c("a",
"b", "c", "d", "e", "f", "g", "h", "i", "j", "k"), class = "data.frame")
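Answer 0 (score: 1)
After much searching, it turns out the answer is in the ape package FAQ, Q14:
I did a bootstrap analysis with boot.phylo, but after rooting the tree, some bootstrap values seem to be in the wrong place. This is because the bootstrap values are counted as the frequencies of clades, not as the actual bipartitions. So these values are really associated with the nodes, not with the edges. The consequence is that some bootstrap values lose their meaning after (re)rooting the tree, because this affects the definitions of the clades in the tree. A simple solution is to include the rooting process in the definition of the function FUN given as argument to boot.phylo. Obviously, the estimated tree must also be rooted in the same way before doing the bootstrap. In that case, it is more convenient to define FUN beforehand. Example code: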
outgroup <- 1 # may be several tips, numeric or tip labels
foo <- function(xx) root(nj(dist.dna(xx)), outgroup)
tr <- foo(X) # X is the matrix of DNA sequences
bp <- boot.phylo(tr, X, foo)
plot(tr)
nodelabels(bp) # will have "100" at the root
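To see why rerooting changes which clades exist, here is a minimal sketch on a hypothetical five-tip tree. The tree and the helper clades() are made up for illustration and are not part of ape; only read.tree(), prop.part() and root() are ape functions.

```r
library(ape)

# Hypothetical five-tip rooted tree, used only for illustration
tr <- read.tree(text = "(((a,b),c),(d,e));")

# Illustrative helper (not part of ape): list each clade as a string of tip labels
clades <- function(phy) {
  pp <- prop.part(phy)
  labs <- attr(pp, "labels")
  sort(vapply(unclass(pp),
              function(idx) paste(sort(labs[idx]), collapse = ""),
              character(1)))
}

clades(tr)

# Reroot on tip "a": "a" becomes sister to all other tips
rerooted <- root(tr, outgroup = "a", resolve.root = TRUE)
clades(rerooted)

# Clades of the original tree that no longer exist after rerooting
setdiff(clades(tr), clades(rerooted))
```

A bootstrap value counted for a clade such as {a, b} has no node to attach to once the tree is rerooted on "a", which is why the rooting has to happen inside FUN.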
In the specific case of my question:
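# root inside FUN so every bootstrap tree is rooted like the estimated tree
f <- function(xx) root(nj(daisy(xx)), 1, r = TRUE)
nj_tree_root <- f(tab)
nj_tree_root_boot <- boot.phylo(nj_tree_root, FUN = f, tab, rooted = TRUE)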
plot(nj_tree_root, use.edge.length = FALSE)
nodelabels(nj_tree_root_boot, adj = c(1.2, 1.2), frame = "none")
This matches the PAST output.