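I am fairly new to phylogenetic regression models. In the past I used PGLS when the tree had only one entry per species. Now I have a dataset with thousands of records covering 9 species in total, and I want to run a phylogenetic model. I have read the tutorials for the most commonly used packages (e.g. caper), but I am not sure how to build the model.
When I try to create the object for caper, i.e. using: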
obj <- comparative.data(phy = Tree, data = Data, names.col = species, vcv = TRUE, na.omit = FALSE, warn.dropped = TRUE)
I get the message:
Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': 'Species1', 'Species2', 'Species3', 'Species4', 'Species5', 'Species6', 'Species7', 'Species8', 'Species9'
I know I could solve this by fitting an MCMCglmm model, but I am not familiar with Bayesian models.
Thanks in advance for your help.
Answer 0 (score: 0)
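This indeed won't work with a simple PGLS in caper, because it cannot treat individuals as a random effect. I suggest you use MCMCglmm, which is not too complicated to understand and lets you include random effects. You can find excellent documentation from the package's author here or here, and alternative documentation covering some more specific aspects of the package (namely tree uncertainty) here.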
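To see why the call in the question fails, here is a minimal base-R sketch on a made-up data frame (the names `toy`, `species` and `trait` are placeholders of mine, not your data). caper uses the species column as row names, and row names in a data frame must be unique:

```r
## Hypothetical toy data: 3 species, 3 records each (not the asker's data)
toy <- data.frame(species = rep(c("Species1", "Species2", "Species3"), each = 3),
                  trait   = c(1.2, 0.8, 1.1, 2.3, 2.0, 2.6, 0.4, 0.7, 0.5))

## Number of records per species: anything above 1 means repeated species
records_per_species <- table(toy$species)

## Using the species names as row names fails, as in the error above;
## attempt holds the error message instead of "ok"
attempt <- tryCatch({
  rownames(toy) <- toy$species
  "ok"
}, error = function(e) conditionMessage(e))
```

Running `table()` on your own species column is a quick way to confirm that repeated species are the cause.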
Really briefly, to get you started:
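## Your comparative data (caper needs one row per species here, otherwise
## it stops with the duplicate 'row.names' error)
library(caper)
comp_data <- comparative.data(phy = my_tree, data = my_data,
                              names.col = species, vcv = TRUE)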
Note that you can have a specimen column that looks like this:
   taxa        var1 var2 specimen
1     A  0.08730689    a    spec1
2     B  0.47092692    a    spec1
3     C -0.26302706    b    spec1
4     D  0.95807782    b    spec1
5     E  2.71590217    b    spec1
6     A -0.40752058    a    spec2
7     B -1.37192856    a    spec2
8     C  0.30634567    b    spec2
9     D -0.49828379    b    spec2
10    E  1.42722363    b    spec2
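If your data frame does not have such a column yet, one possible way to add it is to number the records within each species, following the layout of the table above. This is only a base-R sketch on a made-up data frame (`toy`, `record_id` and the values are placeholders of mine); if the same individual was measured several times, its real identifier is probably the better choice.

```r
## Hypothetical toy data: 2 taxa with 3 and 2 records (placeholder values)
toy <- data.frame(taxa = c("A", "B", "A", "B", "A"),
                  var1 = c(0.1, 0.5, -0.4, -1.3, 0.9))

## Running count of the records within each taxon, in row order
record_id <- ave(seq_len(nrow(toy)), toy$taxa, FUN = seq_along)

## Label the records spec1, spec2, ... within each taxon
toy$specimen <- paste0("spec", record_id)
```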
Then you can set your formula (similar to a simple lm formula):
## Your formula
my_formula <- variable1 ~ variable2
And your MCMC settings:
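## Setting the prior list (see the MCMCglmm course notes for details)
## One G element per random term: G1 for animal, G2 for specimen
prior <- list(R = list(V = 1, nu = 0.002),
              G = list(G1 = list(V = 1, nu = 0.002),
                       G2 = list(V = 1, nu = 0.002)))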
## Setting the MCMC parameters
## Number of iterations
nitt <- 12000
## Length of burnin
burnin <- 2000
## Amount of thinning
thin <- 5
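These three settings determine how many posterior samples are kept: the chain runs for `nitt` iterations, the first `burnin` are discarded, and every `thin`-th iteration after that is stored. A quick check of the arithmetic (the name `n_stored` is mine):

```r
## Same values as in the settings above
nitt   <- 12000
burnin <- 2000
thin   <- 5

## Number of posterior samples stored after burn-in and thinning
n_stored <- (nitt - burnin) / thin
n_stored
```

With thousands of records you may well need a longer chain; the effective sample sizes reported by `summary()` on the fitted model are a useful guide.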
Then you should be able to run a default MCMCglmm:
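## Preparing the data for MCMCglmm
## comparative.data() only accepts one row per species, so with repeated
## species we start from the raw data frame and tree instead.
## MCMCglmm requires a column named animal to identify the model as a
## phylogenetic one, so we add an extra column holding the species names.
mcmc_data <- cbind(animal = my_data$species, my_data)
## The names in animal must match the tip labels of the tree
mcmc_tree <- my_tree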
## The MCMCglmm model
library(MCMCglmm)
mod_mcmc <- MCMCglmm(fixed = my_formula,
                     random = ~ animal + specimen,
                     family = "gaussian",
                     pedigree = mcmc_tree,
                     data = mcmc_data,
                     nitt = nitt,
                     burnin = burnin,
                     thin = thin,
                     prior = prior)
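Once the model has run, `summary(mod_mcmc)` prints the fixed effects and the variance components. Since you mentioned being new to Bayesian models: the output is a set of posterior samples rather than a single estimate, so derived quantities are computed per sample and then summarised. As a sketch, assuming the variance samples sit in `mod_mcmc$VCV` with columns `animal`, `specimen` and `units`, the share of variance attributed to the phylogeny could be computed as below. I use a simulated stand-in matrix `vcv` with made-up numbers so the snippet runs on its own:

```r
## Stand-in for mod_mcmc$VCV: one row per stored sample, one column per
## variance component (made-up numbers, for illustration only)
set.seed(1)
vcv <- cbind(animal   = rgamma(2000, shape = 4, rate = 2),
             specimen = rgamma(2000, shape = 2, rate = 4),
             units    = rgamma(2000, shape = 3, rate = 3))

## Share of the total variance attributed to the phylogeny, per sample
phylo_share <- vcv[, "animal"] / rowSums(vcv)

## Posterior mean and 95% quantile interval of that share
mean(phylo_share)
quantile(phylo_share, c(0.025, 0.975))
```

With your own model you would replace `vcv` by `mod_mcmc$VCV`.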