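I don't ask questions here often, because I usually try to solve problems myself (with the help of many Stack Overflow threads). But now I'm stuck and would appreciate some help.
Goal:
I want to run an mvabund analysis using 1) a taxon abundance table (abundance):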
ID GoodBacteria BadBacteria UnknownBacteria SomeBacteria
Dog 0 7337 101 0
Cat 0 4178.5 0 0
Horse 2 4294.333333 35.66666667 0
Snail 0 4350 27.5 0
Bird 0.5 4332.5 46 0
Whale 0.666666667 4809.666667 13.66666667 0
Fish 1 1522 29 0
Human 0 4679.4 28.46666667 0.033333333
and 2) an environmental parameter file (factors):
ID Mutualistic Commensalistic Parasitic
Dog YES NO NO
Cat YES YES YES
Horse NO NO NO
Snail NO NO YES
Bird YES YES NO
Whale YES NO YES
Fish NO NO NO
Human YES NO NO
However, the original environmental parameter file contains more than 80 factors, and we want to test each of them individually.
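For reference, here is a minimal sketch of how the two example tables could be loaded into R. It assumes whitespace-delimited input and stores the ID column as row names, so that a loop over the columns of factors only visits the actual factor columns. Neither choice is stated in the question; both are assumptions.

```r
# Sketch under assumptions: whitespace-delimited tables, ID used as row names.
abundance <- read.table(text = "
ID GoodBacteria BadBacteria UnknownBacteria SomeBacteria
Dog 0 7337 101 0
Cat 0 4178.5 0 0
Horse 2 4294.333333 35.66666667 0
Snail 0 4350 27.5 0
Bird 0.5 4332.5 46 0
Whale 0.666666667 4809.666667 13.66666667 0
Fish 1 1522 29 0
Human 0 4679.4 28.46666667 0.033333333
", header = TRUE, row.names = "ID")

factors <- read.table(text = "
ID Mutualistic Commensalistic Parasitic
Dog YES NO NO
Cat YES YES YES
Horse NO NO NO
Snail NO NO YES
Bird YES YES NO
Whale YES NO YES
Fish NO NO NO
Human YES NO NO
", header = TRUE, row.names = "ID", stringsAsFactors = TRUE)

# The samples must appear in the same order in both tables
stopifnot(identical(rownames(abundance), rownames(factors)))
```

With real files, the text argument would be replaced by a file path.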
A typical mvabund analysis consists of three function calls:
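library(mvabund)
# Convert the abundance table into an mvabund object
mva <- mvabund(abundance)
# Fit one negative binomial GLM per taxon against a single factor
mod <- manyglm(mva ~ factor(factors$Mutualistic), family="negative.binomial")
# Multivariate test plus adjusted univariate p-values
aov <- anova(mod, p.uni="adjusted")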
The aov output looks like this:
Multivariate test:
Res.Df Df.diff Dev Pr(>Dev)
(Intercept) 7
factor(factors$Salinity) 6 1 7.983 0.101
Univariate Tests:
GoodBacteria BadBacteria UnknownBacteria SomeBacteria
Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev)
(Intercept)
factor(factors$Salinity) 5.489 0.090 2.41 0.259 0.085 0.770 0 1.000
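First solution:
My first goal was to run all three tests (i.e., Mutualistic, Commensalistic and Parasitic) column by column, using a loop or something similar. I don't have much experience with this, but I got it working with the help of this post: Column wise granger's causal tests in R
Here is the sapply call I adapted, and it works: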
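sapply(1:ncol(factors), function(i) {
  # Fit and test one model per factor column
  m1 <- anova(manyglm(mva ~ factor(factors[,i]), family="negative.binomial"), p.uni="adjusted")
  # Keep only the Dev and Pr(>Dev) columns of the multivariate table
  list(multip=m1$table[c(3,4)])
})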
It runs anova.manyglm for each column, and I can even extract the multivariate p-value for each column. Not perfect, but it works:
$multip
Dev Pr(>Dev)
(Intercept) NA NA
factor(factors[, i]) 7.983104 0.112
$multip
Dev Pr(>Dev)
(Intercept) NA NA
factor(factors[, i]) 1.862846 0.702
$multip
Dev Pr(>Dev)
(Intercept) NA NA
factor(factors[, i]) 5.655806 0.228
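Problem:
However, I also want the univariate results for each species and each factor. This is where I'm struggling.
The structure of the anova output is as follows: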
Length Class Mode
family 1 -none- character
p.uni 1 -none- character
test 1 -none- character
cor.type 1 -none- character
resamp 1 -none- character
nBoot 1 -none- numeric
shrink.param 2 -none- numeric
n.bootsdone 1 -none- numeric
table 4 data.frame list
uni.p 8 -none- numeric
uni.test 8 -none- numeric
uni.p contains the p values and species names (e.g., GoodBacteria) and uni.test the Dev values and species names. But I still don't understand how I can extract these values together with the sapply function above in order to store everything in one output or dataframe.
Any help is greatly appreciated.
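One possible way to collect these pieces is sketched below. It flattens a single anova result into a long data frame with one row per species. The helper name tidy_anova is hypothetical, and the mock object only imitates the components listed in the structure above (table, uni.p, uni.test), filled with the values from the example output. It is a stand-in so the sketch runs without mvabund, not a real anova.manyglm result.

```r
# Hypothetical helper: one row per species for a single tested factor.
# Relies only on the components $table, $uni.p and $uni.test.
tidy_anova <- function(a, factor_name) {
  term <- 2  # row 1 is the intercept (all NA), row 2 is the tested factor
  data.frame(
    factor    = factor_name,
    species   = colnames(a$uni.p),
    multi_dev = a$table[term, 3],   # Dev column of the multivariate table
    multi_p   = a$table[term, 4],   # Pr(>Dev) column
    uni_dev   = a$uni.test[term, ],
    uni_p     = a$uni.p[term, ],
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Mock object (assumption): same layout as the anova output shown earlier
species <- c("GoodBacteria", "BadBacteria", "UnknownBacteria", "SomeBacteria")
terms   <- c("(Intercept)", "factor(factors[, i])")
mock <- list(
  table = data.frame(Res.Df = c(7, 6), Df.diff = c(NA, 1),
                     Dev = c(NA, 7.983), `Pr(>Dev)` = c(NA, 0.101),
                     row.names = terms, check.names = FALSE),
  uni.p    = matrix(c(NA, 0.090, NA, 0.259, NA, 0.770, NA, 1.000), nrow = 2,
                    dimnames = list(terms, species)),
  uni.test = matrix(c(NA, 5.489, NA, 2.41, NA, 0.085, NA, 0), nrow = 2,
                    dimnames = list(terms, species))
)

res <- tidy_anova(mock, "Mutualistic")
print(res)

# Untested sketch of the real call, with mvabund loaded and mva defined:
# all_res <- do.call(rbind, lapply(names(factors), function(v) {
#   fit <- manyglm(mva ~ factor(factors[[v]]), family = "negative.binomial")
#   tidy_anova(anova(fit, p.uni = "adjusted"), v)
# }))
```

Stacking the per-factor data frames with do.call(rbind, ...) would give a single table that can be filtered or sorted by factor, species or p-value.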
Update
I changed the script slightly:
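sapply(1:ncol(factors), function(i) {
  m1 <- anova(manyglm(mva ~ factor(factors[,i]), family="negative.binomial"), p.uni="adjusted")
  # Flatten the multivariate table and both univariate matrices into one named vector;
  # the prefixes d, uni and unidev match the row names in the output below
  unlist(data.frame(d=m1$table[c(3,4)], uni=m1$uni.p, unidev=m1$uni.test))
})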
The output now looks like this. It's not perfect, but it's a step in the right direction:
[,1] [,2] [,3]
d.Dev1 NA NA NA
d.Dev2 7.98310400 1.86284590 5.6558056350
d.Pr..Dev.1 NA NA NA
d.Pr..Dev.2 0.08200000 0.64200000 0.2260000000
uni.GoodBacteria1 NA NA NA
uni.GoodBacteria2 0.08600000 0.62000000 0.2850000000
uni.BadBacteria1 NA NA NA
uni.BadBacteria2 0.25500000 0.88200000 0.9900000000
uni.UnknownBacteria1 NA NA NA
uni.UnknownBacteria2 0.78500000 0.78800000 0.2440000000
uni.SomeBacteria1 NA NA NA
uni.SomeBacteria2 1.00000000 1.00000000 1.0000000000
unidev.GoodBacteria1 NA NA NA
unidev.GoodBacteria2 5.48864799 1.44833012 2.4419358477
unidev.BadBacteria1 NA NA NA
unidev.BadBacteria2 2.40968141 0.03306757 0.0001129865
unidev.UnknownBacteria1 NA NA NA
unidev.UnknownBacteria2 0.08477459 0.38144821 3.2137568008
unidev.SomeBacteria1 NA NA NA
unidev.SomeBacteria2 0.00000000 0.00000000 0.0000000000
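A matrix like this could be tidied with base R alone, as in the rough sketch below. The matrix out is a mock that reproduces only a few rows of the output above. The column names are an assumption: they take the result columns to follow the order of the factor columns.

```r
# Mock of the sapply result (subset of the rows shown above)
out <- rbind(
  d.Dev1            = c(NA, NA, NA),
  d.Dev2            = c(7.98310400, 1.86284590, 5.6558056350),
  d.Pr..Dev.1       = c(NA, NA, NA),
  d.Pr..Dev.2       = c(0.082, 0.642, 0.226),
  uni.GoodBacteria1 = c(NA, NA, NA),
  uni.GoodBacteria2 = c(0.086, 0.620, 0.285)
)
# Assumption: result columns are in the same order as the factor columns
colnames(out) <- c("Mutualistic", "Commensalistic", "Parasitic")

# Drop the intercept rows, which are NA for every factor
out <- out[rowSums(!is.na(out)) > 0, , drop = FALSE]

# Strip the row index appended by unlist(), e.g. "d.Dev2" becomes "d.Dev"
rownames(out) <- sub("2$", "", rownames(out))

# Transpose: one row per factor, one column per statistic
tidy <- as.data.frame(t(out))
print(tidy)
```

Iterating over names(factors) instead of 1:ncol(factors) would let sapply name the result columns itself, since it uses a character input vector as names by default.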