Repeating a column-wise computation in R and saving the output in a list

Date: 2017-10-09 13:04:03

Tags: r dataframe multivariate-testing

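I don't ask questions here often, because I usually try to solve problems myself (using the many Stack Overflow threads). However, I am stuck now and would appreciate some help.

Goal:

I want to run an mvabund analysis using 1) a taxon abundance table (abundance):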

ID  GoodBacteria    BadBacteria UnknownBacteria SomeBacteria
Dog 0   7337    101 0
Cat 0   4178.5  0   0
Horse   2   4294.333333 35.66666667 0
Snail   0   4350    27.5    0
Bird    0.5 4332.5  46  0
Whale   0.666666667 4809.666667 13.66666667 0
Fish    1   1522    29  0
Human   0   4679.4  28.46666667 0.033333333

and 2) an environmental parameter file (factors):

ID  Mutualistic Commensalistic  Parasitic
Dog YES NO  NO
Cat YES YES YES
Horse   NO  NO  NO
Snail   NO  NO  YES
Bird    YES YES NO
Whale   YES NO  YES
Fish    NO  NO  NO
Human   YES NO  NO

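However, the original environmental parameter file contains more than 80 factors, and we want to test each of them separately.

Normally, an mvabund analysis consists of three function calls:

library(mvabund)
# wrap the abundance table as an mvabund object
mva <- mvabund(abundance)
# fit one negative binomial GLM per taxon against a single factor
mod <- manyglm(mva ~ factor(factors$Mutualistic), family="negative.binomial")
# multivariate test plus adjusted univariate tests per taxon
aov <- anova(mod, p.uni="adjusted")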

The aov output looks like this:

Multivariate test:
        Res.Df  Df.diff Dev Pr(>Dev)
(Intercept) 7                       
factor(factors$Salinity)    6   1   7.983   0.101

Univariate Tests:
        GoodBacteria    BadBacteria UnknownBacteria SomeBacteria       
        Dev Pr(>Dev)    Dev Pr(>Dev)    Dev Pr(>Dev)    Dev Pr(>Dev)
(Intercept)
factor(factors$Salinity)    5.489 0.090 2.41 0.259  0.085 0.770 0 1.000   

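First solution:

My first goal was to run all three tests column by column (i.e., for Mutualistic, Commensalistic, and Parasitic) using a loop or something similar. I don't have much experience with this, but I solved it with the help of the following post: Column wise granger's causal tests in R

Here is the sapply call I adapted, and it works:

sapply(1:ncol(factors), function(i) {
  # fit and test the model for the i-th factor column
  m1 <- anova(manyglm(mva ~ factor(factors[,i]), family="negative.binomial"), p.uni="adjusted")
  # keep columns 3 and 4 of the test table: Dev and Pr(>Dev)
  list(multip=m1$table[c(3,4)])
  })

It runs anova.manyglm for each column, and I can even extract the multivariate p-value for each column. Not perfect, but it works: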

$multip
                          Dev Pr(>Dev)
(Intercept)                NA       NA
factor(factors[, i]) 7.983104    0.112

$multip
                          Dev Pr(>Dev)
(Intercept)                NA       NA
factor(factors[, i]) 1.862846    0.702

$multip
                          Dev Pr(>Dev)
(Intercept)                NA       NA
factor(factors[, i]) 5.655806    0.228
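
As a minimal sketch, the three separate `$multip` tables could be collapsed into one table with a row per factor. The helper `multip_row` and the mock objects below are hypothetical: the mocks mimic only the `table` component of an anova result, using the values printed above, and the assignment of values to factor names assumes the columns were processed in the order Mutualistic, Commensalistic, Parasitic. No mvabund call is made here.

```r
# Hypothetical mock of the `table` component of one anova result.
make_mock <- function(dev, p) {
  list(table = data.frame(
    Res.Df = c(7, 6), Df.diff = c(NA, 1),
    Dev = c(NA, dev), "Pr(>Dev)" = c(NA, p),
    row.names = c("(Intercept)", "factor(factors[, i])"),
    check.names = FALSE                  # keep the column name "Pr(>Dev)" intact
  ))
}

# Values copied from the output above; factor order is an assumption.
results <- list(
  Mutualistic    = make_mock(7.983104, 0.112),
  Commensalistic = make_mock(1.862846, 0.702),
  Parasitic      = make_mock(5.655806, 0.228)
)

# Keep the last row (the tested term) and columns 3:4 (Dev, Pr(>Dev)).
multip_row <- function(res) res$table[nrow(res$table), 3:4]

multip <- do.call(rbind, lapply(results, multip_row))
rownames(multip) <- names(results)       # label each row with its factor
print(multip)
```

With real data, `results` would presumably come from something like `lapply(names(factors), ...)` around the anova call, with the list named by `names(factors)`, so the intercept rows of `NA` never reach the final table.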

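Problem:

However, I also want to obtain the univariate results for each species and each factor. This is where I am struggling now.

The structure of the anova output is as follows: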

                 Length Class      Mode
    family       1      -none-     character
    p.uni        1      -none-     character
    test         1      -none-     character
    cor.type     1      -none-     character
    resamp       1      -none-     character
    nBoot        1      -none-     numeric
    shrink.param 2      -none-     numeric
    n.bootsdone  1      -none-     numeric
    table        4      data.frame list
    uni.p        8      -none-     numeric
    uni.test     8      -none-     numeric

uni.p contains the p values and species names (e.g., GoodBacteria) and uni.test the Dev values and species names. But I still don't understand how I can extract these values together with the sapply function above in order to store everything in one output or dataframe.
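
One possible way to get everything into a single data frame is to convert each anova result to long format (one row per species) and bind the pieces together. The sketch below is not a confirmed solution: `tidy_uni` is a hypothetical helper, and it assumes that `uni.p` and `uni.test` are matrices with one row per model term and one column per species, which the length of 8 (2 x 4) in the summary above suggests. The mock object stands in for a real anova result and uses values from this question's own output for the first factor.

```r
# Hypothetical helper: one anova-like result -> long-format data frame.
tidy_uni <- function(res, factor_name) {
  term <- nrow(res$uni.p)                      # last row = the tested term
  data.frame(
    factor_name = factor_name,
    species     = colnames(res$uni.p),
    uni_dev     = unname(res$uni.test[term, ]),
    uni_p       = unname(res$uni.p[term, ]),
    multi_dev   = res$table[term, 3],          # column 3 = Dev
    multi_p     = res$table[term, 4],          # column 4 = Pr(>Dev)
    stringsAsFactors = FALSE
  )
}

# Mock object with the assumed structure (not produced by mvabund here).
species <- c("GoodBacteria", "BadBacteria", "UnknownBacteria", "SomeBacteria")
terms   <- c("(Intercept)", "factor(factors[, i])")
mock <- list(
  table = data.frame(Res.Df = c(7, 6), Df.diff = c(NA, 1),
                     Dev = c(NA, 7.983104), "Pr(>Dev)" = c(NA, 0.082),
                     row.names = terms, check.names = FALSE),
  uni.p = matrix(c(NA, 0.086, NA, 0.255, NA, 0.785, NA, 1),
                 nrow = 2, dimnames = list(terms, species)),
  uni.test = matrix(c(NA, 5.48864799, NA, 2.40968141, NA, 0.08477459, NA, 0),
                    nrow = 2, dimnames = list(terms, species))
)

out <- tidy_uni(mock, "Mutualistic")
print(out)

# With real data, the per-factor pieces could then be stacked, e.g.:
# fits <- lapply(names(factors), function(nm)
#   anova(manyglm(mva ~ factor(factors[[nm]]), family = "negative.binomial"),
#         p.uni = "adjusted"))
# all_results <- do.call(rbind, Map(tidy_uni, fits, names(factors)))
```

The long format keeps the factor and species names as ordinary columns, which makes filtering (for example, all rows with `uni_p < 0.05`) straightforward once more than 80 factors are involved.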

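Any help is greatly appreciated.

Update

I changed the script slightly:

    sapply(1:ncol(factors), function(i) {
      # fit and test the model for the i-th factor column
      m1 <- anova(manyglm(mva ~ factor(factors[,i]), family="negative.binomial"), p.uni="adjusted")
      # flatten multivariate and univariate results into one named vector
      unlist(data.frame(dev_p=m1$table[c(3,4)], uni_p=m1$uni.p, uni_dev=m1$uni.test))
    })

The output now looks like this. It is not perfect, but it is a step in the right direction: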

                              [,1]       [,2]         [,3]
d.Dev1                          NA         NA           NA
d.Dev2                  7.98310400 1.86284590 5.6558056350
d.Pr..Dev.1                     NA         NA           NA
d.Pr..Dev.2             0.08200000 0.64200000 0.2260000000
uni.GoodBacteria1               NA         NA           NA
uni.GoodBacteria2       0.08600000 0.62000000 0.2850000000
uni.BadBacteria1                NA         NA           NA
uni.BadBacteria2        0.25500000 0.88200000 0.9900000000
uni.UnknownBacteria1            NA         NA           NA
uni.UnknownBacteria2    0.78500000 0.78800000 0.2440000000
uni.SomeBacteria1               NA         NA           NA
uni.SomeBacteria2       1.00000000 1.00000000 1.0000000000
unidev.GoodBacteria1            NA         NA           NA
unidev.GoodBacteria2    5.48864799 1.44833012 2.4419358477
unidev.BadBacteria1             NA         NA           NA
unidev.BadBacteria2     2.40968141 0.03306757 0.0001129865
unidev.UnknownBacteria1         NA         NA           NA
unidev.UnknownBacteria2 0.08477459 0.38144821 3.2137568008
unidev.SomeBacteria1            NA         NA           NA
unidev.SomeBacteria2    0.00000000 0.00000000 0.0000000000
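
A matrix like the one above could be tidied with base R alone: label the columns with the factor names, drop the intercept rows that contain only `NA`, and transpose so that each factor becomes a row. The sketch below works on a small stand-in holding just the first four rows of that output; the column names assume the factors were processed in the order Mutualistic, Commensalistic, Parasitic.

```r
# Stand-in for the first four rows of the sapply() result above.
res <- matrix(
  c(NA, 7.98310400,   NA, 0.082,
    NA, 1.86284590,   NA, 0.642,
    NA, 5.6558056350, NA, 0.226),
  ncol = 3,                                  # filled column by column
  dimnames = list(c("d.Dev1", "d.Dev2", "d.Pr..Dev.1", "d.Pr..Dev.2"), NULL)
)

# Assumed order of the factor columns; with real data use names(factors).
colnames(res) <- c("Mutualistic", "Commensalistic", "Parasitic")

# Drop rows that are NA for every factor (the intercept rows).
keep  <- rowSums(!is.na(res)) > 0
clean <- t(res[keep, , drop = FALSE])        # one row per factor

# Remove the trailing row index that unlist() appended to each name.
colnames(clean) <- sub("2$", "", colnames(clean))
print(clean)
```

Wrapping the result in `as.data.frame(clean)` would give a data frame that can be written out with `write.csv`.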

0 Answers:

No answers