Extracting effect names and p-values from an aov summary in R

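Time: 2015-11-27 20:41:07

Tags: r

I am trying to extract the names and p-values of the given effects from an aov summary in R. To make my question clearer, the aov summary looks like this: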

Error: subj
          Df Sum Sq Mean Sq F value Pr(>F)
Group      1    9.6   9.585   1.403  0.241
Residuals 58  396.3   6.832               

Error: subj:StimProb
               Df Sum Sq Mean Sq  F value Pr(>F)    
StimProb        1  739.0   739.0 2939.367 <2e-16 ***
StimProb:Group  1    0.2     0.2    0.688   0.41    
Residuals      58   14.6     0.3                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subj:StimVal
              Df Sum Sq Mean Sq F value Pr(>F)
StimVal        1  0.126 0.12585   0.558  0.458
StimVal:Group  1  0.026 0.02609   0.116  0.735
Residuals     58 13.074 0.22541               

Error: subj:StimProb:StimVal
                       Df Sum Sq Mean Sq F value Pr(>F)
StimProb:StimVal        1  0.255 0.25512   0.820  0.369
StimProb:StimVal:Group  1  0.036 0.03586   0.115  0.735
Residuals              58 18.044 0.31110               

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals 480   3283   6.839 

I am trying to get a list of name/p-value pairs, for example:

Group = 0.241
Residuals = NA
StimProb = <2e-16
StimProb:Group = 0.41
...and so forth

I was able to write code that extracts the p-values (judging by earlier forum posts, this seems to be a common problem):

 # p values
 p <- NA
 for(i in 1:length(amp_aov_3)){
    tmp_p  <- lapply(amp_aov_3[[i]], function(aov_sum){aov_sum$'Pr(>F)'})
    tmp_p <- unlist(tmp_p)
    p <- c(p, tmp_p)
 }

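But I can't figure out where the names of the effects are stored. With the names function I can only access the top-level headers (e.g., 'Error: subj'). Any suggestions?

Below is a dump of the amp_aov_3 variable (i.e., the aov summary) so that people can work with the code.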

structure(list(`Error: subj` = structure(list(structure(list(
    Df = c(1, 58), `Sum Sq` = c(9.58542761189546, 396.251513143065
    ), `Mean Sq` = c(9.58542761189546, 6.83192264039767), `F value` = c(1.40303515078115, 
    NA), `Pr(>F)` = c(0.24104729974717, NA)), .Names = c("Df", 
"Sum Sq", "Mean Sq", "F value", "Pr(>F)"), class = c("anova", 
"data.frame"), row.names = c("Group    ", "Residuals"))), class = c("summary.aov", 
"listof")), `Error: subj:StimProb` = structure(list(structure(list(
    Df = c(1, 1, 58), `Sum Sq` = c(738.998618354635, 0.173077631291876, 
    14.5820237916664), `Mean Sq` = c(738.998618354635, 0.173077631291876, 
    0.251414203304593), `F value` = c(2939.36702318812, 0.688416282838996, 
    NA), `Pr(>F)` = c(2.16974416659602e-51, 0.410105585145585, 
    NA)), .Names = c("Df", "Sum Sq", "Mean Sq", "F value", "Pr(>F)"
), class = c("anova", "data.frame"), row.names = c("StimProb      ", 
"StimProb:Group", "Residuals     "))), class = c("summary.aov", 
"listof")), `Error: subj:StimVal` = structure(list(structure(list(
    Df = c(1, 1, 58), `Sum Sq` = c(0.12584744523128, 0.0260871639459221, 
    13.0738086981129), `Mean Sq` = c(0.12584744523128, 0.0260871639459221, 
    0.225410494795049), `F value` = c(0.558303398187847, 0.115731807295137, 
    NA), `Pr(>F)` = c(0.45796319404567, 0.734939632080671, NA
    )), .Names = c("Df", "Sum Sq", "Mean Sq", "F value", "Pr(>F)"
), class = c("anova", "data.frame"), row.names = c("StimVal      ", 
"StimVal:Group", "Residuals    "))), class = c("summary.aov", 
"listof")), `Error: subj:StimProb:StimVal` = structure(list(structure(list(
    Df = c(1, 1, 58), `Sum Sq` = c(0.255118030657232, 0.035859237207785, 
    18.0436753630829), `Mean Sq` = c(0.255118030657232, 0.035859237207785, 
    0.311097851087636), `F value` = c(0.820057193469219, 0.115266746724276, 
    NA), `Pr(>F)` = c(0.368910019571438, 0.735452223019943, NA
    )), .Names = c("Df", "Sum Sq", "Mean Sq", "F value", "Pr(>F)"
), class = c("anova", "data.frame"), row.names = c("StimProb:StimVal      ", 
"StimProb:StimVal:Group", "Residuals             "))), class = c("summary.aov", 
"listof")), `Error: Within` = structure(list(structure(list(Df = 480, 
    `Sum Sq` = 3282.85398452856, `Mean Sq` = 6.83927913443451, 
    `F value` = NA_real_, `Pr(>F)` = NA_real_), .Names = c("Df", 
"Sum Sq", "Mean Sq", "F value", "Pr(>F)"), class = c("anova", 
"data.frame"), row.names = "Residuals")), class = c("summary.aov", 
"listof"))), .Names = c("Error: subj", "Error: subj:StimProb", 
"Error: subj:StimVal", "Error: subj:StimProb:StimVal", "Error: Within"
), class = "summary.aovlist")

2 answers:

Answer 0 (score: 1)

Here is my solution. This function extracts the p-values from the output of the aov function in R.
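The dump in the question shows that each error stratum holds one table and that the effect names are that table's row names (for example, row.names = c("Group    ", "Residuals")), padded with trailing spaces. One way such a function could look, as a minimal sketch under that assumption; the helper name extract_effects and the toy data are hypothetical and introduced only for illustration:

```r
# Hypothetical helper: collect stratum, effect name, and p-value
# from a summary.aovlist object such as amp_aov_3.
extract_effects <- function(aov_summary) {
  rows <- lapply(names(aov_summary), function(stratum) {
    tab <- aov_summary[[stratum]][[1]]     # the anova table of this stratum
    p <- tab[["Pr(>F)"]]
    if (is.null(p)) p <- rep(NA_real_, nrow(tab))  # stratum without a p-value column
    data.frame(stratum = stratum,
               term = trimws(rownames(tab)),  # row names are space-padded
               p.value = p,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Toy data (an assumption, not the asker's data): 10 subjects, 2 groups,
# one within-subject factor, 3 replicates per cell.
set.seed(1)
d <- expand.grid(subj = factor(1:10),
                 StimProb = factor(c("lo", "hi")),
                 rep = 1:3)
d$Group <- factor(ifelse(as.integer(d$subj) <= 5, "a", "b"))
d$y <- rnorm(nrow(d))

fit <- aov(y ~ Group * StimProb + Error(subj / StimProb), data = d)
res <- extract_effects(summary(fit))
print(res)

# Named vector in the "name = p-value" shape asked for in the question
p_named <- setNames(res$p.value, res$term)
```

Keeping the stratum in its own column avoids ambiguity, since "Residuals" appears once per error stratum.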

Answer 1 (score: 0)

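This question has been dormant for a long time, but following up on what @Jaap said, the tidiest solution I have seen is broom. An example from here:

a <- anova(lm(mpg ~ wt + qsec + disp, mtcars))

Printing a with the base print method looks like this: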

a
Analysis of Variance Table

Response: mpg
          Df Sum Sq Mean Sq  F value    Pr(>F)    
wt         1 847.73  847.73 121.4366 1.082e-11 ***
qsec       1  82.86   82.86  11.8694  0.001817 ** 
disp       1   0.00    0.00   0.0001  0.990423    
Residuals 28 195.46    6.98                       
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

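That is nice to look at, but it is not a very tidy format, and the information is not easy to extract. broom always exports the contents to a data frame, with the names in their own column.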

#from broom package
tidy(a) 
           term df        sumsq       meansq    statistic      p.value
    1        wt  1 8.477252e+02 8.477252e+02 1.214366e+02 1.081618e-11
    2      qsec  1 8.285831e+01 8.285831e+01 1.186944e+01 1.817334e-03
    3      disp  1 1.023935e-03 1.023935e-03 1.466785e-04 9.904229e-01
    4 Residuals 28 1.954626e+02 6.980807e+00           NA           NA
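Once the names sit in a column, pairing them with the p-values is a single step. The sketch below imitates that layout in base R so that it runs without extra packages; it is not broom's implementation, and tidy_like is a hypothetical name. With broom loaded, the term and p.value columns of tidy(a) shown above could be used in the same way.

```r
# Same model as in the answer; mtcars ships with R
a <- anova(lm(mpg ~ wt + qsec + disp, mtcars))

# Move the row names into a 'term' column, mimicking the tidy layout
tidy_like <- data.frame(term = rownames(a),
                        p.value = a[["Pr(>F)"]],
                        row.names = NULL,
                        stringsAsFactors = FALSE)

# Named vector of p-values: one entry per effect, NA for Residuals
p_named <- setNames(tidy_like$p.value, tidy_like$term)
print(p_named)
```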