tidyverse: ANOVA for each level of a factor

Time: 2018-10-02 16:55:06

Tags: r dplyr purrr anova

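I want to run an ANOVA of mpg on cyl separately for each level of a factor (gear). I can do this with dplyr::do, but I would also like to do it with purrr. Any hints would be appreciated.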

library(tidyverse)

df1 <- mtcars
df1$cyl  <- factor(df1$cyl)
df1$gear <- factor(df1$gear)

fm1 <-
  df1 %>%
  dplyr::group_by(gear) %>%
  dplyr::do(m1 = summary(aov(mpg ~ cyl, data = .)))

fm1$m1

> fm1$m1
[[1]]
            Df Sum Sq Mean Sq F value Pr(>F)  
cyl          2  69.03   34.52   4.596  0.033 *
Residuals   12  90.11    7.51                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

[[2]]
            Df Sum Sq Mean Sq F value Pr(>F)  
cyl          1  137.3   137.3   8.123 0.0172 *
Residuals   10  169.0    16.9                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

[[3]]
            Df Sum Sq Mean Sq F value Pr(>F)  
cyl          2  167.4   83.68   16.74 0.0564 .
Residuals    2   10.0    5.00                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

fm2 <-
  df1 %>%
  dplyr::group_by(gear) %>%
  tidyr::nest() %>%
  dplyr::mutate(m2 = purrr::map(.x = data, .f = ~ summary(aov(mpg ~ cyl, data = .)))) %>%
  tidyr::unnest()

1 answer:

Answer 0 (score: 3)

You can nest the data frame by group and then store each group's ANOVA summary in a new list column:

library(tidyverse)

df1 <- mtcars
df1$cyl <- factor(df1$cyl)  # treat cyl as a factor, as in the question

df_aov <- df1 %>%
  dplyr::group_by(gear) %>%
  tidyr::nest() %>%
  # fit one ANOVA per nested data frame; keep the summaries in a list column
  dplyr::mutate(aov_results = purrr::map(data, ~ summary(aov(mpg ~ cyl, data = .x))))

df_aov$aov_results[[1]]
#>             Df Sum Sq Mean Sq F value Pr(>F)  
#> cyl          1  137.3   137.3   8.123 0.0172 *
#> Residuals   10  169.0    16.9                 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Created on 2018-10-02 by the reprex package (v0.2.1)
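
The list column above holds summary objects, which is likely why calling tidyr::unnest() on it, as attempted in the question, does not produce a flat table. If a single data frame of results is the goal, one possible route is to convert each fitted model to a tibble first. The sketch below is not part of the original answer. It assumes the broom package is installed (it ships with the tidyverse but is not attached by library(tidyverse)) and tidyr 1.0 or later for the cols argument of unnest(). The names fm_tidy, fit and tidied are made up for illustration.

```r
library(tidyverse)

df1 <- mtcars
df1$cyl <- factor(df1$cyl)  # cyl as a factor, as in the question

fm_tidy <- df1 %>%
  dplyr::group_by(gear) %>%
  tidyr::nest() %>%
  dplyr::mutate(
    # one fitted aov object per gear level
    fit = purrr::map(data, ~ aov(mpg ~ cyl, data = .x)),
    # broom::tidy() turns each aov fit into a tibble with one row per term
    tidied = purrr::map(fit, broom::tidy)
  ) %>%
  dplyr::select(gear, tidied) %>%
  # assumption: tidyr >= 1.0, where unnest() takes a cols argument
  tidyr::unnest(cols = tidied) %>%
  dplyr::ungroup()

fm_tidy
```

Each gear level should contribute two rows (cyl and Residuals), with columns such as df, sumsq, meansq, statistic and p.value, so the per-group p-values can be filtered or compared with ordinary dplyr verbs.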