broom::tidy not working with purrr::map_dfr

Asked: 2018-04-05 03:39:01

Tags: r tidyverse purrr broom

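I am trying to create a data frame containing tidy results from the broom package for Wilcoxon tests. I have been able to write code that runs the test for every combination of the grouping variables and stores the test results in a list column. Now I want to use purrr to tidy each test result and combine them into a single data frame, but this does not work and I can't figure out why.

Here is a fully reproducible example: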

library(tidyverse)

# converting iris dataframe to long format
iris_long <- datasets::iris %>%
  dplyr::mutate(.data = ., id = dplyr::row_number(x = Species)) %>%
  tidyr::gather(
    data = .,
    key = "condition",
    value = "value",
    Sepal.Length:Petal.Width,
    convert = TRUE,
    factor_key = TRUE
  ) %>%
  tidyr::separate(
    col = "condition",
    into = c("part", "measure"),
    sep = "\\.",
    convert = TRUE
  ) %>%
  tibble::as_data_frame(x = .)

# running Wilcox test on each level of factors Species and measure
results_df <- iris_long %>%
  mutate_if(.tbl = ., .predicate = is.character, .funs = as.factor) %>%
  dplyr::group_by(.data = ., Species, measure) %>%
  tidyr::nest(data = .) %>%
  # running two-sample Wilcoxon tests on each individual group with purrr
  dplyr::mutate(results = data %>% purrr::map(
    .x = .,
    .f = ~ stats::wilcox.test(
      formula = value ~ part,
      mu = 0,
      alternative = "two.sided",
      conf.level = 0.95,
      na.action = na.omit,
      conf.int = TRUE,
      data = (.)
    )
  )
  ) %>%
  dplyr::select(.data = ., results)

# check the newly created list column containing results from 6 combinations
results_df
#> # A tibble: 6 x 1
#>   results    
#>   <list>     
#> 1 <S3: htest>
#> 2 <S3: htest>
#> 3 <S3: htest>
#> 4 <S3: htest>
#> 5 <S3: htest>
#> 6 <S3: htest>
# so the function was executed for all groups

# check tidied results for first group
broom::tidy(x = results_df$results[[1]])
#>    estimate statistic      p.value conf.low conf.high
#> 1 -3.500078         0 5.515865e-18 -3.60004 -3.400007
#>                                              method alternative
#> 1 Wilcoxon rank sum test with continuity correction   two.sided

# creating a dataframe by tidying results from all results in results_df list
purrr::map_dfr(.x = results_df,
               .f = ~ broom::tidy(x = .),
               .id = "group")
#> Warning in is.na(x): is.na() applied to non-(list or vector) of type 'NULL'
#> Error in names(object) <- nm: 'names' attribute [1] must be the same length as the vector [0]

Created on 2018-04-04 by the reprex package (v0.2.0).

1 Answer:

Answer 0: (score: 1)

You need to pass the list column itself, not the whole data frame:

.x = results_df$results
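
The reason is that purrr treats a data frame as a list of its columns, so with .x = results_df the whole results column is handed to broom::tidy() in one piece instead of one htest object at a time. Below is a minimal sketch of the corrected pattern. The list tests is a stand-in built from the built-in iris data purely for illustration; it is not the results_df from the question.

```r
library(purrr)
library(broom)

# Stand-in for results_df$results (an assumption for this sketch):
# a named list holding one htest object per species
by_species <- split(iris, iris$Species)
tests <- map(
  by_species,
  ~ stats::wilcox.test(.x$Sepal.Length, .x$Petal.Length, conf.int = TRUE)
)

# Map over the list itself, so tidy() receives one htest at a time;
# .id stores the list names in a "group" column
tidy_df <- map_dfr(.x = tests, .f = broom::tidy, .id = "group")
tidy_df
```

With the objects from the question, the equivalent call should be purrr::map_dfr(.x = results_df$results, .f = broom::tidy, .id = "group"). That list is unnamed, so group would hold the positions 1 to 6 rather than group labels.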

If you are interested in an alternative approach, you can use split() to shorten the code.

iris_long %>% 
  split(list(.$Species, .$measure)) %>% 
  map_dfr(~wilcox.test(value ~ part, 
                       na.action = na.omit, 
                       conf.int = TRUE, 
                       data = .x) %>% broom::tidy())
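
The split() version does not record which row belongs to which group. A hedged sketch, assuming the names that split() generates (such as setosa.Length) are acceptable as labels, passes .id to map_dfr(). The iris_long below is a simplified stand-in built with base R, not the exact object from the question.

```r
library(purrr)
library(broom)

# Simplified stand-in for iris_long (an assumption for this sketch):
# one row per measurement, with part and measure as separate columns
iris_long <- rbind(
  data.frame(Species = iris$Species, part = "Sepal", measure = "Length",
             value = iris$Sepal.Length),
  data.frame(Species = iris$Species, part = "Petal", measure = "Length",
             value = iris$Petal.Length),
  data.frame(Species = iris$Species, part = "Sepal", measure = "Width",
             value = iris$Sepal.Width),
  data.frame(Species = iris$Species, part = "Petal", measure = "Width",
             value = iris$Petal.Width)
)

# split() names each piece "<Species>.<measure>", e.g. "setosa.Length"
groups <- split(iris_long, list(iris_long$Species, iris_long$measure))

# .id copies those names into a "group" column of the combined result
tidy_by_group <- map_dfr(
  groups,
  ~ broom::tidy(stats::wilcox.test(value ~ part, data = .x, conf.int = TRUE)),
  .id = "group"
)
tidy_by_group
```

If separate Species and measure columns are needed, the group column can be split on the dot afterwards, for example with tidyr::separate().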