From purrr

Time: 2018-10-30 14:58:35

Tags: r function dplyr tidyr purrr

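I wrote a function that runs several meta-analyses, one for each subset of cases in a data set. My goal is to extract specific indices from the objects that a particular function creates.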

library(magrittr)
library(dplyr)
library(purrr)
library(tidyr)
library(meta)

df <- "Scale ID   tau tau_SE
         1  1  0.41   0.17
         1  2 -0.09   0.19
         1  3  0.11   0.24
         2  5  0.78   0.26
         2  8  0.76   0.24
         2  9  0.23   0.17
         3  1  0.21   0.17
         3 12  0.16   0.16
         3 13  0.20   0.25"

df <- read.table(text = df, header = TRUE)

df %>% 
  group_by(Scale) %>% 
  nest() %>% 
  mutate(m = map(data, function(d) metagen(d$tau, d$tau_SE)))
#> # A tibble: 3 x 3
#>   Scale data             m            
#>   <int> <list>           <list>       
#> 1     1 <tibble [3 x 3]> <S3: metagen>
#> 2     2 <tibble [3 x 3]> <S3: metagen>
#> 3     3 <tibble [3 x 3]> <S3: metagen>

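As you can see, I group the data by Scale and then apply meta::metagen to each group via purrr::map. A metagen object is a list of indices, and I would like to extract a subset of them. The names I need are listed below.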

fits <- c("k", "TE.fixed", "lower.fixed", "upper.fixed", "zval.fixed", "pval.fixed", "tau", "H", "I2", "Q", "df.Q", "pval.Q")

Could you help me finish the code I started? Ideally I would like to do this with purrr, so that the code stays consistent in style.
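A minimal sketch of one way to do this: a fitted object is a list, so a character vector such as `fits` can index it by name. To keep the example independent of the installed version of meta, two plain named lists (`fake_fit_1` and `fake_fit_2`, with made-up values) stand in for metagen objects. Whether every name in `fits` exists in a real fit depends on the meta version, so that part is an assumption to check.

```r
library(dplyr)
library(purrr)
library(tibble)

# Names of the elements to keep (a shortened version of `fits`).
keep <- c("k", "TE.fixed", "I2")

# Hypothetical stand-ins for metagen objects: named lists with extra elements.
fake_fit_1 <- list(k = 3, TE.fixed = 0.17, I2 = 0.36, method = "Inverse")
fake_fit_2 <- list(k = 3, TE.fixed = 0.48, I2 = 0.61, method = "Inverse")
fake_fits  <- list(fake_fit_1, fake_fit_2)

# `[` with a character vector keeps only the named elements;
# as_tibble() turns each named list of scalars into a one-row tibble,
# and bind_rows() stacks the rows, matching columns by name.
extracted <- bind_rows(map(fake_fits, function(fit) as_tibble(fit[keep])))

extracted
```

Because the elements keep their names, the columns are labelled and line up across fits without a separate renaming step.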

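Update: Following Camille's suggestion, I was able to extract the indices I need. Unfortunately, when I unnest the data the variables are not labelled correctly, and the result is very messy overall because the columns do not line up across the different scales. This may be a very silly question, but I cannot solve it myself.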

Enablers %>% 
   group_by(Scale) %>% 
   nest() %>% 
   mutate(m = map(data, function(d) metagen(d$tau, d$tau_SE)),
          fitM = m %>% map(function(fit) c(fit$k, fit$TE.fixed, fit$lower.fixed, fit$upper.fixed, fit$zval.fixed, fit$pval.fixed, fit$tau, fit$H, fit$I2, fit$Q, fit$df.Q, fit$pval.Q))) %>% 
   mutate(fitM = invoke_map(tibble, fitM)) %>% 
   unnest(fitM)

# A tibble: 3 x 38
# Scale data  m       `4` `0.230417034444~ `0.036674086272~ `0.424159982616~ `2.330970459557~ `0.019754917401~ `0.171369905853~ `1.317529830742~
#  <dbl> <lis> <lis> <dbl>            <dbl>            <dbl>            <dbl>            <dbl>            <dbl>            <dbl>            <dbl>
#1     1 <tib~ <S3:~     4            0.230           0.0367            0.424             2.33           0.0198            0.171             1.32
#2     2 <tib~ <S3:~     4           NA              NA                NA                NA             NA                NA                NA   
#3     3 <tib~ <S3:~    NA           NA              NA                NA                NA             NA                NA                NA   
# ... with 27 more variables: `0.423924923834129` <dbl>, `5.20765456469115` <dbl>, `3` <dbl>, `0.157208039476882` <dbl>, `5` <dbl>,
#   `0.479867456084876` <dbl>, `0.271159257236615` <dbl>, `0.688575654933137` <dbl>, `4.50640145652835` <dbl>, `6.59362853436493e-06` <dbl>,
#   `0.185286333523807` <dbl>, `1.25125870702537` <dbl>, `0.361286971763421` <dbl>, `6.26259340762719` <dbl>, `0.180377144245142` <dbl>,
#   `8` <dbl>, `0.32250031966557` <dbl>, `0.171296573142346` <dbl>, `0.473704066188793` <dbl>, `4.18037929668686` <dbl>,
#   `2.91023257890799e-05` <dbl>, `0.0517056311225353` <dbl>, `1.02682517258907` <dbl>, `0.0515662797795363` <dbl>, `7.38058954543798` <dbl>,
#   `7` <dbl>, `0.39035653653684` <dbl>
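The output above suggests a likely cause: each `fitM` is an unnamed vector, so the column names appear to be derived from the values themselves, which differ for every scale. A small sketch of the difference that naming makes, using made-up numbers rather than the real fits (`unnamed`, `named` and `aligned` are illustrative names only):

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical per-scale results, one unnamed numeric vector per scale.
unnamed <- list(c(4, 0.23), c(5, 0.48))

# Give every vector the same element names.
named <- map(unnamed, set_names, c("N", "Est"))

# Named vectors become one-row tibbles whose columns line up across scales.
aligned <- bind_rows(map(named, function(v) as_tibble(as.list(v))))

aligned
```

With shared names, `bind_rows()` matches columns by name, so each scale fills the same two columns instead of creating new ones.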

1 answer:

Answer 0 (score: 0)

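After studying the code, I came up with the following solution. There may be better (more elegant) options, but at least this one works. Alternative solutions are very welcome!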

Enablers %>% 
  group_by(Scale) %>% 
  nest() %>% 
  # fit one meta-analysis per scale, then collect the statistics of interest
  mutate(m = map(data, function(d) metagen(d$tau, d$tau_SE)),
         fitM = m %>% map(function(fit) c(fit$k, fit$TE.fixed, fit$seTE.fixed, fit$lower.fixed, fit$upper.fixed, fit$zval.fixed, fit$pval.fixed, fit$tau, fit$H, fit$I2, fit$Q, fit$df.Q, fit$pval.Q))) %>%
  # one row per statistic within each scale
  unnest(fitM) %>%
  group_by(Scale) %>%
  # label the statistics, in the same order as in c() above
  mutate(names = c("N","Est", "SE", "CI_lower","CI_upper","z", "p", "tau", "H", "I2", "Q", "Q_df", "Q_p")) %>%
  ungroup() %>% 
  as.data.frame() %>% 
  # reshape to one row per scale and one column per statistic
  spread(., names, fitM)
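As a possible alternative for the final reshaping step, tidyr 1.0.0 and later provide `pivot_wider()` as the successor to `spread()`. The sketch below uses a small made-up long table (`long`, with hypothetical values) in place of the real unnested results, so it only illustrates the reshape, not the meta-analysis itself.

```r
library(tidyr)
library(tibble)

# Hypothetical long-format results: one row per scale and statistic.
long <- tibble(
  Scale = c(1, 1, 2, 2),
  names = c("N", "Est", "N", "Est"),
  fitM  = c(3, 0.17, 3, 0.48)
)

# pivot_wider() turns the statistic labels into columns, one row per scale.
wide <- pivot_wider(long, names_from = names, values_from = fitM)

wide
```

The column order follows the first appearance of each label in `names`, so the statistics stay in the order they were extracted rather than being sorted alphabetically.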