Question

Here is my data:

as_tibble(data)
# A tibble: 40 x 4
   Trt        V1      V2      V3
   <fct>   <dbl>   <dbl>   <dbl>
 1 d1    0.0105  0.00940 0.0174 
 2 d1    0.0199  0.00897 0.00279
 3 d1    0.00836 0.0104  0.00816
 4 d1    0.00960 0.0131  0.00404
 5 d1    0.00527 0.0123  0.00863
 6 d1    0.0136  0.0115  0.0130 
 7 d1    0.0216  0.00591 0.0106 
 8 d1    0.00558 0.00890 0.00964
 9 d2    0.0193  0.0116  0.0199 
10 d2    0.0172  0.0165  0.0582 
# ... with 30 more rows

I want to run aov for each V* column against Trt, and then compute the further statistics given in the function f2:

f2 <- function(y, Trt){

  dt1 <- aov(y ~ Trt) %>%
    emmeans(specs = "Trt")

  dt2 <- coef(pairs(dt1)) %>%
    select(2:5)

  d3 <- contrast(dt1, dt2, adjust = "Dunnett") %>%
    summary %>%
    pull(p.value)

  return(d3)
}

When I run f2 on one V* column at a time together with Trt, I get the expected results:

f2(data$V1, data$Trt)
[1] 5.450331e-01 5.936861e-01 2.302477e-02 7.882583e-15

f2(data$V2, data$Trt)
[1] 5.217088e-01 1.722111e-01 4.030167e-05 4.439782e-13

I want to apply f2 to every column whose name starts with V. This code gives an error:

map2_dfr(data %>% select_if(is.double), data$Trt, f2)
Error: Mapped vectors must have consistent lengths:
* `.x` has length 3
* `.y` has length 40

I don't understand why map2_dfr doesn't pick one column at a time. Any help?

Answer 1

I would do something like this. First, I load some packages and create some random data with the same structure as yours.

library(dplyr, warn.conflicts = FALSE)
library(tidyr)
library(purrr)
library(emmeans)

data <- tibble::tibble(
  Trt = factor(rep(c("A", "B", "C", "D", "E"), each = 8)), 
  V1 = rnorm(40), 
  V2 = rnorm(40), 
  V3 = rnorm(40)
)

I slightly modified the definition of f2. It now takes a data frame and a character string representing the aov formula as input.

f2 <- function(data, aov_formula){

  dt1 <- aov(as.formula(aov_formula), data) %>%
    emmeans(specs = "Trt")

  dt2 <- coef(pairs(dt1)) %>%
    select(2:5)

  d3 <- contrast(dt1, dt2, adjust = "Dunnett") %>%
    summary %>%
    pull(p.value)

  d3
}
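One way to check the new signature before reshaping anything is to call f2 once per column with a formula string built from the column name. The sketch below is an illustration, not part of the original answer: it rebuilds random data under the name `wide` (a name chosen here), repeats the f2 defined above, and loops over the V* column names. It also shows why this signature avoids the original error: each call receives the whole data frame, so nothing has to be paired element by element.

```r
library(dplyr, warn.conflicts = FALSE)
library(purrr)
library(emmeans)

# Random stand-in data in the wide layout; `wide` is a name chosen for this sketch.
wide <- tibble::tibble(
  Trt = factor(rep(c("A", "B", "C", "D", "E"), each = 8)),
  V1 = rnorm(40),
  V2 = rnorm(40),
  V3 = rnorm(40)
)

# Same f2 as defined above: a data frame plus a formula string.
f2 <- function(data, aov_formula){
  dt1 <- aov(as.formula(aov_formula), data) %>%
    emmeans(specs = "Trt")
  dt2 <- coef(pairs(dt1)) %>%
    select(2:5)
  contrast(dt1, dt2, adjust = "Dunnett") %>%
    summary %>%
    pull(p.value)
}

# Names of the response columns (everything starting with "V").
v_cols <- grep("^V", names(wide), value = TRUE)

# One f2 call per column; the result is a named list of four p-values each,
# which is then turned into a tibble with one column per variable.
p_by_col <- v_cols %>%
  set_names() %>%
  map(~ f2(wide, paste(.x, "~ Trt"))) %>%
  tibble::as_tibble()
p_by_col
```

Because the data are random and no seed is set, the p-values differ from run to run; only the shape (four rows, one column per V* variable) is stable.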

Next, I "tidy" your data with gather, as follows:

data <- data %>% 
  gather("index", "y", -Trt)
data
#> # A tibble: 120 x 3
#>    Trt   index       y
#>    <fct> <chr>   <dbl>
#>  1 A     V1     0.347 
#>  2 A     V1    -0.0837
#>  3 A     V1     0.389 
#>  4 A     V1     0.0358
#>  5 A     V1    -1.45  
#>  6 A     V1     0.0621
#>  7 A     V1     0.449 
#>  8 A     V1    -1.32  
#>  9 B     V1    -0.946 
#> 10 B     V1    -0.0518
#> # ... with 110 more rows

so that I can now use the nest/map approach to apply f2 to each V* variable:

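# Nest the rows of each variable into a list-column, run f2 on each piece,
# then unnest the four p-values per variable into rows.
# (On tidyr >= 1.0, writing nest(data = -index) avoids a warning about unnamed arguments.)
data %>% 
  nest(-index) %>% 
  mutate(res = map(data, f2, aov_formula = "y ~ Trt")) %>% 
  unnest(res)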
#> # A tibble: 12 x 2
#>    index   res
#>    <chr> <dbl>
#>  1 V1    0.996
#>  2 V1    0.986
#>  3 V1    0.781
#>  4 V1    0.721
#>  5 V2    1.000
#>  6 V2    0.798
#>  7 V2    0.965
#>  8 V2    1.000
#>  9 V3    0.949
#> 10 V3    0.551
#> 11 V3    0.546
#> 12 V3    0.670

Created on 2019-07-23 by the reprex package (v0.3.0)
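As for the original error: map2 and its variants walk `.x` and `.y` in parallel, pairing element i of one with element i of the other. A data frame is a list of columns, so `.x` has length 3 here, while `data$Trt` has length 40, and the two cannot be paired. If you prefer to keep a two-argument function, you can map over the columns only and pass Trt as a fixed extra argument. The sketch below uses `aov_p`, a hypothetical stand-in that returns just the overall F-test p-value, so that it runs without emmeans; the two-argument f2 from the question should slot in the same way with map() in place of map_dbl(), since it returns four values per column.

```r
library(dplyr, warn.conflicts = FALSE)
library(purrr)

# Random data in the question's layout (values are not the asker's).
set.seed(1)
wide <- tibble::tibble(
  Trt = factor(rep(c("d1", "d2", "d3", "d4", "d5"), each = 8)),
  V1 = rnorm(40),
  V2 = rnorm(40),
  V3 = rnorm(40)
)

# Hypothetical stand-in for f2: p-value of the overall F-test for y ~ Trt.
aov_p <- function(y, Trt) {
  summary(aov(y ~ Trt))[[1]][["Pr(>F)"]][1]
}

# map_dbl() iterates over the three V* columns only;
# Trt is passed unchanged to every call through `...`.
res <- wide %>%
  select(starts_with("V")) %>%
  map_dbl(aov_p, Trt = wide$Trt)
res
```

The result is a named numeric vector with one entry per V* column.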

If you don't like the shape of the data frame returned by the nest/map pipeline, you can reshape it with gather and spread.
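For instance, a minimal sketch of spreading the long result into one column per variable might look like this. The p-values are copied from the output shown above; `res_long`, `res_wide` and `contrast_id` are names invented for the illustration, with `contrast_id` simply recording the position of each p-value within its variable.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# p-values copied from the nest/map output shown above.
res_long <- tibble::tibble(
  index = rep(c("V1", "V2", "V3"), each = 4),
  res = c(0.996, 0.986, 0.781, 0.721,
          1.000, 0.798, 0.965, 1.000,
          0.949, 0.551, 0.546, 0.670)
)

# spread() needs a key that identifies rows across variables,
# so number the p-values within each variable first.
res_wide <- res_long %>%
  group_by(index) %>%
  mutate(contrast_id = row_number()) %>%
  ungroup() %>%
  spread(index, res)
res_wide
```

This gives four rows (one per contrast) with the columns contrast_id, V1, V2 and V3. On tidyr 1.0 or later, pivot_wider(names_from = index, values_from = res) is the newer equivalent of the spread() call.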
