Applying a dplyr function that contains group_by to a list of data frames in R

Time: 2019-04-23 16:16:18

Tags: r function dplyr lapply datalist

I have a data list like this:

list(structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), species = structure(c(3L, 3L, 1L, 3L, 3L, 2L, 3L, 1L, 3L, 
1L, 3L, 1L, 3L, 1L, 2L, 4L, 1L, 4L, 2L, 3L, 3L, 3L, 2L, 2L), .Label = 
c("Apiaceae", 
"Ceyperaceae", "Magnoliaceae", "Vitaceae"), class = "factor"), 
N = c(2L, 2L, 3L, 2L, 2L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 1L, 4L, 3L, 4L, 1L, 2L, 2L, 2L, 1L, 1L)), class = "data.frame", 
row.names = c(NA, 
-24L)), structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L), species = structure(c(3L, 3L, 1L, 3L, 3L, 2L, 3L, 1L, 3L, 
1L, 3L, 1L, 3L, 1L, 2L, 4L, 1L, 4L, 2L, 3L, 3L, 3L, 2L, 2L), .Label = 
c("Apiaceae", 
"Ceyperaceae", "Magnoliaceae", "Vitaceae"), class = "factor"), 
N = c(2L, 2L, 3L, 2L, 2L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 1L, 4L, 3L, 4L, 1L, 2L, 2L, 2L, 1L, 1L)), class = "data.frame", 
row.names = c(NA, 
-24L)))

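I want to apply my.fun, which is written with the dplyr package, to this data list. The function first groups the data by "group" and then computes the output of a function that already exists in R; I then apply my.fun to every element of the data list. But the output is 0: nothing is returned. Could you help me find the error?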

 my.fun <- function(x, y){
    group_by(x, !!as.name(group)) %>%
    mutate(out = diversity(N, "shannon")) 
 }

check <- lapply(colnames(list), function(x) {
  my.fun(x$group, x$N)
}) 

Thank you very much!

1 answer:

Answer 0: (score: 2)

Assuming we want to pass the grouping column and the column on which diversity is computed as strings:

library(tidyverse)
library(vegan)
my.fun <- function(data, grpCol, divCol) {
       data %>% 
           group_by_at(grpCol) %>%
           mutate(out = diversity(!! rlang::sym(divCol), "shannon"))
           #or use mutate_at
           # mutate_at(vars(divCol), list(out = ~ diversity(., "shannon")))
    }

# lst1 is the list of two data frames shown in the question
map(lst1, my.fun, grpCol = "group", divCol = "N")
#[[1]]
# A tibble: 24 x 4
# Groups:   group [3]
#   group species          N   out
#   <int> <fct>        <int> <dbl>
# 1     1 Magnoliaceae     2  1.75
# 2     1 Magnoliaceae     2  1.75
# 3     1 Apiaceae         3  1.75
# 4     1 Magnoliaceae     2  1.75
# 5     1 Magnoliaceae     2  1.75
# 6     1 Ceyperaceae      1  1.75
# 7     2 Magnoliaceae     2  2.06
# 8     2 Apiaceae         3  2.06
# 9     2 Magnoliaceae     2  2.06
#10     2 Apiaceae         3  2.06
# … with 14 more rows

#[[2]]
# A tibble: 24 x 4
# Groups:   group [3]
#   group species          N   out
#   <int> <fct>        <int> <dbl>
# 1     1 Magnoliaceae     2  1.75
# 2     1 Magnoliaceae     2  1.75
# 3     1 Apiaceae         3  1.75
# 4     1 Magnoliaceae     2  1.75
# 5     1 Magnoliaceae     2  1.75
# 6     1 Ceyperaceae      1  1.75
# 7     2 Magnoliaceae     2  2.06
# 8     2 Apiaceae         3  2.06
# 9     2 Magnoliaceae     2  2.06
#10     2 Apiaceae         3  2.06
# … with 14 more rows
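
As a sanity check on the `out` column, the Shannon index can be recomputed by hand in base R. The helper below, `shannon_manual`, is a hypothetical name introduced here for illustration and is not part of vegan. It assumes the natural-log definition H = -sum(p * log(p)), which is the default used by `diversity(x, "shannon")`.

```r
# Hypothetical helper (not from vegan): Shannon index with natural log.
shannon_manual <- function(n) {
  p <- n / sum(n)   # convert counts to proportions
  p <- p[p > 0]     # drop zeros so 0 * log(0) does not produce NaN
  -sum(p * log(p))
}

# N values of group 1 and group 2, copied from the example data
g1 <- c(2, 2, 3, 2, 2, 1)
g2 <- c(2, 3, 2, 3, 2, 3, 2, 3)

round(shannon_manual(g1), 2)  # 1.75, matching out for group 1
round(shannon_manual(g2), 2)  # 2.06, matching out for group 2
```

Because `mutate()` runs once per group, the whole `N` vector of a group is passed to `diversity()`, and the single resulting value is repeated on every row of that group.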

Note that the two list elements are identical, which is why both results match:

identical(lst1[[1]], lst1[[2]])
#[1] TRUE
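
A likely reason the attempt in the question returned nothing: `colnames()` applied to a plain list returns `NULL`, so `lapply()` has nothing to iterate over and gives back an empty list. The sketch below reproduces this in base R with a small stand-in list; `toy_list` and its values are made up for illustration.

```r
# Stand-in for the data list (hypothetical toy data)
toy_list <- list(
  data.frame(group = c(1L, 1L, 2L), N = c(2L, 3L, 1L)),
  data.frame(group = c(1L, 2L, 2L), N = c(1L, 1L, 4L))
)

# A plain list has no column names, so this is NULL
cn <- colnames(toy_list)
is.null(cn)

# lapply() over NULL calls the function zero times
empty <- lapply(cn, function(x) x)
length(empty)

# Iterating over the list itself passes each data frame in turn
rows <- lapply(toy_list, function(df) nrow(df))
```

Two further issues in the question's code are that `my.fun(x$group, x$N)` passes bare vectors rather than a data frame, and that `as.name(group)` refers to an object named `group` that is not defined inside the function. Passing the column names as strings, as in the answer, avoids both.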