How to combine lapply with dplyr in a function

Time: 2019-10-20 06:03:25

Tags: r function dplyr lapply plyr

Below is a sample data frame I created, along with the expected output.

library(dplyr)

df = data.frame(color = c("Yellow", "Blue", "Green", "Red", "Magenta"),
                values = c(24, 24, 34, 45, 49),
                Quarter = c("Period1","Period2" , "Period3", "Period3", "Period1"),
                Market = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace"))


dfXQuarter = df %>% group_by(Quarter) %>% summarise(values = sum(values)) %>%
  mutate(cut = "Quarter") %>% data.frame()

colnames(dfXQuarter)[1] = "Grouping"

dfXMarket = df %>% group_by(Market) %>% summarise(values = sum(values)) %>% 
  mutate(cut = "Market")%>% data.frame()
colnames(dfXMarket)[1] = "Grouping"


df_all = rbind(dfXQuarter, dfXMarket)

To keep things concise, I now want to wrap this in a function and use lapply. Here is my attempt:

list = c("Market", "Quarter")


df_all <- do.call(rbind, lapply(list, function(x){
  df_l= df %>% group_by(x) %>% 
    summarise(values = sum(values)) %>% 
    mutate(cut= x) %>% 
    data.frame()
   colnames(df_l)[df_l$x] = "Grouping"
  df_l
}))

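This code throws an error.

I need the output to be an exact copy of 'df_all' above so I can do further processing on it.

How do I write this function correctly?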

2 Answers:

Answer 0: (score: 3)

We can use purrr::map_dfr

library(dplyr)
library(purrr) 
#Don't use R built-in names, e.g. list, as variable names
lst <- c("Market", "Quarter")
#Use map if you need the output as a list
map_dfr(lst, ~df %>% group_by("Grouping"=!!sym(.x)) %>% 
                                   summarise(values = sum(values)) %>%
                                   mutate(cut = .x) %>% 
                                   #To avoid the warning message from bind_rows
                                   mutate_if(is.factor, as.character))

# A tibble: 6 x 3
  Grouping      values cut    
  <chr>          <dbl> <chr>  
1 Camden            24 Market 
2 DansFireplace     83 Market 
3 StreetA           69 Market 
4 Period1           73 Quarter
5 Period2           24 Quarter
6 Period3           79 Quarter
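For anyone who would rather stay with lapply, the same idea can be sketched as below. This is only one possible variant, not the answer's own code: it assumes dplyr's `.data` pronoun to look up the column by its string name, and `grouping_cols` is a name made up for the illustration. `stringsAsFactors = FALSE` is set so that `bind_rows` has no factors to coerce.

```r
library(dplyr)

# Sample data from the question (strings kept as character, not factor)
df <- data.frame(
  color   = c("Yellow", "Blue", "Green", "Red", "Magenta"),
  values  = c(24, 24, 34, 45, 49),
  Quarter = c("Period1", "Period2", "Period3", "Period3", "Period1"),
  Market  = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace"),
  stringsAsFactors = FALSE
)

# Hypothetical name for the vector of grouping columns
grouping_cols <- c("Market", "Quarter")

df_all <- bind_rows(lapply(grouping_cols, function(x) {
  df %>%
    # .data[[x]] looks up the column whose name is stored in x,
    # and the result is renamed to "Grouping" in the same step
    group_by(Grouping = .data[[x]]) %>%
    summarise(values = sum(values)) %>%
    # record which column this block of rows was grouped by
    mutate(cut = x)
}))
```

The result is a tibble; append `%>% data.frame()` if a plain data frame is needed.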

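The original attempt from the question can be fixed as follows:

  1. Change group_by(x) to group_by_at(x), because x is a string here, not a bare column name.
  2. Use colnames(df_l)[colnames(df_l)==x] <- "Grouping" to rename the grouping variable.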
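Putting those two changes together, the question's lapply version might look like the sketch below. The vector is renamed to `grouping_cols` (a name chosen here only to avoid masking `list`), and "Quarter" is listed first so the rows come out in the same order as the hand-built `df_all`.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  color   = c("Yellow", "Blue", "Green", "Red", "Magenta"),
  values  = c(24, 24, 34, 45, 49),
  Quarter = c("Period1", "Period2", "Period3", "Period3", "Period1"),
  Market  = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace"),
  stringsAsFactors = FALSE
)

# Hypothetical name; the question called this vector `list`
grouping_cols <- c("Quarter", "Market")

df_all <- do.call(rbind, lapply(grouping_cols, function(x) {
  df_l <- df %>%
    group_by_at(x) %>%                 # fix 1: x is a string, so use group_by_at
    summarise(values = sum(values)) %>%
    mutate(cut = x) %>%
    data.frame()
  # fix 2: rename the grouping column by matching its name
  colnames(df_l)[colnames(df_l) == x] <- "Grouping"
  df_l
}))
```

Note that group_by_at is superseded in recent dplyr releases; it still works, but newer code would typically use across(all_of(x)) or the .data pronoun instead.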

Answer 1: (score: 0)

Not as pretty, but it works and doesn't require any tidyverse functions:

groupwise_summation <- function(df, grouping_vecs){


  # Split, apply, combine: 

  tmpdf <- do.call(rbind, lapply(split(df, df[,grouping_vecs]), function(x){sum(x$values)}))

  # Clean up the df: 

  # Build the columns directly; going through cbind() would coerce the sums to character
  data.frame(cut = row.names(tmpdf), value = as.numeric(tmpdf), row.names = NULL)


}


# Apply and combine:

df_all <- rbind(groupwise_summation(df, c("Quarter")), groupwise_summation(df, c("Market")))


# Note inside the c(), you can use multiple grouping variables.
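
As a rough illustration of the multi-variable case mentioned in that note, here is a base R sketch. It is a variant rather than the answer's exact function: `groupwise_summation_multi` is a hypothetical name, and it assumes `drop = TRUE` in `split()` so that combinations that never occur in the data (which would otherwise show up with a sum of 0) are discarded.

```r
# Sample data from the question
df <- data.frame(
  color   = c("Yellow", "Blue", "Green", "Red", "Magenta"),
  values  = c(24, 24, 34, 45, 49),
  Quarter = c("Period1", "Period2", "Period3", "Period3", "Period1"),
  Market  = c("Camden", "StreetA", "DansFireplace", "StreetA", "DansFireplace"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sum `values` for every observed combination of grouping_vecs
groupwise_summation_multi <- function(df, grouping_vecs) {
  # With several columns, split() groups by their interaction; the group
  # labels are joined with "." (e.g. "Period1.Camden")
  groups <- split(df$values, df[, grouping_vecs], drop = TRUE)
  sums <- vapply(groups, sum, numeric(1))
  data.frame(cut = names(sums), value = unname(sums), row.names = NULL)
}

by_quarter_market <- groupwise_summation_multi(df, c("Quarter", "Market"))
```

In this sample every Quarter/Market pair occurs at most once, so each row of the result simply carries one of the original values.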