Can I create multiple columns from one group with mutate?

Time: 2017-07-28 03:33:20

Tags: r dplyr

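I want to group a data frame by a column and then apply a function that returns multiple columns to the grouped data. As an example, consider the following: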

Names = append(rep('Mark',10),rep('Joe',10))
Spend = rnorm(length(Names),50,0.5)

df <- data.frame(
  Names,
  Spend
)


get.mm <- function(data){
  return(list(median(data),mean(data)))
}

Here, get.mm returns a list of two numbers. I want to apply get.mm to df %>% group_by(Names) and get a result with two columns, one for each value the function returns.

The desired result would be

  Names   median    mean
  <fctr>    <dbl>   <dbl>
1    Joe 49.89284 49.9504
2   Mark 50.17244 50.0735

I have simplified the function here for demonstration purposes; I know that for this particular case I could do something like

df %>% group_by(Names) %>% summarise(median = median(Spend), mean = mean(Spend))

1 Answer:

Answer 0 (score: 1)

If you rewrite get.mm so that it returns a data frame, you can use group_by %>% do:

get.mm <- function(data){
    data.frame(median = median(data), mean = mean(data))
}

df %>% group_by(Names) %>% do(get.mm(.$Spend))
# Here . stands for the sub data frame belonging to one Name, and .$Spend passes
# that group's Spend column to the function
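To make it more concrete what . and .$Spend refer to, the same split-apply-combine idea can be sketched in base R. This is only an illustration and not how dplyr implements do(); the names pieces and result are made up for this sketch.

```r
set.seed(1)
Names <- c(rep("Mark", 10), rep("Joe", 10))
Spend <- rnorm(length(Names), 50, 0.5)
df <- data.frame(Names, Spend)

# Data-frame-returning helper, as in the answer
get.mm <- function(data) {
  data.frame(median = median(data), mean = mean(data))
}

# One sub data frame per Name; each element plays the role of "." in do()
pieces <- split(df, df$Names)

# Apply get.mm to the Spend column of each piece, then stack the one-row results
per_group <- lapply(pieces, function(piece) get.mm(piece$Spend))
result <- data.frame(
  Names = names(per_group),
  do.call(rbind, per_group),
  row.names = NULL
)
result
```

Each element of pieces is a plain data frame holding the rows for a single name, which is why get.mm only ever sees one group's Spend values at a time.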

Reproducible example:

library(dplyr)  # provides %>%, group_by, do and summarise
set.seed(1)
Names = append(rep('Mark',10),rep('Joe',10))
Spend = rnorm(length(Names),50,0.5)
df <- data.frame(Names, Spend)

df %>% group_by(Names) %>% do(get.mm(.$Spend))

# A tibble: 2 x 3
# Groups:   Names [2]
#   Names   median     mean
#  <fctr>    <dbl>    <dbl>
#1    Joe 50.24594 50.12442
#2   Mark 50.12829 50.06610

df %>% group_by(Names) %>% summarise(median = median(Spend), mean = mean(Spend))

# A tibble: 2 x 3
#   Names   median     mean
#  <fctr>    <dbl>    <dbl>
#1    Joe 50.24594 50.12442
#2   Mark 50.12829 50.06610
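As a side note, and assuming dplyr 1.0.0 or later (newer than the version this answer was written against), summarise() can take the data-frame-returning get.mm directly: an unnamed data frame result is spread into one column per variable. The following is a sketch under that version assumption, with out being a made-up name.

```r
library(dplyr)

set.seed(1)
Names <- c(rep("Mark", 10), rep("Joe", 10))
Spend <- rnorm(length(Names), 50, 0.5)
df <- data.frame(Names, Spend)

# Same data-frame-returning helper as in the answer
get.mm <- function(data) {
  data.frame(median = median(data), mean = mean(data))
}

# Assumption: dplyr >= 1.0.0, where an unnamed data frame returned inside
# summarise() is unpacked into separate columns (here: median and mean)
out <- df %>%
  group_by(Names) %>%
  summarise(get.mm(Spend))
out
```

If get.mm has to keep returning a list, as in the question, giving the list elements names and wrapping the call in as.data.frame() inside summarise() should behave the same way, under the same version assumption.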