Group by and conditional summarise

Time: 2019-06-11 15:51:14

Tags: r dataframe dplyr

I have a data frame df. After group_by(id, Year, Month, new_used_ind) %>% summarise(n = n()), it looks like this:

id  Year   Month  new_used_ind   n
1   2001   apr     N             3
1   2001   apr     U             2
2   2002   mar     N             5
3   2003   mar     U             3
4   2004   july    N             4          
4   2004   july    U             2
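
For context, here is a minimal sketch of how a counts table of this shape might be produced. The raw data frame `raw` and its values are hypothetical (the question does not show the original data); only the column names and the group_by/summarise call come from the question.

```r
library(dplyr)

# Hypothetical raw data: one row per record (values invented for illustration)
raw <- data.frame(
  id           = c(1, 1, 1, 1, 1, 2),
  Year         = c(2001, 2001, 2001, 2001, 2001, 2002),
  Month        = c("apr", "apr", "apr", "apr", "apr", "mar"),
  new_used_ind = c("N", "N", "N", "U", "U", "N")
)

# Count the rows in each id/Year/Month/new_used_ind combination
counts <- raw %>%
  group_by(id, Year, Month, new_used_ind) %>%
  summarise(n = n()) %>%
  ungroup()  # drop the leftover grouping so later steps start clean

counts
```

Each row of `counts` then holds the number of raw records for one combination, which is the `n` column the question starts from.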

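I want to sum n to get a total for each id, Year and Month, but I also want a new column that totals only the rows where new_used_ind is 'N'.

Something like this: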

id  Year   Month  Total_New   total
1   2001   apr     3            5
2   2002   mar     5            8
4   2004   july    4            6

1 answer:

Answer 0 (score: 1)

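library(dplyr)

# Recreate the counts table from the question
df <- read.table(text= "id  Year   Month  new_used_ind   n
1   2001   apr     N             3
1   2001   apr     U             2
2   2002   mar     N             5
3   2003   mar     U             3
4   2004   july    N             4
4   2004   july    U             2", header = TRUE)

df %>%
  group_by(id, Year, Month) %>%
  # n * (new_used_ind == "N") zeroes out the non-'N' rows before summing
  mutate(total_New=sum(n*(new_used_ind=="N"))) %>% 
  # total of n over every row in the group
  mutate(total_n=sum(n)) %>% 
  # both totals are constant within a group, so mean() collapses them to one row
  summarise_at(c("total_New", "total_n"), mean)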

#> # A tibble: 4 x 5
#> # Groups:   id, Year [4]
#>      id  Year Month total_New total_n
#>   <int> <int> <fct>     <dbl>   <dbl>
#> 1     1  2001 apr           3       5
#> 2     2  2002 mar           5       5
#> 3     3  2003 mar           0       3
#> 4     4  2004 july          4       6

Created on 2019-06-11 by the reprex package (v0.3.0)
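
The same totals can probably be reached more directly by computing both sums inside a single summarise() call, subsetting n to the 'N' rows for the conditional total. This is a sketch of an alternative, not the answer author's code; the name `result` is introduced here for illustration, and the data is the table from the question.

```r
library(dplyr)

df <- read.table(text = "id Year Month new_used_ind n
1 2001 apr  N 3
1 2001 apr  U 2
2 2002 mar  N 5
3 2003 mar  U 3
4 2004 july N 4
4 2004 july U 2", header = TRUE)

result <- df %>%
  group_by(id, Year, Month) %>%
  summarise(
    total_New = sum(n[new_used_ind == "N"]),  # sum of n over the 'N' rows only
    total_n   = sum(n)                        # sum of n over all rows in the group
  ) %>%
  ungroup()

result
```

A group with no 'N' rows (id 3 here) gets total_New = 0, because sum() of an empty vector is 0. The totals agree with the values in the output above, although the columns stay integer here because no mean() is applied.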