Conditional sums for multiple groups in dplyr

Date: 2017-06-02 23:40:45

Tags: r dplyr

I have a data frame that looks like this:

data <- data.frame(a=c(1,1,0,0,0,0,1,1,1, 0), 
               b=c("x","x","x","x","x","y","y","y","z","z"),
               c=c(2, 1, 2, 3, 4, NA, 4, 2, 1, 1), 
               d= c("s", "m", "l", "l", "l", "m", "m", "s", "s", "m"))

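I would like to create a new variable e that is the sum of the values in c where a = 1, for each combination of d and b. I have tried several options that do not give me what I am looking for, for example: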

data <- data %>% 
    group_by(d, b) %>% 
    summarise (e = sum(data$c[which(data$a=="x")]))

so that the result ends up looking like:

       d      b     e
1      s      x     2
2      m      x     1
3      l      x     9
4      m      y     4
5      s      y     2
6      s      z     1
7      s      z     1

But unfortunately I just get a constant e. Any help is appreciated!

2 Answers:

Answer 0: (score: 2)

library(dplyr)

# tibble() replaces the deprecated data_frame(); dplyr re-exports it
data <- tibble(
  a = c(1, 1, 0, 0, 0, 0, 1, 1, 1, 0),
  b = c("x", "x", "x", "x", "x", "y", "y", "y", "z", "z"),
  c = c(2, 1, 2, 3, 4, NA, 4, 2, 1, 1),
  d = c("s", "m", "l", "l", "l", "m", "m", "s", "s", "m"))

data
#> # A tibble: 10 x 4
#>        a     b     c     d
#>    <dbl> <chr> <dbl> <chr>
#>  1     1     x     2     s
#>  2     1     x     1     m
#>  3     0     x     2     l
#>  4     0     x     3     l
#>  5     0     x     4     l
#>  6     0     y    NA     m
#>  7     1     y     4     m
#>  8     1     y     2     s
#>  9     1     z     1     s
#> 10     0     z     1     m

data %>% 
  group_by(d, b) %>% 
  mutate(e = if_else(a == 1, c, 0)) %>% 
  summarise(e = sum(e, na.rm = TRUE))

#> Source: local data frame [7 x 3]
#> Groups: d [?]
#> 
#> # A tibble: 7 x 3
#>       d     b     e
#>   <chr> <chr> <dbl>
#> 1     l     x     0
#> 2     m     x     1
#> 3     m     y     4
#> 4     m     z     0
#> 5     s     x     2
#> 6     s     y     2
#> 7     s     z     1

If you prefer, you can also do everything inside the summarise call:

summarise(e = if_else(a == 1, c, 0) %>% sum(na.rm = TRUE))
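
Put together as a complete pipeline, the single-call version might look like the sketch below. It assumes dplyr 1.0.0 or later, because the `.groups` argument does not exist in earlier versions; the name `result` is only for illustration.

```r
library(dplyr)

data <- tibble(
  a = c(1, 1, 0, 0, 0, 0, 1, 1, 1, 0),
  b = c("x", "x", "x", "x", "x", "y", "y", "y", "z", "z"),
  c = c(2, 1, 2, 3, 4, NA, 4, 2, 1, 1),
  d = c("s", "m", "l", "l", "l", "m", "m", "s", "s", "m"))

# Rows with a != 1 contribute 0; any remaining NA is dropped by na.rm.
# .groups = "drop" (assumes dplyr >= 1.0.0) returns an ungrouped tibble.
result <- data %>%
  group_by(d, b) %>%
  summarise(e = sum(if_else(a == 1, c, 0), na.rm = TRUE), .groups = "drop")

print(result)
```

Groups in which no row has a = 1 are kept with e = 0 rather than dropped.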

Answer 1: (score: 0)

We can use either of the following:

library(dplyr) 
data %>%
     group_by(d, b) %>% 
     summarise(e = sum(c[a==1], na.rm = TRUE))

data %>%
      group_by(d, b) %>% 
      summarise(e = sum((a==1)*c, na.rm = TRUE))
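
As for why the attempt in the question returned a constant: `data$c` and `data$a` refer to the full columns of the original data frame, not to the current group's slice, so every group gets the same total. In addition, `a == "x"` never matches because a holds 0 and 1. The sketch below contrasts the two forms; `wrong` and `right` are illustrative names, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

data <- data.frame(
  a = c(1, 1, 0, 0, 0, 0, 1, 1, 1, 0),
  b = c("x", "x", "x", "x", "x", "y", "y", "y", "z", "z"),
  c = c(2, 1, 2, 3, 4, NA, 4, 2, 1, 1),
  d = c("s", "m", "l", "l", "l", "m", "m", "s", "s", "m"))

# data$... bypasses the grouping: each group sums over all ten rows
wrong <- data %>%
  group_by(d, b) %>%
  summarise(e = sum(data$c[which(data$a == 1)]), .groups = "drop")

# Bare column names are evaluated within each (d, b) group
right <- data %>%
  group_by(d, b) %>%
  summarise(e = sum(c[a == 1], na.rm = TRUE), .groups = "drop")

print(wrong)
print(right)
```

Referring to columns by their bare names inside `summarise` is what makes the computation respect the groups.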