Producing zero length() in output using all possible factor levels

Time: 2016-06-07 21:22:43

Tags: r dplyr grouping

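I want to summarise and count the number of cases within each group, and have the output show zero for groups that have no cases. For example: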

library(dplyr)

df <- structure(list(Station = c("TR1", "TR1", "TR1", "TR1", "TR1", 
                           "TR1", "TR1", "TR1", "TR2", "TR2", "TR2", "TR2", "TR2", "TR2", 
                           "TR2"), Age = c(1, 1, 1, 2, 2, 3, 4, 4, 1, 1, 1, 1, 3, 4, 4), 
               WeightTurtles = c(21, 22, 20, 43, 32, 32, 27, 32, 21, 22, 
                                 20, 15, 32, 37, 34)), class = c("tbl_df", "tbl", "data.frame"
                                 ), row.names = c(NA, -15L), .Names = c("Station", "Age", "WeightTurtles"
                                 ))

df %>%
  group_by(Station, Age) %>%
  summarise(NumTurtles=length(WeightTurtles))

This gives:

  Station   Age NumTurtles
    (chr) (dbl)      (int)
1     TR1     1          3
2     TR1     2          2
3     TR1     3          1
4     TR1     4          2
5     TR2     1          4
6     TR2     3          1
7     TR2     4          2

What I would like is for the output above to also include a row like this:

5     TR2     2          0

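That is, how do I include the count (or absence) of occurrences for levels whose length is zero? More generally, how do I tell R to use all possible factor levels when computing length?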

2 answers:

Answer 0 (score: 2)

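You can do this with the complete function from tidyr. complete adds a row for each missing group and fills WeightTurtles with NA in that row (unless you choose a different fill value):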

library(dplyr)
library(tidyr)

df %>%
  complete(Age, nesting(Station)) %>%
  group_by(Station, Age) %>%
  summarise(NumTurtles=sum(!is.na(WeightTurtles)))

  Station   Age NumTurtles
1     TR1     1          3
2     TR1     2          2
3     TR1     3          1
4     TR1     4          2
5     TR2     1          4
6     TR2     2          0
7     TR2     3          1
8     TR2     4          2
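
As a variation on the same idea, here is a minimal sketch that counts first and then fills the missing combinations with an explicit 0, so no NA check is needed. It assumes a dplyr version where count() accepts the name argument (0.8.0 or later), and it rebuilds the question's data with data.frame() only to keep the snippet self-contained.

```r
library(dplyr)
library(tidyr)

# Same values as the question's df, rebuilt here so the snippet runs on its own
df <- data.frame(
  Station = rep(c("TR1", "TR2"), times = c(8, 7)),
  Age = c(1, 1, 1, 2, 2, 3, 4, 4, 1, 1, 1, 1, 3, 4, 4),
  WeightTurtles = c(21, 22, 20, 43, 32, 32, 27, 32, 21, 22, 20, 15, 32, 37, 34),
  stringsAsFactors = FALSE
)

counts <- df %>%
  # One row per Station/Age combination that actually occurs
  count(Station, Age, name = "NumTurtles") %>%
  # Add the combinations that never occur and give them a count of 0
  complete(Station, Age, fill = list(NumTurtles = 0L))

counts
```

Here complete(Station, Age) crosses every Station seen in the data with every Age seen in the data, so a level that never appears at any station would still be missing.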

Answer 1 (score: 0)

Here is one solution I could come up with using dplyr:

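library(dplyr)

# Build every Station/Age combination, then join the observations onto it;
# combinations with no observations get NA in WeightTurtles.
df <- left_join(expand.grid(Station = unique(df$Station),
                            Age = unique(df$Age), stringsAsFactors = FALSE),
                df, by = c("Station", "Age"))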
df %>%
  group_by(Station, Age) %>%
  summarise(NumTurtles = sum(!is.na(WeightTurtles)))

Source: local data frame [8 x 3]
Groups: Station [?]

  Station   Age NumTurtles
    <chr> <dbl>      <int>
1     TR1     1          3
2     TR1     2          2
3     TR1     3          1
4     TR1     4          2
5     TR2     1          4
6     TR2     2          0
7     TR2     3          1
8     TR2     4          2
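
For the more general part of the question (making R use all factor levels), another option that neither answer above uses is to convert the grouping columns to factors and ask group_by() to keep empty groups. This is only a sketch, and it assumes dplyr 1.0.0 or later (the .drop argument of group_by() arrived in 0.8.0, the .groups argument of summarise() in 1.0.0). The data is rebuilt with data.frame() so the snippet is self-contained.

```r
library(dplyr)

# Same values as the question's df, rebuilt here so the snippet runs on its own
df <- data.frame(
  Station = rep(c("TR1", "TR2"), times = c(8, 7)),
  Age = c(1, 1, 1, 2, 2, 3, 4, 4, 1, 1, 1, 1, 3, 4, 4),
  WeightTurtles = c(21, 22, 20, 43, 32, 32, 27, 32, 21, 22, 20, 15, 32, 37, 34),
  stringsAsFactors = FALSE
)

counts <- df %>%
  # Grouping columns must be factors for empty levels to be kept
  mutate(Station = factor(Station), Age = factor(Age)) %>%
  # .drop = FALSE keeps level combinations that have no rows
  group_by(Station, Age, .drop = FALSE) %>%
  # n() is 0 for an empty group
  summarise(NumTurtles = n(), .groups = "drop")

counts
```

Note that Age comes back as a factor rather than a number with this approach. If a level can be absent from the data entirely, pass it explicitly, for example factor(Age, levels = 1:5), where 1:5 is a hypothetical set of levels.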