Adding IDs based on groups

Asked: 2019-04-08 13:39:09

Tags: r dplyr

I'm trying to find a solution using tidyverse syntax to solve the following issue.

Given a table of video frame types and sizes, I'd like to group them into "Groups of Pictures". A GOP starts whenever an I-frame occurs. So, given this data:

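library(tidyverse)  # provides tribble(), mutate() and %>%

# One row per video frame, in stream order
d.test <- tribble(
  ~frame_type, ~frame_size,
  "I", 960,
  "B", 23,
  "P", 48,
  "B", 58,
  "I", 501,
  "B", 23,
  "P", 48,
  "B", 58
)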

I would like the first four frames to belong to group 1, the next four to group 2, etc. Groups can be of different lengths, so just using a trick with modulo four is not enough.

I first tried this:

d.test %>% 
    mutate(group = group_indices(., frame_type))

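… but this assigns one index per distinct frame type, so every "I" row gets the same index, every "B" row gets the same index, and so on, and the same indices repeat over the entire dataframe.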

I also tried a my_rleid function suggested elsewhere:

my_rleid <- function(x) {
  1L + cumsum(dplyr::coalesce(x != dplyr::lag(x), FALSE))
}

… but this starts a new group whenever the frame type changes, whichever type it is, so it returns group IDs of 1, 2, 3, 4, and so on.
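To make the two failed attempts concrete, here is a small sketch on the bare frame_type vector. The name per_value is only illustrative, and as.integer(factor(...)) is used here as a stand-in for "one index per distinct value", which is an assumption about what the group_indices() attempt amounts to rather than a reproduction of it.

```r
library(dplyr)

# Same frame types as in d.test, as a plain character vector
frame_type <- c("I", "B", "P", "B", "I", "B", "P", "B")

# Stand-in for a per-value index: every "I" gets the same number,
# every "B" gets the same number, and so on
per_value <- as.integer(factor(frame_type))

# The run-length style ID from the question: a new ID at every change of type
my_rleid <- function(x) {
  1L + cumsum(dplyr::coalesce(x != dplyr::lag(x), FALSE))
}
run_ids <- my_rleid(frame_type)

per_value  # repeats the same few indices across the whole vector
run_ids    # a new ID on every row, since no two neighbours share a type
```

Neither result marks where a GOP begins, because neither treats "I" differently from the other frame types.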

1 answer:

Answer 0 (score: 0)

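I got it working by taking the cumulative sum of a comparison of the frame type against "I". The comparison is TRUE on every I-frame, and cumsum() counts TRUE as 1, so the running total increases by one at the start of each GOP: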

d.test %>% 
  mutate(group = cumsum(frame_type == "I"))

This returns:

  frame_type frame_size group
  <chr>           <dbl> <int>
1 I                 960     1
2 B                  23     1
3 P                  48     1
4 B                  58     1
5 I                 501     2
6 B                  23     2
7 P                  48     2
8 B                  58     2
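Since the question notes that groups can have different lengths, here is a minimal sketch on made-up data with GOPs of 3, 2, and 4 frames, followed by a per-GOP summary. The d.uneven table and the names d.grouped, gop.summary, n_frames, and total_size are hypothetical and only serve as illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical data: three GOPs of unequal length (3, 2, and 4 frames)
d.uneven <- tribble(
  ~frame_type, ~frame_size,
  "I", 960,
  "B", 23,
  "P", 48,
  "I", 501,
  "P", 40,
  "I", 700,
  "B", 20,
  "B", 25,
  "P", 50
)

# The counter increases by one at each I-frame, whatever the GOP length
d.grouped <- d.uneven %>%
  mutate(group = cumsum(frame_type == "I"))

# Once the ID exists, it can be used like any other grouping column
gop.summary <- d.grouped %>%
  group_by(group) %>%
  summarise(n_frames = n(), total_size = sum(frame_size))
```

One caveat: if the data does not start with an I-frame, the leading frames get group 0, because no "I" has been counted yet at that point.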