Question

I have a data frame mydata that looks like this:

  col1 col2
1    1    1
2    1    2
3    1    3
4    2    1
5    2    2
6    2    3

I want to lag col2 within the groups defined by col1, so my expected result would be:

  col1 col2
1    1   NA
2    1    1
3    1    2
4    2   NA
5    2    1
6    2    2

Following the approach in [this answer] [1], I tried:

with_lagged_col2 = 
  mydata %>% group_by(col1) %>% arrange(col1) %>% 
  mutate(laggy = dplyr::lag(col2, n = 1, default = NA))

What I actually get is:

# A tibble: 6 x 3
# Groups:   col1 [2]
   col1  col2 laggy
  <dbl> <dbl> <dbl>
1     1     1    NA
2     1     2     1
3     1     3     2
4     2     1     3
5     2     2     1
6     2     3     2

Why is group_by being ignored here?

Answer 1

You don't need the arrange step:

with_lagged_col2 = 
  mydata %>% group_by(col1) %>% # group the data by col1
  # add a lagged copy of col2; the first row of each group has no previous value, so it is NA
  mutate(laggy = dplyr::lag(col2, n = 1, default = NA))
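
To make this easy to check, here is a minimal self-contained sketch that rebuilds mydata from the question and runs the grouped lag. The namespace-qualified calls (dplyr::group_by, dplyr::mutate) and the final dplyr::ungroup() are additions of mine, not part of the original answer. They rest on an assumption the question does not confirm: one common reason for grouping to appear ignored is that another package loaded after dplyr (plyr, for example) masks mutate with a version that does not respect groups. Qualifying the calls sidesteps that possibility.

```r
library(dplyr)

# Rebuild the example data from the question.
mydata <- data.frame(
  col1 = c(1, 1, 1, 2, 2, 2),
  col2 = c(1, 2, 3, 1, 2, 3)
)

with_lagged_col2 <- mydata %>%
  dplyr::group_by(col1) %>%
  # Qualified with dplyr:: so a later-loaded package cannot mask mutate
  # (assumption: masking may be what caused the ungrouped result).
  dplyr::mutate(laggy = dplyr::lag(col2, n = 1, default = NA)) %>%
  # Drop the grouping so later steps operate on the whole table.
  dplyr::ungroup()

print(with_lagged_col2$laggy)
# → NA  1  2 NA  1  2
```

If the unqualified version still gives the ungrouped result in your session, running conflicts() will list masked function names and may show whether mutate is among them.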
