Grouped lagged difference with dplyr

Time: 2019-06-06 09:14:30

Tags: dplyr lag difference

I have the following dataset:

   Amount1 Amount2       Date Group
1       NA     350 2019-01-01     A
2       NA     335 2019-01-01     B
3       NA     340 2019-01-01     C
4      300     365 2019-01-06     A
5      310     325 2019-01-06     B
6      285     355 2019-01-06     C
7      310     335 2019-01-11     A
8      305     355 2019-01-11     B
9      335     360 2019-01-11     C
10     280      NA 2019-01-16     A
11     290      NA 2019-01-16     B
12     240      NA 2019-01-16     C

You can recreate it with:

> dput(test) 

structure(list(Amount1 = c(NA, NA, NA, 300, 310, 285, 310, 305, 335, 280, 290, 240), 
Amount2 = c(350, 335, 340, 365, 325, 355,  335, 355, 360, NA, NA, NA), 
Date = structure(c(1L, 1L, 1L, 2L,  2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("2019-01-01", "2019-01-06",  "2019-01-11", "2019-01-16"), class = "factor"), 
Group = structure(c(1L,  2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A",  "B", "C"), class = "factor")), 
row.names = c(NA, -12L), class = "data.frame")

Within each group, I want to subtract the current row's Amount1 from the previous row's Amount2.

For example, for group A I would expect:

2019-01-01 -> NA
2019-01-06 -> 350 - 300 = 50
2019-01-11 -> 365 - 310 = 55
2019-01-16 -> 335 - 280 = 55

How can I do this? I tried using mutate_at, but without success...

# Does not work...
test %>%
  group_by(Group, Amount2) %>%
  mutate_at(c("Amount1"), funs(AmountDiff = . - lag(Amount2, 1)))

1 answer:

Answer 0 (score: 1)

How about this?

test %>% 
  group_by(Group) %>% 
  mutate(Amount_diff = lag(Amount2) - Amount1)

which gives:

# A tibble: 12 x 5
# Groups:   Group [3]
   Amount1 Amount2 Date       Group Amount_diff
     <dbl>   <dbl> <fct>      <fct>       <dbl>
 1      NA     350 2019-01-01 A              NA
 2      NA     335 2019-01-01 B              NA
 3      NA     340 2019-01-01 C              NA
 4     300     365 2019-01-06 A              50
 5     310     325 2019-01-06 B              25
 6     285     355 2019-01-06 C              55
 7     310     335 2019-01-11 A              55
 8     305     355 2019-01-11 B              20
 9     335     360 2019-01-11 C              20
10     280      NA 2019-01-16 A              55
11     290      NA 2019-01-16 B              65
12     240      NA 2019-01-16 C             120
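
A likely reason the attempt in the question returns only NA: group_by(Group, Amount2) puts every row into its own group, because each Group/Amount2 combination occurs only once in this data, so lag(Amount2) never has a previous row to look at. (The attempt also computes Amount1 - lag(Amount2), which has the opposite sign from the expected values.) A minimal sketch that shows the difference between the two groupings; it rebuilds only the columns needed, and the names test_small, by_both and by_group are made up for illustration:

```r
library(dplyr)

# Same Amount1, Amount2 and Group values as the question's data (Date omitted)
test_small <- data.frame(
  Amount1 = c(NA, NA, NA, 300, 310, 285, 310, 305, 335, 280, 290, 240),
  Amount2 = c(350, 335, 340, 365, 325, 355, 335, 355, 360, NA, NA, NA),
  Group   = rep(c("A", "B", "C"), times = 4)
)

# Grouping by Group AND Amount2: every group has exactly one row,
# so lag() has nothing to return and the result is NA in every row
by_both <- test_small %>%
  group_by(Group, Amount2) %>%
  mutate(Amount_diff = lag(Amount2) - Amount1) %>%
  ungroup()

# Grouping by Group only: lag() returns the previous row of the same group
by_group <- test_small %>%
  group_by(Group) %>%
  mutate(Amount_diff = lag(Amount2) - Amount1) %>%
  ungroup()

all(is.na(by_both$Amount_diff))  # TRUE
by_group$Amount_diff             # matches the Amount_diff column above
```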

For group A:

test %>% 
  group_by(Group) %>% 
  mutate(Amount_diff = lag(Amount2) - Amount1) %>% 
  filter(Group == "A")

gives:

# A tibble: 4 x 5
# Groups:   Group [1]
  Amount1 Amount2 Date       Group Amount_diff
    <dbl>   <dbl> <fct>      <fct>       <dbl>
1      NA     350 2019-01-01 A              NA
2     300     365 2019-01-06 A              50
3     310     335 2019-01-11 A              55
4     280      NA 2019-01-16 A              55
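
One caveat: lag() works on the row order within each group, not on the Date column, so the code above relies on the data already being sorted by date, which it is here. If that is not guaranteed, it may be safer to sort first. Below is a sketch under that assumption; the shuffled data frame is made up for illustration (same values as the question, rows reversed), and Date is converted from a factor to a real Date before sorting:

```r
library(dplyr)

# Same values as the question's data, but with the rows deliberately reversed
shuffled <- data.frame(
  Amount1 = c(NA, NA, NA, 300, 310, 285, 310, 305, 335, 280, 290, 240),
  Amount2 = c(350, 335, 340, 365, 325, 355, 335, 355, 360, NA, NA, NA),
  Date    = factor(rep(c("2019-01-01", "2019-01-06", "2019-01-11", "2019-01-16"), each = 3)),
  Group   = factor(rep(c("A", "B", "C"), times = 4))
)[12:1, ]

result <- shuffled %>%
  mutate(Date = as.Date(as.character(Date))) %>%  # factor -> Date
  arrange(Group, Date) %>%                        # chronological order within each group
  group_by(Group) %>%
  mutate(Amount_diff = lag(Amount2) - Amount1) %>%
  ungroup()

# Group A again gives NA, 50, 55, 55
filter(result, Group == "A")$Amount_diff
```

Alternatively, lag() has an order_by argument that can be used instead of sorting the whole data frame.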