I have the following dataset:
Amount1 Amount2 Date Group
1 NA 350 2019-01-01 A
2 NA 335 2019-01-01 B
3 NA 340 2019-01-01 C
4 300 365 2019-01-06 A
5 310 325 2019-01-06 B
6 285 355 2019-01-06 C
7 310 335 2019-01-11 A
8 305 355 2019-01-11 B
9 335 360 2019-01-11 C
10 280 NA 2019-01-16 A
11 290 NA 2019-01-16 B
12 240 NA 2019-01-16 C
You can recreate it with:
> dput(test)
structure(list(Amount1 = c(NA, NA, NA, 300, 310, 285, 310, 305, 335, 280, 290, 240),
Amount2 = c(350, 335, 340, 365, 325, 355, 335, 355, 360, NA, NA, NA),
Date = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("2019-01-01", "2019-01-06", "2019-01-11", "2019-01-16"), class = "factor"),
Group = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A", "B", "C"), class = "factor")),
row.names = c(NA, -12L), class = "data.frame")
Within each group, I want to subtract Amount1 from the previous row's Amount2.
For example, for group A I would expect:
2019-01-01 -> NA
2019-01-06 -> 350 - 300 = 50
2019-01-11 -> 365 - 310 = 55
2019-01-16 -> 335 - 280 = 55
How would I go about this? I tried using mutate_at, but without success...
# Does not work...
test %>%
group_by(Group, Amount2) %>%
mutate_at(c("Amount1"), funs(AmountDiff = . - lag(Amount2, 1)))
Answer 0 (score: 1)
How about this?
test %>%
group_by(Group) %>%
mutate(Amount_diff = lag(Amount2) - Amount1)
which gives
# A tibble: 12 x 5
# Groups: Group [3]
Amount1 Amount2 Date Group Amount_diff
<dbl> <dbl> <fct> <fct> <dbl>
1 NA 350 2019-01-01 A NA
2 NA 335 2019-01-01 B NA
3 NA 340 2019-01-01 C NA
4 300 365 2019-01-06 A 50
5 310 325 2019-01-06 B 25
6 285 355 2019-01-06 C 55
7 310 335 2019-01-11 A 55
8 305 355 2019-01-11 B 20
9 335 360 2019-01-11 C 20
10 280 NA 2019-01-16 A 55
11 290 NA 2019-01-16 B 65
12 240 NA 2019-01-16 C 120
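As a side note on why the attempt in the question returns only NA: grouping by both Group and Amount2 seems to leave every group with a single row, so lag() has no previous row to look at. The sketch below illustrates this on a small stand-in data frame (`demo` is a hypothetical name, holding only the group A values from the question) rather than on the full `test` object.

```r
library(dplyr)

# Hypothetical stand-in for `test`: group A only, values copied from the question
demo <- data.frame(
  Amount1 = c(NA, 300, 310, 280),
  Amount2 = c(350, 365, 335, NA),
  Group   = rep("A", 4)
)

# Grouping by Group AND Amount2 leaves exactly one row per group ...
sizes <- demo %>% count(Group, Amount2)

# ... so lag() has nothing to shift within a group and every difference is NA
attempt <- demo %>%
  group_by(Group, Amount2) %>%
  mutate(AmountDiff = Amount1 - lag(Amount2)) %>%
  ungroup()
```

Note also that the original expression computes Amount1 - lag(Amount2), which is the opposite sign of the expected 350 - 300 = 50.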
For group A:
test %>%
group_by(Group) %>%
mutate(Amount_diff = lag(Amount2) - Amount1) %>%
filter(Group == "A")
gives:
# A tibble: 4 x 5
# Groups: Group [1]
Amount1 Amount2 Date Group Amount_diff
<dbl> <dbl> <fct> <fct> <dbl>
1 NA 350 2019-01-01 A NA
2 300 365 2019-01-06 A 50
3 310 335 2019-01-11 A 55
4 280 NA 2019-01-16 A 55
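One caveat: lag() works on row order, not on the Date column, so the result above relies on the rows already being sorted by date within each group. If that is not guaranteed, a possible variant is to convert Date (a factor here) to a real date and sort first. This is only a sketch; `shuffled` is a hypothetical data frame with the group A values from the question in scrambled order.

```r
library(dplyr)

# Hypothetical input: group A from the question, rows deliberately out of order
shuffled <- data.frame(
  Amount1 = c(310, NA, 280, 300),
  Amount2 = c(335, 350, NA, 365),
  Date    = factor(c("2019-01-11", "2019-01-01", "2019-01-16", "2019-01-06")),
  Group   = factor(rep("A", 4))
)

result <- shuffled %>%
  mutate(Date = as.Date(as.character(Date))) %>%  # factor -> Date
  arrange(Group, Date) %>%                        # "previous row" now means "previous date"
  group_by(Group) %>%
  mutate(Amount_diff = lag(Amount2) - Amount1) %>%
  ungroup()
```

After sorting, Amount_diff for group A should match the values shown above (NA, 50, 55, 55).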