My data is set up as in the following example:
Name df Value
A 1 .5
A 2 2
A 3 3
B 1 1
B 2 .5
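I want to get the difference between values until the Name column changes; at that point it should stop and start computing differences afresh for the new name. Like this: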
Name df Value Diff
A 1 .5 NA
A 2 2 1.5
A 3 3 2.5
B 1 1 NA
B 2 .5 -.5
Is there a way to do this? I tried reshaping the data set to wide format, but I couldn't find a way to make that work either.
Answer 0: (score: 3)
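One option is to group by Name and take the cumulative sum of diff(Value) within each group, padding each group's first row with NA: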
library(dplyr)
df1 %>%
group_by(Name) %>%
mutate(Diff = c(NA, cumsum(diff(Value))))
# A tibble: 5 x 4
# Groups: Name [2]
# Name df Value Diff
# <chr> <int> <dbl> <dbl>
#1 A 1 0.5 NA
#2 A 2 2 1.5
#3 A 3 3 2.5
#4 B 1 1 NA
#5 B 2 0.5 -0.5
# Example data (df1) used above
df1 <- structure(list(Name = c("A", "A", "A", "B", "B"),
                      df = c(1L, 2L, 3L, 1L, 2L),
                      Value = c(0.5, 2, 3, 1, 0.5)),
                 class = "data.frame", row.names = c(NA, -5L))
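As a side note on why this matches the expected output: the cumulative sum of consecutive differences telescopes, so each entry ends up being the value minus the first value in its group. A minimal base R sketch, using the group "A" values from the example (the variable names here are illustrative only):

```r
# Values for group "A" from the example data
value <- c(0.5, 2, 3)

# diff() returns the consecutive differences (one element shorter than value)
step <- diff(value)

# cumsum() of those differences telescopes to value[i] - value[1]
from_first <- cumsum(step)

# The same quantity computed directly
direct <- value[-1] - value[1]

print(from_first)
print(direct)
```

If plain row-to-row differences were wanted instead, dropping the cumsum() and keeping c(NA, diff(Value)) would give those.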
Answer 1: (score: 2)
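@akrun's answer is the way to go, but just as a puzzle, this also works (note that it gives 0 rather than NA in the first row of each group):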
df1 %>%
group_by(Name) %>%
mutate(Diff = cumsum(Value - lag(Value, default = Value[1])))
# # A tibble: 5 x 4
# # Groups: Name [2]
# Name df Value Diff
# <chr> <int> <dbl> <dbl>
# 1 A 1 0.5 0
# 2 A 2 2 1.5
# 3 A 3 3 2.5
# 4 B 1 1 0
# 5 B 2 0.5 -0.5
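For comparison, the same per-group result (with NA in each group's first row) can probably be had without dplyr. The following is a rough base R sketch using ave(); it assumes the rows are already ordered within each Name, as in the example data, and the anonymous helper function is illustrative only:

```r
# Example data, rebuilt here so the snippet is self-contained
df1 <- data.frame(Name  = c("A", "A", "A", "B", "B"),
                  df    = c(1L, 2L, 3L, 1L, 2L),
                  Value = c(0.5, 2, 3, 1, 0.5))

# ave() applies FUN to Value within each Name group and returns a
# vector aligned with the original rows.
# Within a group: NA for the first row, then each value minus the first.
df1$Diff <- ave(df1$Value, df1$Name,
                FUN = function(v) c(NA, v[-1] - v[1]))

print(df1)
```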