Difference between values within groups

Time: 2019-12-12 15:14:39

Tags: r dplyr

Hi, I need to achieve the following:

grp value   diff
1   10       NA  # diff[1] = value[2]-value[0] of grp = 1
1   15       10  # diff[2] = value[3]-value[1] of grp = 1
1   20       -5  # diff[3] = value[4]-value[2] of grp = 1
1   10       NA  # diff[4] = value[5]-value[3] of grp = 1
2   25       NA  # diff[5] = value[6]-value[4] of grp = 2
2   30       10  # diff[6] = value[7]-value[5] of grp = 2
2   35       NA  # diff[7] = value[8]-value[6] of grp = 2

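I tried functions such as shift and lag, but could not get this kind of result. Within each group I want to subtract the previous value from the next value, i.e. diff[i] = value[i+1] - value[i-1].

Using a for loop gives errors. Is there a better way?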

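2 answers:

Answer 0 (score: 3)

I think there is a typo in your description of the diff values. However, if you want diff[i] to be value[i+1] - value[i-1], you can use lead and lag from dplyr together: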

library(dplyr)
df %>% group_by(grp) %>% mutate(diff = lead(value) - lag(value))

# A tibble: 7 x 3
# Groups:   grp [2]
    grp value  diff
  <dbl> <dbl> <dbl>
1     1    10    NA
2     1    15    10
3     1    20    -5
4     1    10    NA
5     2    25    NA
6     2    30    10
7     2    35    NA
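To see why the first and last row of each group come out as NA, it may help to look at lead and lag on a single group's values. This is only a minimal sketch using the group 1 values from the question; the variable names `value` and `d` are chosen here for illustration.

```r
library(dplyr)

# Values of group 1 from the question
value <- c(10, 15, 20, 10)

lead(value)  # next element:     15 20 10 NA (the last position has no successor)
lag(value)   # previous element: NA 10 15 20 (the first position has no predecessor)

# Element-wise subtraction gives value[i+1] - value[i-1];
# NA propagates at both ends of the vector
d <- lead(value) - lag(value)
d            # NA 10 -5 NA
```

Because group_by(grp) makes mutate() evaluate the expression separately per group, the shifting never crosses a group boundary.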

Edit: absolute difference

If you need the absolute difference, you can do the following:

df %>% group_by(grp) %>% mutate(diff = abs(lead(value) - lag(value)))

# A tibble: 7 x 3
# Groups:   grp [2]
    grp value  diff
  <dbl> <dbl> <dbl>
1     1    10    NA
2     1    15    10
3     1    20     5
4     1    10    NA
5     2    25    NA
6     2    30    10
7     2    35    NA

Does this look like what you are looking for?

Data

df = data.frame(grp = c(rep(1,4),rep(2,3)),
                value = c(10,15,20,10,25,30,35))
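Since the question mentions trouble with a for loop, here is a hedged base R sketch of the same grouped computation without dplyr. It is not part of this answer: it swaps in ave() and a helper, `centered_diff`, which is a hypothetical name introduced only for this example.

```r
# Same data as above
df <- data.frame(grp = c(rep(1, 4), rep(2, 3)),
                 value = c(10, 15, 20, 10, 25, 30, 35))

# For one group's vector x, return x[i+1] - x[i-1], with NA at both ends.
# Groups with fewer than three rows have no interior element, so all NA.
centered_diff <- function(x) {
  n <- length(x)
  if (n < 3) return(rep(NA_real_, n))
  c(NA, x[3:n] - x[1:(n - 2)], NA)
}

# ave() applies the function to the values of each group and
# returns the results in the original row order
df$diff <- ave(df$value, df$grp, FUN = centered_diff)
df
```

The vectorised subtraction inside the helper replaces the explicit loop over rows, and ave() takes care of keeping the groups separate.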

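Answer 1 (score: 2)

After grouping by 'grp', we can take the difference between the lead of 'value' two rows ahead and 'value' itself, and then take the lag of that result (wrapped in abs here to get the absolute difference):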

library(dplyr)
df1 %>%
  group_by(grp) %>%
  # lead(value, 2) - value is value[i+2] - value[i]; lag() shifts it down one row
  mutate(diff = lag(abs(lead(value, 2) - value)))
# A tibble: 7 x 3
# Groups:   grp [2]
#    grp value  diff
#  <int> <int> <int>
#1     1    10    NA
#2     1    15    10
#3     1    20     5
#4     1    10    NA
#5     2    25    NA
#6     2    30    10
#7     2    35    NA
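The two formulations in this thread, lead(value) - lag(value) and lag(lead(value, 2) - value), should describe the same quantity before abs is applied. A minimal sketch for checking that on the question's data follows; the column names `a` and `b` and the object `res` are made up for this comparison.

```r
library(dplyr)

# Same data as in the question (integer columns)
df1 <- data.frame(grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
                  value = c(10L, 15L, 20L, 10L, 25L, 30L, 35L))

res <- df1 %>%
  group_by(grp) %>%
  mutate(
    a = lead(value) - lag(value),     # value[i+1] - value[i-1]
    b = lag(lead(value, 2) - value)   # value[i+2] - value[i], shifted down one row
  ) %>%
  ungroup()

# identical() also compares the NA positions at the group edges
identical(res$a, res$b)
```

Which form to use is mostly a matter of readability; the first states the intended formula more directly.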

Data

df1 <- structure(list(grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L), value = c(10L, 
15L, 20L, 10L, 25L, 30L, 35L)), row.names = c(NA, -7L), class = "data.frame")