Add a column representing a lagged difference

Time: 2018-02-12 10:58:21

Tags: r excel

The data set looks like this:

group   week  sum
cat     1     10
cat     2     15
cat     3     20
cat     4     30
cat     5     35
dog     1     5
dog     2     10
monkey  1     6
monkey  2     14

The requirement is to add a new column, sum2, defined as follows:

sum2 is sum(groupwise) - value(n-2), where n = week. For example, for cat: sum = 10 + 15 + 20 + 30 = 75, so the sum2 values are:

group  week  sum  sum2
cat    1     10
cat    2     15
cat    3     20   75 - week 1 value = 75 - 10 = 65
cat    4     30   75 - week 1 value - week 2 value = 75 - 10 - 15 = 50
cat    5     35   75 - 10 - 15 - 20 = 30
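
One way to read the example above as a rule, sketched in base R for the cat group only. This is an interpretation rather than something the question states: it assumes the base total of 75 leaves out the last week, and that each row subtracts the running total of the values from two weeks earlier. The names `cat_sum`, `base_total`, `two_back` and `sum2` are illustrative.

```r
# Weekly values for the "cat" group, weeks 1 to 5 (from the data set above)
cat_sum <- c(10, 15, 20, 30, 35)
n <- length(cat_sum)

# Base total as used in the example: every week except the last
base_total <- sum(cat_sum[-n])

# Values from two weeks earlier, padded with 0 for weeks 1 and 2
two_back <- c(0, 0, cat_sum[seq_len(n - 2)])

# Subtract the running total of those values from the base total
sum2 <- base_total - cumsum(two_back)

# Weeks 3 to 5 are the rows the example fills in
print(sum2[3:5])  # 65 50 30
```

Weeks 1 and 2 are left out of the printed result because the example table leaves those cells empty.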

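1 answer:

Answer 0: (score: 0)

The conditions are not very clear. One reading: after grouping by 'group', take the sum of the lag of the 'sum' column (the group total without its last row) and subtract from it the cumulative sum (cumsum) of the lag of 'sum' with n = 2.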

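library(dplyr)
# Sample data from the question, rebuilt here so the snippet runs on its own
# (df1 is the name the answer uses)
df1 <- data.frame(
    group = rep(c("cat", "dog", "monkey"), times = c(5, 2, 2)),
    week  = c(1:5, 1:2, 1:2),
    sum   = c(10L, 15L, 20L, 30L, 35L, 5L, 10L, 6L, 14L),
    stringsAsFactors = FALSE
)
# Per group: total of all rows but the last, minus the running total
# of the values two rows back (0 where no such row exists)
df1 %>% 
    group_by(group) %>%
    mutate(sum2 = sum(lag(sum), na.rm = TRUE) - cumsum(lag(sum, n = 2, default = 0)))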
# A tibble: 9 x 4
# Groups: group [3]
#  group   week   sum  sum2
#  <chr>  <int> <int> <int>
#1 cat        1    10    75
#2 cat        2    15    75
#3 cat        3    20    65
#4 cat        4    30    50
#5 cat        5    35    30
#6 dog        1     5     5
#7 dog        2    10     5
#8 monkey     1     6     6
#9 monkey     2    14     6
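
The output above fills weeks 1 and 2 with the base total, while the table in the question leaves those cells empty. If empty cells are what is wanted, the following is a sketch of one way to get them without dplyr, using base R's `ave()`. The helper names `lag_n` and `sum2_one_group` are made up for this example, and it assumes the rows of each group are already sorted by week, so that the first two rows are weeks 1 and 2.

```r
# Sample data from the question
df1 <- data.frame(
  group = rep(c("cat", "dog", "monkey"), times = c(5, 2, 2)),
  week  = c(1:5, 1:2, 1:2),
  sum   = c(10, 15, 20, 30, 35, 5, 10, 6, 14),
  stringsAsFactors = FALSE
)

# Hypothetical helper: shift x down by n positions, padding the top with `fill`
lag_n <- function(x, n, fill) {
  c(rep(fill, min(n, length(x))), head(x, -n))
}

# Hypothetical helper: sum2 for the values of a single group
sum2_one_group <- function(x) {
  base_total <- sum(lag_n(x, 1, 0))           # group total without the last row
  out <- base_total - cumsum(lag_n(x, 2, 0))  # subtract values two rows back
  out[seq_len(min(2, length(x)))] <- NA       # blank the first two rows (assumption)
  out
}

# ave() applies the helper within each group and keeps the original row order
df1$sum2 <- ave(df1$sum, df1$group, FUN = sum2_one_group)
print(df1)
```

With this variant the dog and monkey groups, which only have two weeks each, end up with NA in every row of sum2.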

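Note: sum(lag(sum), na.rm = TRUE) leaves the last row of each group out of the total. If the total over the whole group is intended, use sum(sum, na.rm = TRUE)

instead of sum(lag(sum), na.rm = TRUE)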
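
To see what the two choices of base total give on the sample data, here is a small base R sketch for the cat group. The variable names are illustrative and not part of the answer.

```r
cat_sum <- c(10, 15, 20, 30, 35)

# Total without the last week, which is what sum(lag(sum), na.rm = TRUE) yields
total_without_last <- sum(head(cat_sum, -1))

# Total over the whole group, which is what sum(sum, na.rm = TRUE) yields
total_full <- sum(cat_sum)

# Running total of the values two weeks back (0 for weeks 1 and 2)
two_back_cumsum <- cumsum(c(0, 0, head(cat_sum, -2)))

print(total_without_last - two_back_cumsum)  # 75 75 65 50 30
print(total_full - two_back_cumsum)          # 110 110 100 85 65
```

On this data only the first base reproduces the 65, 50, 30 shown in the question.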