Calculating a column value from the previous row's value in a different column

Time: 2019-07-05 02:54:07

Tags: r data.table

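I am trying to update the values of a given column based on values stored in the previous row, but taken from a different column.

I can do this with a for loop, which works for small datasets, but on a large DT (for example, more than 1 million rows) it naturally takes a long time. Here is a small example: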

library(data.table)

DT <- data.table(Year = 2019:2038, Area = 500, Cos = c(0,0,0,150,0,0,
  0,0,350,0,0,0,0,0,0,0,120,200,80,100), Rep = c(0,0,0,0,150,0,0,0,0,
  350,0,0,0,0,0,0,0,0,0,0), Calc = c(500,500,500,500,500,500,500,500,
  500,500,500,500,500,500,500,500,500,380,180,100))

Basically, I want to reproduce the column "Calc", which is calculated as follows:

1) If row == 1

Calc[1] == Area[1]

2) For rows > 1

Calc[i] == Rep[i] + Calc[i-1] - Cos[i-1]
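
For reference, the two rules above can be written as a plain loop, which is presumably close to the slow baseline the question describes. This is only a sketch: `calc_loop` and the column name `loopCalc` are hypothetical names chosen for illustration, and the data is the sample from the question without the target column.

```r
library(data.table)

# Sample data from the question, without the target column Calc
DT <- data.table(
  Year = 2019:2038, Area = 500,
  Cos = c(0,0,0,150,0,0,0,0,350,0,0,0,0,0,0,0,120,200,80,100),
  Rep = c(0,0,0,0,150,0,0,0,0,350,0,0,0,0,0,0,0,0,0,0)
)

# calc_loop is a hypothetical helper, not part of data.table
calc_loop <- function(area_v, cos_v, rep_v) {
  n <- length(area_v)
  out <- numeric(n)
  out[1L] <- area_v[1L]                      # rule 1: Calc[1] == Area[1]
  for (i in seq_len(n)[-1L]) {               # rule 2: rows 2..n
    out[i] <- rep_v[i] + out[i - 1L] - cos_v[i - 1L]
  }
  out
}

DT[, loopCalc := calc_loop(Area, Cos, Rep)]
```

A loop like this is useful mainly as a correctness check for faster vectorized versions.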

Any feedback would be appreciated.

Thanks a lot

2 Answers:

Answer 0: (score: 1)

In this case, you can use:

DT[, newCalc := Area[1L] + cumsum(Rep - shift(Cos, fill = 0))]

Output:

    Year Area Cos Rep Calc newCalc
 1: 2019  500   0   0  500     500
 2: 2020  500   0   0  500     500
 3: 2021  500   0   0  500     500
 4: 2022  500 150   0  500     500
 5: 2023  500   0 150  500     500
 6: 2024  500   0   0  500     500
 7: 2025  500   0   0  500     500
 8: 2026  500   0   0  500     500
 9: 2027  500 350   0  500     500
10: 2028  500   0 350  500     500
11: 2029  500   0   0  500     500
12: 2030  500   0   0  500     500
13: 2031  500   0   0  500     500
14: 2032  500   0   0  500     500
15: 2033  500   0   0  500     500
16: 2034  500   0   0  500     500
17: 2035  500 120   0  500     500
18: 2036  500 200   0  380     380
19: 2037  500  80   0  180     180
20: 2038  500 100   0  100     100
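
Why a cumulative sum is enough: unrolling the recurrence gives Calc[i] = Area[1] + (Rep[2] - Cos[1]) + ... + (Rep[i] - Cos[i-1]), so each row is the starting area plus a running total of lagged differences. A cumulative sum taken over every row also adds Rep[1] to the first row, which is harmless in the sample only because Rep[1] is 0. The sketch below is a variant (not taken from either answer) that starts the running total at row 2, so rule 1 holds even if Rep[1] is nonzero.

```r
library(data.table)

# Sample data from the question, without the target column Calc
DT <- data.table(
  Year = 2019:2038, Area = 500,
  Cos = c(0,0,0,150,0,0,0,0,350,0,0,0,0,0,0,0,120,200,80,100),
  Rep = c(0,0,0,0,150,0,0,0,0,350,0,0,0,0,0,0,0,0,0,0)
)

# Rep[-1L] is Rep[2..N]; Cos[-.N] is Cos[1..N-1].
# Their difference is Rep[k] - Cos[k-1] for k = 2..N;
# the leading 0 keeps the first row at exactly Area[1].
DT[, newCalc := Area[1L] + cumsum(c(0, Rep[-1L] - Cos[-.N]))]
```

On the sample data this gives the same values as the expected Calc column.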

Answer 1: (score: 1)

We can use Reduce with accumulate = TRUE:

DT[, newCalc := Reduce(`+`, Rep - shift(Cos, fill = 0), 
         init = Area[1], accumulate = TRUE)[-1]]
DT
#    Year Area Cos Rep Calc newCalc
# 1: 2019  500   0   0  500     500
# 2: 2020  500   0   0  500     500
# 3: 2021  500   0   0  500     500
# 4: 2022  500 150   0  500     500
# 5: 2023  500   0 150  500     500
# 6: 2024  500   0   0  500     500
# 7: 2025  500   0   0  500     500
# 8: 2026  500   0   0  500     500
# 9: 2027  500 350   0  500     500
#10: 2028  500   0 350  500     500
#11: 2029  500   0   0  500     500
#12: 2030  500   0   0  500     500
#13: 2031  500   0   0  500     500
#14: 2032  500   0   0  500     500
#15: 2033  500   0   0  500     500
#16: 2034  500   0   0  500     500
#17: 2035  500 120   0  500     500
#18: 2036  500 200   0  380     380
#19: 2037  500  80   0  180     180
#20: 2038  500 100   0  100     100
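
To see why the trailing `[-1]` is needed, it may help to look at what base R's Reduce returns on a tiny vector. With `accumulate = TRUE` and an `init` value, the result starts with the seed itself and is therefore one element longer than the input. The values below are made up purely for illustration.

```r
x <- c(1, 2, 3)  # illustrative increments, not from the question

# The seed (100) is the first element, followed by the running totals
steps <- Reduce(`+`, x, accumulate = TRUE, init = 100)
steps        # 100 101 103 106

# Dropping the seed leaves one value per input element
steps[-1L]   # 101 103 106
```

For plain addition this is equivalent to `100 + cumsum(x)`; Reduce becomes more useful when the update step is not a simple sum.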

The same can be done with accumulate from the tidyverse (purrr):

library(tidyverse)
DT %>% 
   mutate(newCalc = accumulate(Rep - lag(Cos, default = 0),
          .init = first(Area), `+`)[-1])