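I am trying to update the values of a given column based on values stored in the previous row, but taken from different columns.
I can do this with a for loop, which works for small datasets, but on a large DT (e.g. more than 1 million rows) it naturally takes a long time. Here is a small example: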
library(data.table)
DT <- data.table(
  Year = 2019:2038,
  Area = 500,
  Cos  = c(0,0,0,150,0,0,0,0,350,0,0,0,0,0,0,0,120,200,80,100),
  Rep  = c(0,0,0,0,150,0,0,0,0,350,0,0,0,0,0,0,0,0,0,0),
  Calc = c(500,500,500,500,500,500,500,500,500,500,500,500,500,500,500,500,500,380,180,100)
)
Basically, I want to reproduce the column "Calc", which is computed as follows:
1) For row 1:
Calc[1] = Area[1]
2) For rows i > 1:
Calc[i] = Rep[i] + Calc[i-1] - Cos[i-1]
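For reference, the two rules above can be written as a plain loop. This is only a minimal base R sketch, not the asker's original loop (which is not shown). The vectors mirror the columns of the example DT, and loop_calc is a name made up for this illustration.

```r
# Columns copied from the example DT, as plain vectors
Area <- rep(500, 20)
Cos  <- c(0, 0, 0, 150, 0, 0, 0, 0, 350, 0, 0, 0, 0, 0, 0, 0, 120, 200, 80, 100)
Rep  <- c(0, 0, 0, 0, 150, 0, 0, 0, 0, 350, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

# loop_calc is a hypothetical helper name, used only for illustration
loop_calc <- function(Area, Cos, Rep) {
  n <- length(Area)
  out <- numeric(n)
  out[1] <- Area[1]                 # rule 1: the first row starts from Area
  for (i in seq_len(n)[-1]) {       # rule 2: rows 2..n use the previous row
    out[i] <- Rep[i] + out[i - 1] - Cos[i - 1]
  }
  out
}

calc_loop <- loop_calc(Area, Cos, Rep)
```

On the example data this reproduces the Calc column, but every iteration depends on the previous one, which is why it becomes slow on millions of rows.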
Any feedback would be appreciated.
Thank you very much.
Answer 0 (score: 1)
In this case, you can use:
# Start from Area[1] (rule 1) rather than from the Calc column being reproduced
DT[, newCalc := Area[1L] + cumsum(Rep - shift(Cos, fill=0L))]
Output (the d column is not created by any code shown here):
Year Area Cos Rep Calc d newCalc
1: 2019 500 0 0 500 0 500
2: 2020 500 0 0 500 0 500
3: 2021 500 0 0 500 0 500
4: 2022 500 150 0 500 0 500
5: 2023 500 0 150 500 0 500
6: 2024 500 0 0 500 0 500
7: 2025 500 0 0 500 0 500
8: 2026 500 0 0 500 0 500
9: 2027 500 350 0 500 0 500
10: 2028 500 0 350 500 0 500
11: 2029 500 0 0 500 0 500
12: 2030 500 0 0 500 0 500
13: 2031 500 0 0 500 0 500
14: 2032 500 0 0 500 0 500
15: 2033 500 0 0 500 0 500
16: 2034 500 0 0 500 0 500
17: 2035 500 120 0 500 0 500
18: 2036 500 200 0 380 -120 380
19: 2037 500 80 0 180 -200 180
20: 2038 500 100 0 100 -80 100
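Why this works: unrolling the recurrence gives Calc[i] = Area[1] + the sum of (Rep[k] - Cos[k-1]) for k = 2..i, which is a cumulative sum. One caveat is that the one-liner also adds Rep[1] into the first row, so it matches the stated rule only when Rep[1] is 0, as it is in the example. The base R sketch below uses made-up toy vectors to show the difference; cumsum_calc is a hypothetical name, not a data.table function.

```r
# cumsum_calc is a hypothetical helper that pins the first value to Area[1]
cumsum_calc <- function(Area, Cos, Rep) {
  n <- length(Area)
  if (n == 1) return(Area[1])
  # Increment for rows 2..n: Rep[i] - Cos[i - 1]
  step <- Rep[-1] - Cos[-n]
  c(Area[1], Area[1] + cumsum(step))
}

# Toy input (made up) where Rep[1] is non-zero
Area <- c(100, 100, 100, 100)
Cos  <- c(10, 0, 20, 5)
Rep  <- c(7, 0, 10, 0)

pinned <- cumsum_calc(Area, Cos, Rep)

# Same arithmetic as the one-liner: Rep[1] leaks into every row
shifted_cos <- c(0, Cos[-length(Cos)])
one_liner <- Area[1] + cumsum(Rep - shifted_cos)
```

Here pinned follows the two rules exactly, while one_liner is offset by Rep[1] in every row. Since the helper is vectorised, it should also be usable inside a data.table call such as DT[, newCalc := cumsum_calc(Area, Cos, Rep)].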
Answer 1 (score: 1)
We can use Reduce with accumulate = TRUE:
DT[, newCalc := Reduce(`+`, Rep - shift(Cos, fill = 0),
                       init = Area[1], accumulate = TRUE)[-1]]
DT
# Year Area Cos Rep Calc newCalc
# 1: 2019 500 0 0 500 500
# 2: 2020 500 0 0 500 500
# 3: 2021 500 0 0 500 500
# 4: 2022 500 150 0 500 500
# 5: 2023 500 0 150 500 500
# 6: 2024 500 0 0 500 500
# 7: 2025 500 0 0 500 500
# 8: 2026 500 0 0 500 500
# 9: 2027 500 350 0 500 500
#10: 2028 500 0 350 500 500
#11: 2029 500 0 0 500 500
#12: 2030 500 0 0 500 500
#13: 2031 500 0 0 500 500
#14: 2032 500 0 0 500 500
#15: 2033 500 0 0 500 500
#16: 2034 500 0 0 500 500
#17: 2035 500 120 0 500 500
#18: 2036 500 200 0 380 380
#19: 2037 500 80 0 180 180
#20: 2038 500 100 0 100 100
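The trailing [-1] is needed because, with init supplied and accumulate = TRUE, Reduce returns the initial value followed by every running total, so the result has one more element than the input. A small base R illustration with made-up numbers:

```r
x <- c(1, 2, 3)

# With init, the accumulated result starts with init itself
with_init <- Reduce(`+`, x, init = 10, accumulate = TRUE)

# Dropping the first element leaves one running total per input element
running <- with_init[-1]

# For plain addition this is the same as init plus a cumulative sum
same_as_cumsum <- isTRUE(all.equal(running, 10 + cumsum(x)))
```

For simple addition cumsum does the same job, so Reduce is mainly worth reaching for when the step function is something other than a sum.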
The same can be done with accumulate from the tidyverse (purrr):
library(tidyverse)
DT %>%
  mutate(newCalc = accumulate(Rep - lag(Cos, default = 0),
                              .init = first(Area), `+`)[-1])