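Starting from an initial value, I want to iteratively fill the NAs in a data.table column, by id, using the growth rates stored in a separate column.
Take the following data.table as an example: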
library(data.table)
DT <- data.table(id = c("A","A","A","A","B","B","B","B"), date=1:4,
growth=1L+runif(8), index= c(NA,250,NA,NA,NA,300,NA,NA))
> DT
id date growth index
1: A 1 1.654628 NA
2: A 2 1.770219 250
3: A 3 1.255893 NA
4: A 4 1.185985 NA
5: B 1 1.826187 NA
6: B 2 1.055251 300
7: B 3 1.180389 NA
8: B 4 1.204108 NA
Basically, for each id, I need the index values after date 2 to be:
index_{i,t} = growth_{i,t} * index_{i,t-1}
And, for the values before date 2:
index_{i,t} = index_{i,t+1} / growth_{i,t+1}
I tried shift, but that only fills in the index at t + 1:
DT[, index := growth * shift(index,1L, type="lag")]
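To see why the shift attempt falls short, here is a minimal sketch for id A only, with the growth values copied from the first table above. shift() lags the original column once and does not feed newly computed values back in, so only the row directly after the known value gets a result.

```r
library(data.table)

# Growth values copied from the first table in the question (id A only).
DT_a <- data.table(id = "A", date = 1:4,
                   growth = c(1.654628, 1.770219, 1.255893, 1.185985),
                   index = c(NA, 250, NA, NA))

# Compute the shifted product without assigning it, so the original
# index column stays intact for comparison.
res <- DT_a[, growth * shift(index, 1L, type = "lag"), by = id]$V1
res
```

Only date 3 receives a value (250 * 1.255893). Date 4 stays NA because the lagged index at date 3 is still NA, and assigning this result with := would also overwrite the known 250 at date 2 with NA.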
Update: the desired result looks like
> DT
id date growth index
1: A 1 1.440548 141.2255
2: A 2 1.395092 250.0000
3: A 3 1.793094 313.9733
4: A 4 1.784224 372.3676
5: B 1 1.129264 284.2926
6: B 2 1.978359 300.0000
7: B 3 1.228979 354.1167
8: B 4 1.453433 426.3948
Answer 0 (score: 1)
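First, we'll define a function that takes two vectors, values and growths. It finds the non-NA value in values and determines the ratio of each element of values to that non-NA element by multiplying the growths that lie between them. Note that this does not handle the case of more than one non-NA value (only the first one is used), and it will throw an error if values contains only NAs. I leave the exception handling to you, since you know best what should happen there.
apply_growth <- function(values, growths) {
  # position of the first non-NA value, used as the anchor
  given <- which(!is.na(values))[1]
  # growth factor of each position relative to the anchor
  cumulative_growth <- vapply(
    X = seq_along(growths),
    FUN.VALUE = numeric(1),
    FUN = function(x) {
      if (x < given) {
        # before the anchor: divide by the growths up to and including the anchor
        1 / prod(growths[seq(x + 1, given)])
      } else if (x > given) {
        # after the anchor: multiply the growths that follow the anchor
        prod(growths[seq(given + 1, x)])
      } else if (x == given) {
        1
      }
    }
  )
  values[given] * cumulative_growth
}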
Now we'll apply it to each subgroup of DT. To be safe, we'll specify that the rows must be ordered by date.
DT[
  order(date),
  index := apply_growth(index, growth),
  by = id
]
DT
#    id date   growth    index
# 1:  A    1 1.993863 180.7514
# 2:  A    2 1.383115 250.0000
# 3:  A    3 1.350102 337.5256
# 4:  A    4 1.863802 629.0809
# 5:  B    1 1.664999 249.2398
# 6:  B    2 1.203660 300.0000
# 7:  B    3 1.595310 478.5931
# 8:  B    4 1.002311 479.6989
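As a possible alternative, the same fill can be sketched with cumprod instead of a per-element loop. This is only a sketch under the same assumptions as above: exactly one non-NA index per id, and rows processed in date order. The helper name fill_index is hypothetical, and the growth values are copied from the first table in the question so the result is reproducible.

```r
library(data.table)

# Hypothetical helper; assumes exactly one non-NA entry in `values`.
fill_index <- function(values, growths) {
  given <- which(!is.na(values))[1]
  cg <- cumprod(growths)
  # cg[t] / cg[given] is the product of the growths after the anchor up to t
  # when t > given, and the reciprocal of the product of the growths from
  # t + 1 up to the anchor when t < given; it is 1 at the anchor itself.
  values[given] * cg / cg[given]
}

# Growth values copied from the question's first table.
DT2 <- data.table(id = rep(c("A", "B"), each = 4), date = rep(1:4, 2),
                  growth = c(1.654628, 1.770219, 1.255893, 1.185985,
                             1.826187, 1.055251, 1.180389, 1.204108),
                  index = c(NA, 250, NA, NA, NA, 300, NA, NA))

DT2[order(date), index := fill_index(index, growth), by = id]
DT2
```

With these inputs the filled index column should agree with the desired result posted in the question, up to rounding. For very long series the cumulative product could overflow or underflow, in which case the loop-based version may be the safer choice.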