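I'm having an issue with the RcppRoll package. I want to use it to sum values over the past 3 months, but sometimes one or more months of data are missing. "n = 3" looks at the last three observations, not the last three months. I couldn't find a reliable solution, so I'm trying my luck here. Thanks in advance for any suggestions.
P.S. I'd prefer to stick with data.table and RcppRoll, since my dataset is large and I'm familiar with both.
Code: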
library("data.table")
library("RcppRoll")
test = data.table(id = rep(1, 8),date = c("2015-01","2015-02","2015-03","2015-04","2015-08","2015-09","2015-10","2015-11"), value = 1:8)
test = test[, var:= roll_sumr(value, n = 3, na.rm = TRUE), by = id]
id date value var
1: 1 2015-01 1 NA
2: 1 2015-02 2 NA
3: 1 2015-03 3 6
4: 1 2015-04 4 9
5: 1 2015-08 5 12
6: 1 2015-09 6 15
7: 1 2015-10 7 18
8: 1 2015-11 8 21
Expected output:
prefered_outcome = data.table(id = rep(1, 8),date = c("2015-01","2015-02","2015-03","2015-04","2015-08","2015-09","2015-10","2015-11"), value = 1:8,var = c(NA, NA, 6, 9, NA, NA, 18, 21))
id date value var
1: 1 2015-01 1 NA
2: 1 2015-02 2 NA
3: 1 2015-03 3 6
4: 1 2015-04 4 9
5: 1 2015-08 5 NA
6: 1 2015-09 6 NA
7: 1 2015-10 7 18
8: 1 2015-11 8 21
Answer 0 (score: 1)
Define ym of class yearmon and check whether the previous ym and the one before it are one month and two months earlier, respectively. If so, use roll_sumr; otherwise use NA.
library(zoo)
ym <- test[, as.yearmon(date)]
test[, roll := ifelse(ym - 1/12 == shift(ym) & ym - 2/12 == shift(ym, 2),
                      roll_sumr(value, 3, na.rm = TRUE), NA), by = id ]
giving:
> test
id date value roll
1: 1 2015-01 1 NA
2: 1 2015-02 2 NA
3: 1 2015-03 3 6
4: 1 2015-04 4 9
5: 1 2015-08 5 NA
6: 1 2015-09 6 NA
7: 1 2015-10 7 18
8: 1 2015-11 8 21
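The check in this answer compares fractional yearmon values for exact equality and builds ym outside the by = id grouping, which may be fragile with several ids. Below is a minimal sketch of the same idea using an integer month index instead. It assumes date is always a "YYYY-MM" string and that each id has at most one row per month, sorted by date; the helper column m is a name made up for this illustration.

```r
library(data.table)
library(RcppRoll)

test <- data.table(
  id    = rep(1, 8),
  date  = c("2015-01", "2015-02", "2015-03", "2015-04",
            "2015-08", "2015-09", "2015-10", "2015-11"),
  value = 1:8
)

# Hypothetical helper column: months counted as year * 12 + month,
# so consecutive months differ by exactly 1 (no floating-point comparison).
test[, m := as.integer(substr(date, 1, 4)) * 12L +
            as.integer(substr(date, 6, 7))]

# Under the assumptions above, the current row and the two rows before it
# cover three consecutive months exactly when m - shift(m, 2) equals 2.
# shift() runs inside by = id, so the check never crosses ids.
test[, roll := ifelse(m - shift(m, 2) == 2L,
                      roll_sumr(value, n = 3),
                      NA_real_),
     by = id]

test[, m := NULL]  # drop the helper column
test
```

On the sample data this should give NA for 2015-08 and 2015-09 and the three-month sums elsewhere, matching the expected output in the question.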
Answer 1 (score: 0)
You can add the missing months first and then apply the function. Afterwards, the added months can be removed again.
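library(data.table)
library(RcppRoll)
library(zoo)
test = data.table(id = rep(1, 8),date = c("2015-01","2015-02","2015-03","2015-04","2015-08","2015-09","2015-10","2015-11"), value = 1:8)
test[, date := as.yearmon(date)]
# Build the full month sequence per id (assumes rows are sorted by date within id),
# so the added rows carry an id and fall into the right group
allMonths <- test[, .(date = as.yearmon(seq(as.Date(date[1]), as.Date(date[.N]), by = "month"))), by = id]
df3 <- merge(test, allMonths, by = c("id", "date"), all = TRUE)
# na.rm = FALSE: any window containing an added (missing) month yields NA
df3[, var := roll_sumr(value, n = 3, na.rm = FALSE), by = id]
# Remove the added months again
df3 <- df3[!is.na(value)]
df3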