I am working with time series data using the R package "tempdisagg".
https://cran.r-project.org/web/packages/tempdisagg/tempdisagg.pdf
https://journal.r-project.org/archive/2013-2/sax-steiner.pdf
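With this package you can easily convert a low-frequency time series (yearly, in my case) into a high-frequency time series (monthly, in my case).
For example: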
library(tempdisagg)
year_sums <- c(400, 450, 500, 800, 1000, 600)
low_freq_ts = ts(year_sums, frequency = 1, start=2018, end=2023)
model <- td(low_freq_ts ~ 1, conversion = "sum", to = "monthly", method = "denton-cholette")
high_freq_ts = predict(model)
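As a quick sanity check on the step above (a minimal sketch; `annual_check` is a name introduced here for illustration), the monthly values can be summed back to annual totals with base R's `aggregate()`. With `conversion = "sum"` they should reproduce the yearly input:

```r
library(tempdisagg)

year_sums <- c(400, 450, 500, 800, 1000, 600)
low_freq_ts <- ts(year_sums, frequency = 1, start = 2018)
model <- td(low_freq_ts ~ 1, conversion = "sum", to = "monthly",
            method = "denton-cholette")
high_freq_ts <- predict(model)

# Sum the 12 monthly values of each year back into one annual value
annual_check <- aggregate(high_freq_ts, nfrequency = 1, FUN = sum)

# Should be TRUE up to floating-point tolerance
all.equal(as.numeric(annual_check), year_sums)
```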
My question is how to take an existing high-frequency time series into account.
For example:
historic_ts <- c(3, 5, 6, 7, 5, 9, 10, 14, 17, 20, 19, 22)
historic_high_freq_ts = ts(historic_ts, frequency = 12, start = c(2017,1), end = c(2017, 12))
or
historic_ts_2 <- c(3, 5, 6, 7, 5, 9, 10, 14, 17, 20, 19, 22, 20, 24, 27)
historic_high_freq_ts_2 = ts(historic_ts_2, frequency = 12, start = c(2017,1), end = c(2018, 3))
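In the end I would like to get a smooth high-frequency time series that contains the original high-frequency data, left untouched, followed by the disaggregated values.
If I simply concatenate the historic values and the predicted values, I get something I do not want: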
plot(ts(
  c(historic_high_freq_ts, high_freq_ts),
  frequency = 12,
  start = c(2017, 1),
  end = c(2023, 12)
))
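So in the end the disaggregation function should take the historic trend into account, and I do not know how to do that. Simply adding the 2017 total of 137 as an extra annual value gives a result that looks fine at first glance, but it causes problems later on: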
year_sums_2 <- c(sum(historic_ts), 400, 450, 500, 800, 1000, 600)
low_freq_ts_2 = ts(year_sums_2, frequency = 1, start=2017, end=2023)
model_2 <- td(low_freq_ts_2 ~ 1, conversion = "sum", to = "monthly", method = "denton-cholette")
high_freq_ts_2 = predict(model_2)
plot(high_freq_ts_2)
lines(historic_high_freq_ts)
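To make the mismatch visible in numbers rather than only in the plot, the disaggregated 2017 months can be compared with the observed ones. This is only a diagnostic sketch, not a solution; `fitted_2017` and `deviation` are names introduced here for illustration:

```r
library(tempdisagg)

historic_ts <- c(3, 5, 6, 7, 5, 9, 10, 14, 17, 20, 19, 22)
historic_high_freq_ts <- ts(historic_ts, frequency = 12, start = c(2017, 1))

year_sums_2 <- c(sum(historic_ts), 400, 450, 500, 800, 1000, 600)
low_freq_ts_2 <- ts(year_sums_2, frequency = 1, start = 2017)
model_2 <- td(low_freq_ts_2 ~ 1, conversion = "sum", to = "monthly",
              method = "denton-cholette")
high_freq_ts_2 <- predict(model_2)

# Disaggregated values for the months where real observations exist
fitted_2017 <- window(high_freq_ts_2, start = c(2017, 1), end = c(2017, 12))

# The annual total is preserved (137, up to floating-point error) ...
sum(fitted_2017)

# ... but the individual months are not tied to the observed values
deviation <- fitted_2017 - historic_high_freq_ts
round(deviation, 2)
```

The deviations sum to roughly zero because only the annual total is constrained, so the observed monthly values are not preserved.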
Does anyone know how to do this with the tempdisagg package or by any other method?
Thanks a lot!