分解时间序列数据

时间:2016-10-18 08:07:00

标签: r stl forecasting

任何人都可以帮助解密ucm的输出。我的主要目标是检查ts数据是否是季节性的。但是我不能每次都能看到和看。我需要自动化整个过程并提供季节性指标。

我想了解以下输出

ucmxmodel$s.season

# Time Series:
#     Start = c(1, 1) 
#     End = c(4, 20) 
#     Frequency = 52 
#       [1] -2.391635076 -2.127871717 -0.864021134  0.149851212 -0.586660213 -0.697838635 -0.933982269  0.954491859 -1.531715424 -1.267769820 -0.504165631
#      [12] -1.990792301  1.273673437  1.786860414  0.050859315 -0.685677002 -0.921831488 -1.283081922 -1.144376739 -0.964042949 -1.510837956  1.391991657
#      [23] -0.261175626  5.419494363  0.543898305  0.002548125  1.126895943  1.474427901  2.154721023  2.501352782  0.515453691 -0.470886132  1.209419689

ucmxmodel$vs.season

# [1] 1.375832 1.373459 1.371358 1.369520 1.367945 1.366632 1.365582 1.364795 1.364270 1.364007 1.364007 1.364270 1.364795 1.365582 1.366632 1.367945
# [17] 1.369520 1.371358 1.373459 1.375816 1.784574 1.784910 1.785223 1.785514 1.785784 1.786032 1.786258 1.786461 1.786643 1.786802 1.786938 1.787052
# [33] 1.787143 1.787212 1.787257 1.787280 1.787280 1.787257 1.787212 1.787143 1.787052 1.786938 1.786802 1.786643 1.786461 1.786258 1.786032 1.785784
# [49] 1.785514 1.785223 1.784910 1.784578 1.375641 1.373276 1.371175 1.369337 1.367762 1.366449 1.365399 1.364612 1.364087 1.363824 1.363824 1.364087
# [65] 1.364612 1.365399 1.366449 1.367762 1.369337 1.371175 1.373276 1.375636 1.784453 1.784788 1.785101 1.785392 1.785662 1.785910 1.786136 1.786339

ucmxmodel$est.var.season

# Season_Variance 
#     0.0001831373 

如何在不查看图表的情况下使用上述信息来确定季节性以及在什么级别(每周,每月,每季度或每年)?

另外,我在est

中得到NULL
ucmxmodel$est

# NULL

数据

测试数据为:

structure(c(44, 81, 99, 25, 69, 42, 6, 25, 75, 90, 73, 65, 55, 
9, 53, 43, 19, 28, 48, 71, 36, 1, 66, 46, 55, 56, 100, 89, 29, 
93, 55, 56, 35, 87, 77, 88, 18, 32, 6, 2, 15, 36, 48, 80, 48, 
2, 22, 2, 97, 14, 31, 54, 98, 43, 62, 94, 53, 17, 45, 92, 98, 
7, 19, 84, 74, 28, 11, 65, 26, 97, 67, 4, 25, 62, 9, 5, 76, 96, 
2, 55, 46, 84, 11, 62, 54, 99, 84, 7, 13, 26, 18, 42, 72, 1, 
83, 10, 6, 32, 3, 21, 100, 100, 98, 91, 89, 18, 88, 90, 54, 49, 
5, 95, 22), .Tsp = c(1, 3.15384615384615, 52), class = "ts")

structure(c(40, 68, 50, 64, 26, 44, 108, 90, 62, 60, 90, 64, 120, 82, 68, 60,
26, 32, 60, 74, 34, 16, 22, 44, 50, 16, 34, 26, 42, 14, 36, 24, 14, 16, 6, 6,
12, 20, 10, 34, 12, 24, 46, 30, 30, 46, 54, 42, 44, 42, 12, 52, 42, 66, 40,
60, 42, 44, 64, 96, 70, 52, 66, 44, 64, 62, 42, 86, 40, 56, 50, 50, 62, 22,
24, 14, 14, 18, 18, 10, 20, 10, 4, 18, 10, 10, 14, 20, 10, 32, 12, 22, 20, 20,
26, 30, 36, 28, 56, 34, 14, 54, 40, 30, 42, 36, 52, 30, 32, 52, 42, 62, 46,
64, 70, 48, 40, 64, 40, 120, 58, 36, 40, 34, 36, 26, 18, 28, 16, 32, 18, 12,
20), .Tsp = c(1, 4.36, 52), class = "ts")

2 个答案:

答案 0 :(得分:1)

我认为最直接的方法是遵循Rob Hyndman's approach(他是R中许多时间序列包的作者)。对于您的数据,它将如下工作,

require(fma)
# Create a model with multiplicative errors (see https://www.otexts.org/fpp/7/7).
fit1 <- stlf(test2)
# Create a model with additive errors.
fit2 <- stlf(data, etsmodel = "ANN")

deviance <- 2 * c(logLik(fit1$model) - logLik(fit2$model))
df <- attributes(logLik(fit1$model))$df - attributes(logLik(fit2$model))$df

# P-value
1 - pchisq(deviance, df)

# [1] 1

基于此分析,我们发现 p值为1会导致我们得出结论没有季节性。

答案 1 :(得分:0)

我非常喜欢R中提供的stl()函数。试试这个最小的例子:

# some random data
x <- rnorm(200)
# as a time series object
xt <- ts(x, frequency = 10)
# do the decomposition
xts <- stl(xt, s.window = "periodic")
# plot the results
plot(xts)

现在,您可以通过比较差异来估算“季节性”。

vars <- apply(xts$time.series, 2, var)
vars['seasonal'] / sum(vars)

现在,您可以将季节性方差作为分解后方差总和的一部分。

我强烈建议您阅读original paper,以便了解这里发生的事情。它很容易访问,我喜欢这种方法,因为它非常直观。