ARIMA models in R

Time: 2015-08-12 00:27:32

Tags: r

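All, I am trying to use ARIMA models in R to support condition-based equipment maintenance. I have two tags in a data frame (dd), each representing a unique piece of equipment. The data comes from a SQL query and is structured as follows (first and last rows shown):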

                       tag                time    value
   1: GO.DIBTWS003_BATT_VOLT 2015-05-01 00:00:00 8.600000
18626  GO.MIPLES004_BATT_AVE 2015-08-06 00:00:00 7.700000

I would like to use the auto.arima function to help identify the best model for each of the two tags, like so:

temp <- setDT(dd)[, list(AR = list(auto.arima(dd$value))), by = tag]

However, the results do not look right. Referencing temp[[2]] gives the following:

[[1]]
Series: dd$value 
ARIMA(4,1,1)                    

Coefficients:
          ar1      ar2      ar3      ar4      ma1
      -0.0026  -0.0661  -0.0329  -0.0190  -0.6677
s.e.   0.0207   0.0148   0.0120   0.0101   0.0195

sigma^2 estimated as 0.0003371:  log likelihood=48026.14
AIC=-96040.28   AICc=-96040.28   BIC=-95993.29

[[2]]
Series: dd$value 
ARIMA(4,1,1)                    

Coefficients:
          ar1      ar2      ar3      ar4      ma1
      -0.0026  -0.0661  -0.0329  -0.0190  -0.6677
s.e.   0.0207   0.0148   0.0120   0.0101   0.0195

sigma^2 estimated as 0.0003371:  log likelihood=48026.14
AIC=-96040.28   AICc=-96040.28   BIC=-95993.29

The result is identical for every tag. Querying each tag on its own gives different coefficients:

Series: dd$value 
ARIMA(3,1,4) with drift         

Coefficients:
         ar1     ar2      ar3      ma1      ma2     ma3     ma4   drift
      0.0540  0.8200  -0.1115  -0.4868  -1.0131  0.5252  0.0924  -1e-04
s.e.  0.0681  0.0243   0.0520   0.0678   0.0342  0.0605  0.0330   1e-04

sigma^2 estimated as 0.0002066:  log likelihood=26292.23
AIC=-52566.47   AICc=-52566.45   BIC=-52502.21

and

Series: dd$value 
ARIMA(0,1,1)                    

Coefficients:
          ma1
      -0.8135
s.e.   0.0062

sigma^2 estimated as 0.0004347:  log likelihood=22828.25
AIC=-45652.5   AICc=-45652.5   BIC=-45638.22

I am new to R. Can anyone explain why this happens?

Edit: here is a minimal reproducible example:

                      tag                time value
1  GO.DIBTWS003_BATT_VOLT 2015-08-05 04:00:00  8.51
2  GO.DIBTWS003_BATT_VOLT 2015-08-05 08:00:00  8.51
3  GO.DIBTWS003_BATT_VOLT 2015-08-05 08:15:00  8.46
4  GO.DIBTWS003_BATT_VOLT 2015-08-05 08:30:00  8.51
5   GO.MIPLES004_BATT_AVE 2015-08-05 07:00:00  7.70
6   GO.MIPLES004_BATT_AVE 2015-08-05 08:30:00  7.70
7   GO.MIPLES004_BATT_AVE 2015-08-05 08:45:00  7.59
8   GO.MIPLES004_BATT_AVE 2015-08-05 09:00:00  7.66
9   GO.MIPLES004_BATT_AVE 2015-08-05 09:15:00  7.72
10  GO.MIPLES004_BATT_AVE 2015-08-05 09:30:00  7.72
11  GO.MIPLES004_BATT_AVE 2015-08-05 09:45:00  7.73

Applying temp <- setDT(dd)[, list(AR = list(auto.arima(dd$value))), by = tag]

gives the following result:

> temp[[2]]
[[1]]
Series: dd$value 
ARIMA(0,1,0)                    

sigma^2 estimated as 0.06818:  log likelihood=-0.76
AIC=3.52   AICc=4.02   BIC=3.83

[[2]]
Series: dd$value 
ARIMA(0,1,0)                    

sigma^2 estimated as 0.06818:  log likelihood=-0.76
AIC=3.52   AICc=4.02   BIC=3.83
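
A quick sanity check on the identical fits, offered as a sketch rather than a confirmed diagnosis: the reported sigma^2 of 0.06818 can be reproduced from all 11 values taken together. This assumes that, for ARIMA(0,1,0) without drift, the printed sigma^2 is the mean of the squared first differences (my reading of this output, not something taken from the forecast documentation; other package versions may use a different denominator). The vector below is retyped from the example table, and the names d and sigma2_all are mine.

```r
# Values from the 11-row example above, in row order (both tags concatenated).
value <- c(8.51, 8.51, 8.46, 8.51,                    # GO.DIBTWS003_BATT_VOLT
           7.70, 7.70, 7.59, 7.66, 7.72, 7.72, 7.73)  # GO.MIPLES004_BATT_AVE

# With ARIMA(0,1,0) and no drift, the residuals are the first differences.
d <- diff(value)

# Assumed estimator: mean of squared differences over all 10 differences.
sigma2_all <- sum(d^2) / length(d)
round(sigma2_all, 5)  # 0.06818, the value printed for both list elements

# The dominant term is the jump from the last row of one tag to the
# first row of the other (8.51 -> 7.70).
d[4]  # about -0.81
```

If that reading is right, both list elements were fitted to the whole column, which suggests that dd$value inside the grouped call is not being restricted to the current tag.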

Applying the same operation, this time to a single tag, yields the following:

                    tag                time value
1 GO.MIPLES004_BATT_AVE 2015-08-05 07:00:00  7.70
2 GO.MIPLES004_BATT_AVE 2015-08-05 08:30:00  7.70
3 GO.MIPLES004_BATT_AVE 2015-08-05 08:45:00  7.59
4 GO.MIPLES004_BATT_AVE 2015-08-05 09:00:00  7.66
5 GO.MIPLES004_BATT_AVE 2015-08-05 09:15:00  7.72
6 GO.MIPLES004_BATT_AVE 2015-08-05 09:30:00  7.72
7 GO.MIPLES004_BATT_AVE 2015-08-05 09:45:00  7.73

Series: dd$value 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
      intercept
         7.6886
s.e.     0.0172

sigma^2 estimated as 0.002069:  log likelihood=11.7
AIC=-19.4   AICc=-16.4   BIC=-19.51

Conversely, running the same script on the other tag by itself:

                    tag                time value
1 GO.DIBTWS003_BATT_VOLT 2015-08-05 04:00:00  8.51
2 GO.DIBTWS003_BATT_VOLT 2015-08-05 08:00:00  8.51
3 GO.DIBTWS003_BATT_VOLT 2015-08-05 08:15:00  8.46
4 GO.DIBTWS003_BATT_VOLT 2015-08-05 08:30:00  8.51

gives:

> temp[[2]]
[[1]]
Series: dd$value 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
      intercept
         8.4975
s.e.     0.0108

sigma^2 estimated as 0.0004688:  log likelihood=9.66
AIC=-15.31   AICc=-3.31   BIC=-16.54
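
A possible fix, sketched under a few assumptions: inside a grouped `j` expression, data.table exposes the current group's columns by bare name, so `value` is the per-tag subset, whereas `dd$value` looks up `dd` in the calling environment and returns the entire column every time. The data frame below is retyped from the minimal example with the time column dropped (auto.arima is only given the values here), and the helper table `sizes` is mine, added purely to show what each expression sees.

```r
library(data.table)
library(forecast)

# Minimal example data, retyped from the question (time column omitted).
dd <- data.frame(
  tag   = rep(c("GO.DIBTWS003_BATT_VOLT", "GO.MIPLES004_BATT_AVE"),
              times = c(4, 7)),
  value = c(8.51, 8.51, 8.46, 8.51,
            7.70, 7.70, 7.59, 7.66, 7.72, 7.72, 7.73)
)
setDT(dd)

# Compare what each expression sees inside a grouped j:
# dd$value is always the full column, value is the current group only.
sizes <- dd[, .(n_whole = length(dd$value), n_group = length(value)), by = tag]
sizes  # n_whole is 11 for both tags; n_group is 4 and 7

# Fit one model per tag by referring to the column by its bare name.
temp <- dd[, list(AR = list(auto.arima(value))), by = tag]
temp$AR[[1]]  # model for the first tag
temp$AR[[2]]  # model for the second tag
```

With this change each list element is fitted to its own tag's values, so the fits should no longer be identical. The exact orders auto.arima selects may differ from the single-tag output shown above depending on the forecast version.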

0 answers:

No answers yet