Why are the regression residuals from a regression model with ARIMA errors different from the residuals of a linear regression model?

Time: 2019-12-19 14:11:17

Tags: time-series regression arima forecast fable-r

Let's start by loading the fpp3 package (https://github.com/robjhyndman/fpp3-package) and the US consumption expenditure dataset (https://rdrr.io/cran/fpp3/man/us_change.html).

library(fpp3)

us_change <- readr::read_csv("https://otexts.com/fpp3/extrafiles/us_change.csv") %>%
  mutate(Time = yearquarter(Time)) %>%
  as_tsibble(index = Time)

Suppose we want to forecast changes in consumption from changes in income, so we use income as the predictor.

Let's start with a simple linear regression model:

fit_lm <- us_change %>% model(TSLM(Consumption ~ Income))

Model report:

> report(fit_lm)
Series: Consumption 
Model: TSLM 

Residuals:
     Min       1Q   Median       3Q      Max 
-2.40845 -0.31816  0.02558  0.29978  1.45157 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.54510    0.05569   9.789  < 2e-16 ***
Income       0.28060    0.04744   5.915 1.58e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6026 on 185 degrees of freedom
Multiple R-squared: 0.159,  Adjusted R-squared: 0.1545
F-statistic: 34.98 on 1 and 185 DF, p-value: 1.5774e-08

Sample of the residuals:

> residuals(fit_lm)
# A tsibble: 187 x 3 [1Q]
# Key:       .model [1]
   .model                        Time  .resid
   <chr>                        <qtr>   <dbl>
 1 TSLM(Consumption ~ Income) 1970 Q1 -0.202 
 2 TSLM(Consumption ~ Income) 1970 Q2 -0.413 
 3 TSLM(Consumption ~ Income) 1970 Q3 -0.104 
 4 TSLM(Consumption ~ Income) 1970 Q4 -0.748 
 5 TSLM(Consumption ~ Income) 1971 Q1  0.795 
 6 TSLM(Consumption ~ Income) 1971 Q2 -0.0392
 7 TSLM(Consumption ~ Income) 1971 Q3  0.100 
 8 TSLM(Consumption ~ Income) 1971 Q4  0.778 
 9 TSLM(Consumption ~ Income) 1972 Q1  0.640 
10 TSLM(Consumption ~ Income) 1972 Q2  1.06  
# ... with 177 more rows
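
Before changing models, the linear regression residuals above can be checked for autocorrelation. The following is a minimal sketch, assuming the `features()` and `ljung_box()` functions that fpp3 loads (from fabletools and feasts) and the `.resid` column shown in the output above; the choice of `lag = 8` (two years of quarterly data) is an arbitrary assumption, not something taken from the question.

```r
library(fpp3)

# Same data and model as in the question
us_change <- readr::read_csv("https://otexts.com/fpp3/extrafiles/us_change.csv") %>%
  mutate(Time = yearquarter(Time)) %>%
  as_tsibble(index = Time)

fit_lm <- us_change %>% model(TSLM(Consumption ~ Income))

# Ljung-Box test on the TSLM residuals.
# lag = 8 is an illustrative choice (assumption), not a recommendation.
lb_lm <- residuals(fit_lm) %>%
  features(.resid, ljung_box, lag = 8)

print(lb_lm)
```

A small p-value in `lb_pvalue` would be consistent with autocorrelated errors, which is the motivation for the ARIMA-error model fitted next.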

Now suppose the regression errors are autocorrelated, so we use a regression model with ARIMA errors:

fit_reg_arima_errs <- us_change %>% model(ARIMA(Consumption ~ Income))

Model report:

> report(fit_reg_arima_errs)
Series: Consumption 
Model: LM w/ ARIMA(1,0,2) errors 

Coefficients:
         ar1      ma1     ma2  Income  intercept
      0.6922  -0.5758  0.1984  0.2028     0.5990
s.e.  0.1159   0.1301  0.0756  0.0461     0.0884

sigma^2 estimated as 0.3219:  log likelihood=-156.95
AIC=325.91   AICc=326.37   BIC=345.29

In this case there are two kinds of residuals: the residuals from the regression model and the residuals from the ARIMA model.
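
For reference, the fitted "LM w/ ARIMA(1,0,2) errors" model reported above can be written out as follows. This is a sketch in the notation commonly used for such models; the symbols are assumptions introduced here, with $y_t$ standing for Consumption, $x_t$ for Income, $\eta_t$ for the regression errors and $\varepsilon_t$ for the ARIMA (innovation) errors.

```latex
\begin{aligned}
y_t    &= \beta_0 + \beta_1 x_t + \eta_t, \\
\eta_t &= \phi_1 \eta_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}.
\end{aligned}
```

Under this formulation the regression residuals are the estimates of $\eta_t$, that is $y_t - \hat\beta_0 - \hat\beta_1 x_t$, and the ARIMA residuals are the estimates of $\varepsilon_t$.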

Sample of the regression residuals:

> regression_errors = residuals(fit_reg_arima_errs, type="regression") %>% print()
# A tsibble: 187 x 3 [1Q]
# Key:       .model [1]
   .model                         Time .resid
   <chr>                         <qtr>  <dbl>
 1 ARIMA(Consumption ~ Income) 1970 Q1  0.616
 2 ARIMA(Consumption ~ Income) 1970 Q2  0.460
 3 ARIMA(Consumption ~ Income) 1970 Q3  0.877
 4 ARIMA(Consumption ~ Income) 1970 Q4 -0.274
 5 ARIMA(Consumption ~ Income) 1971 Q1  1.90 
 6 ARIMA(Consumption ~ Income) 1971 Q2  0.912
 7 ARIMA(Consumption ~ Income) 1971 Q3  0.795
 8 ARIMA(Consumption ~ Income) 1971 Q4  1.65 
 9 ARIMA(Consumption ~ Income) 1972 Q1  1.31 
10 ARIMA(Consumption ~ Income) 1972 Q2  1.89 
# ... with 177 more rows

Sample of the ARIMA residuals:

> ARIMA_errors = residuals(fit_reg_arima_errs, type="innovation") %>% print()
# A tsibble: 187 x 3 [1Q]
# Key:       .model [1]
   .model                         Time  .resid
   <chr>                         <qtr>   <dbl>
 1 ARIMA(Consumption ~ Income) 1970 Q1 -0.167 
 2 ARIMA(Consumption ~ Income) 1970 Q2 -0.320 
 3 ARIMA(Consumption ~ Income) 1970 Q3  0.0720
 4 ARIMA(Consumption ~ Income) 1970 Q4 -0.694 
 5 ARIMA(Consumption ~ Income) 1971 Q1  1.05  
 6 ARIMA(Consumption ~ Income) 1971 Q2  0.142 
 7 ARIMA(Consumption ~ Income) 1971 Q3 -0.0525
 8 ARIMA(Consumption ~ Income) 1971 Q4  0.695 
 9 ARIMA(Consumption ~ Income) 1972 Q1  0.469 
10 ARIMA(Consumption ~ Income) 1972 Q2  0.788 
# ... with 177 more rows

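Shouldn't the regression residuals of the latter model be the same as (or at least similar to) the residuals of the first linear regression model?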

Why do the regression residuals of the regression with ARIMA errors coincide with the values of the response variable (Consumption)?

> us_change[,c("Time","Consumption")]
# A tsibble: 187 x 2 [1Q]
      Time Consumption
     <qtr>       <dbl>
 1 1970 Q1       0.616
 2 1970 Q2       0.460
 3 1970 Q3       0.877
 4 1970 Q4      -0.274
 5 1971 Q1       1.90 
 6 1971 Q2       0.912
 7 1971 Q3       0.795
 8 1971 Q4       1.65 
 9 1972 Q1       1.31 
10 1972 Q2       1.89 
# ... with 177 more rows
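
One way to cross-check the `type="regression"` output is to compute the regression residuals by hand as Consumption minus the fitted regression part. The sketch below assumes that `tidy()` on the fitted model returns `term` and `estimate` columns with terms named `intercept` and `Income`, matching the coefficient names in the report above; the names `b0`, `b1`, `check` and `manual_reg_resid` are hypothetical helpers introduced for illustration.

```r
library(fpp3)

# Same data and model as in the question
us_change <- readr::read_csv("https://otexts.com/fpp3/extrafiles/us_change.csv") %>%
  mutate(Time = yearquarter(Time)) %>%
  as_tsibble(index = Time)

fit_reg_arima_errs <- us_change %>% model(ARIMA(Consumption ~ Income))

# Pull the regression coefficients out of the fitted model by name
coefs <- tidy(fit_reg_arima_errs)
b0 <- coefs$estimate[coefs$term == "intercept"]
b1 <- coefs$estimate[coefs$term == "Income"]

# If the automatically selected model has no intercept term, treat it as zero
if (length(b0) == 0) b0 <- 0

# Hand-computed regression residuals: response minus the regression part
check <- us_change %>%
  mutate(manual_reg_resid = Consumption - b0 - b1 * Income) %>%
  select(Time, Consumption, manual_reg_resid)

print(check)
```

If the hand-computed column differs from the values returned by `residuals(fit_reg_arima_errs, type="regression")`, that would suggest the function is not returning the regression part's residuals in the installed package version. Note that the automatically selected ARIMA orders, and therefore the coefficients, may vary between package versions.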

What am I missing?

The example code is adapted from "Forecasting: Principles and Practice" by Hyndman, Robin John and Athanasopoulos, George (https://otexts.com/fpp3/).

0 answers:

No answers yet.