在r中不使用pacf()的部分自相关

时间:2015-02-02 20:38:30

标签: r time-series correlation

如果我认为我理解我想要验证的内容,那么在这种情况下我试图验证部分自相关的计算。 PACF()。

我最终得到的是一些不同的东西。我的理解是,在给定所有先前滞后的情况下,pacf将是最后/最远滞后的回归系数。因此,为了设置一些代码,我使用加拿大的就业数据和F. Diebold的预测元素(1998)第6章

    #Obtain Canadian Employment dataset
caemp <- c(83.090255, 82.7996338824, 84.6344380294, 85.3774583529, 86.197605, 86.5788438824, 88.0497240294, 87.9249263529, 88.465131, 88.3984638824, 89.4494320294, 90.5563753529, 92.272335, 92.1496788824, 93.9564890294, 94.8114863529, 96.583434, 96.9646728824, 98.9954360294, 101.138164353, 102.882122, 103.095394882, 104.006386029, 104.777404353, 104.701732, 102.563504882, 103.558486029, 102.985774353, 102.098281, 101.471734882, 102.550696029, 104.021564353, 105.093652, 105.194954882, 104.594266029, 105.813184353, 105.149642, 102.899434882, 102.354736029, 102.033974353, 102.014299, 101.835654882, 102.018806029, 102.733834353, 103.134062, 103.263354882, 103.866416029, 105.393274353, 107.081242, 108.414274882, 109.297286029, 111.495994353, 112.680072, 113.061304882, 112.376636029, 111.244054353, 107.305192, 106.678644882, 104.678246029, 105.729204353, 107.837082, 108.022364882, 107.281706029, 107.016934353, 106.045452, 106.370704882, 106.049966029, 105.841184353, 106.045452, 106.650644882, 107.393676029, 108.668584353, 109.628702, 110.261894882, 110.920946029, 110.740154353, 110.048622, 108.190324882, 107.057746029, 108.024724353, 109.712692, 111.409654882, 108.765396029, 106.289084353, 103.917902, 100.799874882, 97.3997700294, 93.2438143529, 94.123068, 96.1970798824, 97.2754290294, 96.4561423529, 92.674237, 92.8536228824, 93.4304540294, 93.2055593529, 93.955896, 94.7296738824, 95.5665510294, 95.5459793529, 97.09503, 97.7573598824, 96.1609430294, 96.5861653529, 103.874812, 105.094384882, 106.804276029, 107.786744353, 106.596022, 107.310354882, 106.897156029, 107.210924353, 107.134682, 108.829774882, 107.926196029, 106.298904353, 103.365872, 102.029554882, 99.3000760294, 95.3045073529, 90.50099, 88.0984848824, 86.5150710294, 85.1143943529, 89.033584, 88.8229008824, 88.2666710294, 87.7260053529, 88.102896, 87.6546968824, 88.4004090294, 88.3618013529, 89.031151, 91.0202948824, 91.6732820294, 92.0149173529)

# create time series with the canadian employment dataset
caemp.ts<-ts(caemp, start=c(1961, 1), end=c(1994, 4), frequency=4)
caemp.ts2<-window(caemp.ts,start=c(1961,5), end=c(1993,4))
# set up max lag the book says use sqrt(T) but in this case i'm using 3 for the example
lag.max <- 3
# R Code using pacf()
pacf(caemp.ts2, lag.max=3, plot=F)
# initialize vector to capture the partial autocorrelations
pauto.corr <- rep(0, lag.max)
# Set up lagged data frame
pa.mat <- as.data.frame(caemp.ts2)
for(i in 1:lag.max){
  a <- c(rep(NA, i), pa.mat[1:(length(caemp.ts2) - i),1])
  pa.mat <- cbind(pa.mat, a)
  }
names(pa.mat) <- c("0":lag.max)
# Set up my base linear model
base.lm <- lm(pa.mat[, 1] ~ 1)

### I could not get the for loop to work successfully here
i <- 1
base.lm <- update(base.lm, .~. + pa.mat[,2])
pauto.corr[i]<-base.lm$coefficients[length(base.lm$coefficients)]
i<-2
base.lm <-update(base.lm, .~. + pa.mat[,3])
pauto.corr[i]<-base.lm$coefficients[length(base.lm$coefficients)]
i<-3
base.lm <-update(base.lm, .~. + pa.mat[,4])
pauto.corr[i]<-base.lm$coefficients[length(base.lm$coefficients)]
# Compare results...
round(pauto.corr,3)
pacf(caemp.ts2, lag.max=3, plot=F)

输出

> round(pauto.corr,3)
[1]  0.971 -0.479 -0.072
> pacf(caemp.ts2, lag.max=3, plot=F)
Partial autocorrelations of series ‘caemp.ts2’, by lag
0.25   0.50   0.75 
0.949 -0.244 -0.100 

也许是因为我的例子是季度而不是月度数据,或者我可能只是错了?

0 个答案:

没有答案