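I have searched a lot but cannot find any relevant documentation. I am trying to estimate a feasible generalized least squares (FGLS) model for cross-sectional time-series data in R. For example: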
library(nlme)
foo <- gls(Y ~ factor(panel_ID) + X1 + X2, data = myData,
correlation=corARMA(p=1), method='ML', na.action=na.pass)
When I run it (my data frame is very large, which is why I am not including it here), I get the following error:
# Error in array(c(X, y), c(N, ncol(X) + 1), list(row.names(dataMod), c(colnames(X), :
#   length of 'dimnames' [1] not equal to array extent
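Is anyone familiar enough with the inner workings of the gls function or the nlme package to tell me what I am doing wrong here? Or can anyone suggest another way to approach this problem (I have also tried the plm package)?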
Answer 0 (score: 1)
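Credit to Ben Bolker.
The main cause is NA values in the data: with na.action=na.pass, rows containing NA are passed through to the fitting code instead of being dropped. See the simulation below: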
library(nlme)
# Simulate a small data set with three panels
n <- 100
myData <- data.frame(panel_ID = sample(letters[1:3], n, replace = TRUE),
                     X1 = rnorm(n), X2 = rnorm(n), Y = rnorm(n))
# Introduce an NA into X1 at row 10
myData$X1[10] <- NA
foo <- gls(Y ~ factor(panel_ID) + X1 + X2, data = myData,
correlation=corARMA(p=1), method='ML', na.action=na.pass)
This raises the error:
# Error in array(c(X, y), c(N, ncol(X) + 1L), list(row.names(dataMod), c(colnames(X), :
#   length of 'dimnames' [1] not equal to array extent
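Before changing the model call, it can help to confirm that missing values are actually present in the variables used by the formula. The following is a minimal sketch on the simulated data; `model_vars`, `na_counts`, and `incomplete_rows` are names introduced here for illustration only.

```r
# Rebuild the simulated data; the NA in row 10 is introduced deliberately.
n <- 100
myData <- data.frame(panel_ID = sample(letters[1:3], n, replace = TRUE),
                     X1 = rnorm(n), X2 = rnorm(n), Y = rnorm(n))
myData$X1[10] <- NA

# Variables that appear in the model formula (illustrative helper vector).
model_vars <- c("Y", "panel_ID", "X1", "X2")

# Number of missing values in each model variable.
na_counts <- colSums(is.na(myData[, model_vars]))
print(na_counts)

# Row indices with at least one missing value in a model variable.
incomplete_rows <- which(!complete.cases(myData[, model_vars]))
print(incomplete_rows)
```

On a large data frame, the per-column counts show which variables are affected, and the row indices show where the gaps are.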
To get rid of the error, remove the rows that contain NA, and the model fits:
# remove NAs
myData <- myData[!is.na(myData$X1), ]
foo <- gls(Y ~ factor(panel_ID) + X1 + X2, data = myData,
correlation=corARMA(p=1), method='ML', na.action=na.pass)
summary(foo)
Output:
Generalized least squares fit by maximum likelihood
  Model: Y ~ factor(panel_ID) + X1 + X2
  Data: myData
       AIC      BIC    logLik
  280.8763 299.0421 -133.4382

Correlation Structure: AR(1)
 Formula: ~1
 Parameter estimate(s):
       Phi
-0.3496918

Coefficients:
                        Value  Std.Error    t-value p-value
(Intercept)        0.21510948 0.14041692  1.5319343  0.1289
factor(panel_ID)b -0.27337750 0.25997687 -1.0515455  0.2957
factor(panel_ID)c -0.21930200 0.19704831 -1.1129352  0.2686
X1                -0.00604318 0.09469452 -0.0638177  0.9493
X2                 0.23870397 0.09754513  2.4471130  0.0163

 Correlation:
                  (Intr) fctr(pnl_ID)b fctr(pnl_ID)c X1
factor(panel_ID)b -0.649
factor(panel_ID)c -0.787  0.443
X1                -0.065  0.148         0.044
X2                -0.094  0.021        -0.011         0.117

Standardized residuals:
        Min          Q1         Med          Q3         Max
-2.07929137 -0.77670150 -0.01062337  0.52685034  2.43978797

Residual standard error: 0.9935003
Degrees of freedom: 99 total; 94 residual
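Filtering on X1 alone only works when X1 is the only variable with missing values. A possible generalization, sketched here under the same simulated setup, is either to filter on every model variable with complete.cases() or to let gls drop incomplete rows itself via na.action = na.omit. The seed and the names `model_vars`, `cleanData`, `fit_clean`, and `fit_omit` are assumptions made for this illustration, not part of the original post.

```r
library(nlme)

set.seed(1)  # arbitrary seed, only so this sketch is reproducible
n <- 100
myData <- data.frame(panel_ID = sample(letters[1:3], n, replace = TRUE),
                     X1 = rnorm(n), X2 = rnorm(n), Y = rnorm(n))
myData$X1[10] <- NA

# Option 1: drop rows that are incomplete in any model variable, then fit.
model_vars <- c("Y", "panel_ID", "X1", "X2")
cleanData <- myData[complete.cases(myData[, model_vars]), ]
fit_clean <- gls(Y ~ factor(panel_ID) + X1 + X2, data = cleanData,
                 correlation = corARMA(p = 1), method = "ML")

# Option 2: keep the data as is and let gls omit incomplete rows.
fit_omit <- gls(Y ~ factor(panel_ID) + X1 + X2, data = myData,
                correlation = corARMA(p = 1), method = "ML",
                na.action = na.omit)

# Both fits use the same 99 complete rows; compare the coefficients.
print(all.equal(coef(fit_clean), coef(fit_omit)))
```

Note that in both variants the AR(1) structure follows the row order of the remaining data, so a dropped row makes its two neighbours adjacent. Whether that is acceptable depends on the data.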