Clustered Standard Error for Zero-Inflated Negative Binomial model

Asked: 2016-02-12 20:58:42

Tags: r

I would like to compute clustered standard errors for a zero-inflated negative binomial model. By default, zeroinfl (from the pscl package) returns standard errors derived from the Hessian matrix returned by optim, e.g.:

library(pscl)
data("bioChemists", package = "pscl")
dim(bioChemists)
head(bioChemists)
## default start values
fm1 <- zeroinfl(art ~ ., data = bioChemists, dist = "negbin")
summary(fm1)

Is there a way to use an asymmetrical/symmetrical distance matrix between observations, or to use one of the variables (e.g. kid5 in the toy dataset), to compute clustered standard errors?

I found this in an answer on Stack Exchange, but I am not sure how/whether it can be used with zero-inflated models. The equivalent in Stata would probably be zinb with the vce(cluster clustvar) option: http://www.stata.com/manuals13/rzinb.pdf .

Any ideas?

1 answer:

Answer 0 (score: 4)

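The development version of the sandwich package on R-Forge has been extended to allow object-oriented computation of clustered covariances. This also supports zero-inflated regression models. You can install the devel version from R-Forge with: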

install.packages("sandwich", repos = "http://R-Forge.R-project.org")
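As far as I know, vcovCL() has since been included in the regular CRAN releases of sandwich (2.4-0 and later), so the R-Forge install may no longer be needed. The following is a small sketch, not part of the original answer, that checks whether the sandwich version already installed exports vcovCL(); the variable name has_vcovCL is only illustrative.

```r
# Check whether an installed sandwich package already exports vcovCL().
# requireNamespace() returns FALSE (rather than failing) if sandwich is missing.
has_vcovCL <- requireNamespace("sandwich", quietly = TRUE) &&
  "vcovCL" %in% getNamespaceExports("sandwich")

if (!has_vcovCL) {
  message("vcovCL() not found: install or update the sandwich package first.")
}
```

If has_vcovCL is TRUE, the code in the rest of this answer should run with the installed version.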

Then load all required packages. The lmtest package is used for the coeftest() function, into which a covariance matrix estimate can be plugged.

library("pscl")
library("sandwich")
library("lmtest")

The illustrative model you used is the following.

data("bioChemists", package = "pscl")
fm1 <- zeroinfl(art ~ ., data = bioChemists, dist = "negbin")

By default, the coeftest() function returns the same marginal Wald tests as summary().

coeftest(fm1)
## t test of coefficients:
## 
##                      Estimate  Std. Error t value  Pr(>|t|)    
## count_(Intercept)  0.41674653  0.14359655  2.9022  0.003796 ** 
## count_femWomen    -0.19550683  0.07559256 -2.5863  0.009856 ** 
## count_marMarried   0.09758263  0.08445195  1.1555  0.248199    
## count_kid5        -0.15173246  0.05420606 -2.7992  0.005233 ** 
## count_phd         -0.00070013  0.03626966 -0.0193  0.984603    
## count_ment         0.02478620  0.00349267  7.0966 2.587e-12 ***
## zero_(Intercept)  -0.19168829  1.32281889 -0.1449  0.884815    
## zero_femWomen      0.63593320  0.84891762  0.7491  0.453986    
## zero_marMarried   -1.49946849  0.93867060 -1.5974  0.110518    
## zero_kid5          0.62842720  0.44278263  1.4193  0.156166    
## zero_phd          -0.03771474  0.30800817 -0.1224  0.902572    
## zero_ment         -0.88229322  0.31622813 -2.7901  0.005381 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

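This can then easily be extended to use a clustered covariance matrix estimate via the vcovCL() function. Here, the kid5 variable is used for clustering, as you suggested. (Note, in case anyone else is reading this: kid5 is used only to show that the approach "works"; it is not a sensible clustering variable in this application.)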

coeftest(fm1, vcov = vcovCL(fm1, cluster = bioChemists$kid5))
## t test of coefficients:
## 
##                      Estimate  Std. Error  t value  Pr(>|t|)    
## count_(Intercept)  0.41674653  0.17009748   2.4500   0.01447 *  
## count_femWomen    -0.19550683  0.01701325 -11.4914 < 2.2e-16 ***
## count_marMarried   0.09758263  0.02401883   4.0628 5.272e-05 ***
## count_kid5        -0.15173246  0.03612916  -4.1997 2.938e-05 ***
## count_phd         -0.00070013  0.04852615  -0.0144   0.98849    
## count_ment         0.02478620  0.00263208   9.4170 < 2.2e-16 ***
## zero_(Intercept)  -0.19168829  0.51865043  -0.3696   0.71177    
## zero_femWomen      0.63593320  0.87775846   0.7245   0.46895    
## zero_marMarried   -1.49946849  1.03481783  -1.4490   0.14768    
## zero_kid5          0.62842720  0.35073624   1.7917   0.07351 .  
## zero_phd          -0.03771474  0.13873870  -0.2718   0.78581    
## zero_ment         -0.88229322  0.07481264 -11.7934 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
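To make clearer what a clustered covariance of this kind involves, here is a minimal sketch of the sandwich computation done by hand. It assumes a CRAN version of sandwich that provides bread() and estfun() methods for zeroinfl objects, and it uses type = "HC0" with cadjust = FALSE so that no finite-sample adjustments are applied; the defaults used in the call above do apply such adjustments, so those standard errors differ slightly. The object names (ef, ef_cl, meat_cl, V_manual, V_pkg) are illustrative only.

```r
library("pscl")
library("sandwich")

data("bioChemists", package = "pscl")
fm1 <- zeroinfl(art ~ ., data = bioChemists, dist = "negbin")

# Per-observation score contributions (n x k matrix)
ef <- estfun(fm1)
n  <- nrow(ef)

# Sum the scores within each cluster (here: each distinct value of kid5)
ef_cl <- rowsum(ef, bioChemists$kid5)

# "Meat": cross-product of the cluster-level score sums, scaled by n
meat_cl <- crossprod(ef_cl) / n

# "Bread": scaled inverse Hessian as provided by sandwich
B <- bread(fm1)

# Sandwich estimator without finite-sample adjustments
V_manual <- B %*% meat_cl %*% B / n

# The same quantity via vcovCL(), with the adjustments switched off
V_pkg <- vcovCL(fm1, cluster = bioChemists$kid5,
                type = "HC0", cadjust = FALSE)

# Compare the two matrices up to numerical tolerance
all.equal(V_manual, V_pkg, check.attributes = FALSE)
```

The clustered standard errors are then sqrt(diag(V_manual)). With only a handful of distinct kid5 values there are very few clusters, which is another reason to treat the results above as a demonstration rather than a substantive analysis.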