Question

I am trying to estimate a standard tobit model that is censored at zero.

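Variables

Dependent variable: Happiness

Independent variables:

City (Chicago, New York),
Gender (Male, Female),
Employment (0 = unemployed, 1 = employed),
Worktype (Unemployed, Bluecolor, Whitecolor),
Holiday (Unemployed, 1 day a week, 2 day a week)

The 'Worktype' and 'Holiday' variables are interacted with the 'Employment' variable.

I ran the regression using the censReg package.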

censReg(Happiness ~ City + Gender + Employment:Worktype + Employment:Holiday)

However, summary() returns the following error.

Error in printCoefmat(coef(x, logSigma = logSigma), digits = digits) : 
  'x' must be coefficient matrix/data frame

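To find the cause, I ran an OLS regression.

There are some NA values, which I think come from the model design and the variable setup (some variables appear to cause singularities: people with 'Employment' = 0 have 'Worktype' = Unemployed and 'Holiday' = Unemployed. Could this be the reason?)

How can I ignore the NA values and run the tobit regression without errors?

The OLS call and its output are below.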

lm(Happiness ~ City + Gender + Employment:Worktype + Employment:Holiday)


Coefficients: (2 not defined because of singularities)
                               Estimate Std. Error t value Pr(>|t|)  
(Intercept)                      41.750      9.697   4.305   0.0499 *
CityNew York                    -44.500     11.197  -3.974   0.0579 .
Gender1                           2.750     14.812   0.186   0.8698  
Employment:WorktypeUnemployed        NA         NA      NA       NA  
Employment:WorktypeBluecolor     35.000     17.704   1.977   0.1867  
Employment:WorktypeWhitecolor   102.750     14.812   6.937   0.0202 *
Employment:Holiday1 day a week  -70.000     22.394  -3.126   0.0889 .
Employment:Holiday2 day a week       NA         NA      NA       NA

Answer 1

If you step through the censReg call in a debugger, you reach the following maxLik optimization:

result <- maxLik(censRegLogLikCross, start = start, 
      yVec = yVec, xMat = xMat, left = left, right = right, 
      obsBelow = obsBelow, obsBetween = obsBetween, obsAbove = obsAbove, 
      ...)

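The vector of initial values, start, which is determined by an OLS regression, contains NA for two coefficients, as you found:

Employment:WorktypeUnemployed
Employment:Holiday2 day a week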
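
To see where NA values like these come from, here is a minimal sketch on hypothetical toy data. The data frame d is invented for illustration and is not the asker's data; only the variable names and factor levels follow the question. It fits the same formula with lm and lists the coefficients that come back as NA:

```r
# Hypothetical toy data: invented for illustration, NOT the asker's data.
# Unemployed rows carry the "Unemployed" level in both Worktype and Holiday.
d <- data.frame(
  Happiness  = c(0, 0, 40, 75, 60, 140, 95, 30),
  City       = factor(c("Chicago", "New York", "Chicago", "Chicago",
                        "New York", "Chicago", "New York", "New York")),
  Gender     = factor(c(0, 1, 0, 1, 0, 1, 0, 1)),
  Employment = c(0, 0, 1, 1, 1, 1, 1, 1),
  Worktype   = factor(c("Unemployed", "Unemployed", "Bluecolor", "Bluecolor",
                        "Bluecolor", "Whitecolor", "Whitecolor", "Whitecolor"),
                      levels = c("Unemployed", "Bluecolor", "Whitecolor")),
  Holiday    = factor(c("Unemployed", "Unemployed", "1 day a week",
                        "2 day a week", "1 day a week", "2 day a week",
                        "1 day a week", "2 day a week"),
                      levels = c("Unemployed", "1 day a week", "2 day a week"))
)

ols <- lm(Happiness ~ City + Gender + Employment:Worktype + Employment:Holiday,
          data = d)

# Coefficients that lm could not estimate. A start vector built from these
# OLS coefficients inherits NA in exactly these positions.
na_coefs <- names(coef(ols))[is.na(coef(ols))]
print(na_coefs)
```

On data with this structure, the NA entries should be interaction terms of the kind listed above, whatever the Happiness values are, because the problem lies in the regressors and not in the response.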

This causes maxLik to return NULL with the error message:

Return code 100: Initial value out of range.

The summary function receives this NULL, which explains the final error message you get.

To work around this, you can set the start argument:

tobitreg <- censReg(formula = Happiness ~ City + Gender + Employment:Worktype +      
                      Employment:Holiday, start = rep(0,9) )
summary(tobitreg)

Call:
censReg(formula = Happiness ~ City + Gender + Employment:Worktype + 
    Employment:Holiday, start = rep(0, 9))

Observations:
         Total  Left-censored     Uncensored Right-censored 
             8              2              6              0 

Coefficients:
                               Estimate Std. error t value Pr(> t)
(Intercept)                      38.666        Inf       0       1
CityNew York                    -50.669        Inf       0       1
Gender1                        -360.633        Inf       0       1
Employment:WorktypeUnemployed     0.000        Inf       0       1
Employment:WorktypeBluecolor    345.674        Inf       0       1
Employment:WorktypeWhitecolor    56.210        Inf       0       1
Employment:Holiday1 day a week  346.091        Inf       0       1
Employment:Holiday2 day a week   55.793        Inf       0       1
logSigma                          1.794        Inf       0       1

Newton-Raphson maximisation, 141 iterations
Return code 1: gradient close to zero
Log-likelihood: -19.35431 on 9 Df

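Even though the error message goes away, the results are not reliable:

Std. error = Inf
gradient close to zero: there is no unique optimum, the solutions form a hyperplane

An NA coefficient in the regression means that the coefficient is linearly dependent on the others, so you need to drop some of them to get a unique solution.

As you suspected, the cause is that Worktype = Unemployed occurs only when Employment = 0, so the model cannot estimate the coefficient for Employment:WorktypeUnemployed. The Employment:Holiday coefficients have the same problem.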
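
The linear dependence can be checked directly on the design matrix. The sketch below again uses a hypothetical data frame d (invented for illustration, not the asker's data) with the same variable names and levels. It looks for an all-zero column, tests whether the Worktype interaction columns and the Holiday interaction columns add up to the same vector, and compares the number of columns with the matrix rank:

```r
# Hypothetical toy regressors: invented for illustration, NOT the asker's data.
d <- data.frame(
  City       = factor(c("Chicago", "New York", "Chicago", "Chicago",
                        "New York", "Chicago", "New York", "New York")),
  Gender     = factor(c(0, 1, 0, 1, 0, 1, 0, 1)),
  Employment = c(0, 0, 1, 1, 1, 1, 1, 1),
  Worktype   = factor(c("Unemployed", "Unemployed", "Bluecolor", "Bluecolor",
                        "Bluecolor", "Whitecolor", "Whitecolor", "Whitecolor"),
                      levels = c("Unemployed", "Bluecolor", "Whitecolor")),
  Holiday    = factor(c("Unemployed", "Unemployed", "1 day a week",
                        "2 day a week", "1 day a week", "2 day a week",
                        "1 day a week", "2 day a week"),
                      levels = c("Unemployed", "1 day a week", "2 day a week"))
)

# Design matrix for the right-hand side of the model in the question.
X <- model.matrix(~ City + Gender + Employment:Worktype + Employment:Holiday,
                  data = d)

# Columns that are zero in every row carry no information at all.
zero_cols <- colnames(X)[colSums(abs(X)) == 0]

# Both groups of interaction columns sum to Employment, so one column
# in the second group is a linear combination of the others.
wt_sum  <- rowSums(X[, grep("^Employment:Worktype", colnames(X)), drop = FALSE])
hol_sum <- rowSums(X[, grep("^Employment:Holiday",  colnames(X)), drop = FALSE])

print(zero_cols)
print(all(wt_sum == hol_sum))
print(c(columns = ncol(X), rank = qr(X)$rank))
```

A rank below the number of columns means the likelihood is flat along some directions in coefficient space, which is consistent with the infinite standard errors reported above.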

So I am afraid the regression model you are trying to estimate has no unique optimal solution.

If you drop the linked variables, it works:

tobitreg <- censReg(formula = Happiness ~ City + Gender + Employment )
summary(tobitreg)
Call:
censReg(formula = Happiness ~ City + Gender + Employment)

Observations:
         Total  Left-censored     Uncensored Right-censored 
             8              2              6              0 

Coefficients:
             Estimate Std. error t value  Pr(> t)    
(Intercept)   38.6141     5.7188   6.752 1.46e-11 ***
CityNew York -50.1813     6.4885  -7.734 1.04e-14 ***
Gender1      -70.3859     8.2943  -8.486  < 2e-16 ***
Employment   111.5672    10.0927  11.054  < 2e-16 ***
logSigma       1.7930     0.2837   6.320 2.61e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Newton-Raphson maximisation, 8 iterations
Return code 1: gradient close to zero
Log-likelihood: -19.36113 on 5 Df
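
Before handing a formula to censReg, it may help to check whether its design matrix has full column rank. The following is a sketch under stated assumptions: aliased_columns is a hypothetical helper written for this illustration (it is not part of censReg or maxLik), and d is invented toy data, not the asker's data. It uses base R's pivoted QR decomposition to list the columns that are linearly dependent on earlier ones:

```r
# Hypothetical toy regressors: invented for illustration, NOT the asker's data.
d <- data.frame(
  City       = factor(c("Chicago", "New York", "Chicago", "Chicago",
                        "New York", "Chicago", "New York", "New York")),
  Gender     = factor(c(0, 1, 0, 1, 0, 1, 0, 1)),
  Employment = c(0, 0, 1, 1, 1, 1, 1, 1),
  Worktype   = factor(c("Unemployed", "Unemployed", "Bluecolor", "Bluecolor",
                        "Bluecolor", "Whitecolor", "Whitecolor", "Whitecolor"),
                      levels = c("Unemployed", "Bluecolor", "Whitecolor")),
  Holiday    = factor(c("Unemployed", "Unemployed", "1 day a week",
                        "2 day a week", "1 day a week", "2 day a week",
                        "1 day a week", "2 day a week"),
                      levels = c("Unemployed", "1 day a week", "2 day a week"))
)

# Hypothetical helper: names of design-matrix columns that the pivoted QR
# decomposition moves past the numerical rank (i.e. aliased columns).
aliased_columns <- function(rhs, data) {
  X  <- model.matrix(rhs, data = data)
  qx <- qr(X)
  dropped <- qx$pivot[seq_len(ncol(X)) > qx$rank]
  colnames(X)[sort(dropped)]
}

full    <- aliased_columns(~ City + Gender + Employment:Worktype +
                             Employment:Holiday, d)
reduced <- aliased_columns(~ City + Gender + Employment, d)

print(full)     # aliased columns in the original specification
print(reduced)  # empty if the reduced specification has full rank
```

An empty result for a formula suggests that the OLS start values will contain no NA, so censReg can be called without a manual start argument.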

Singularity error when running a Tobit regression
