I am trying to estimate a standard tobit model that is censored at zero.
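The variables are:
Dependent variable: Happiness
Independent variables:
The 'Worktype' and 'Holiday' variables are interacted with the 'Employment' variable.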
I ran the regression with the censReg package.
censReg(Happiness ~ City + Gender + Employment:Worktype + Employment:Holiday)
But summary() returns the following error.
Error in printCoefmat(coef(x, logSigma = logSigma), digits = digits) :
'x' must be coefficient matrix/data frame
To find the cause, I ran an OLS regression.
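There are some NA values, which I think come from the model design and the way the variables are set up. Some variables appear to be singular: people with 'Employment' = 0 have 'Worktype' = Unemployed and 'Holiday' = Unemployed. Could that be the cause?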
How can I ignore the NA values and run the tobit regression without an error?
Reproducible code is below.
lm(Happiness ~ City + Gender + Employment:Worktype + Employment:Holiday)
Coefficients: (2 not defined because of singularities)
                               Estimate Std. Error t value Pr(>|t|)
(Intercept)                      41.750      9.697   4.305   0.0499 *
CityNew York                    -44.500     11.197  -3.974   0.0579 .
Gender1                           2.750     14.812   0.186   0.8698
Employment:WorktypeUnemployed        NA         NA      NA       NA
Employment:WorktypeBluecolor     35.000     17.704   1.977   0.1867
Employment:WorktypeWhitecolor   102.750     14.812   6.937   0.0202 *
Employment:Holiday1 day a week  -70.000     22.394  -3.126   0.0889 .
Employment:Holiday2 day a week       NA         NA      NA       NA
Answer 0 (score: 3)
If you step through the censReg call in a debugger, you reach the following maxLik optimization:
result <- maxLik(censRegLogLikCross, start = start,
yVec = yVec, xMat = xMat, left = left, right = right,
obsBelow = obsBelow, obsBetween = obsBetween, obsAbove = obsAbove,
...)
The vector of initial values, start, is determined by an OLS regression and therefore contains the two NA coefficients you already found.
This causes maxLik to return NULL with the error message:
Return code 100: Initial value out of range.
The summary function then receives this NULL, which explains the final error message you got.
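To see where the NA starting values come from, here is a minimal sketch in base R. The data frame d is a made-up toy data set (an assumption, not the asker's data); it only mimics the structure described in the question, where every unemployed row carries the level Unemployed in both Worktype and Holiday.

```r
# Hypothetical toy data (NOT the data from the question).
d <- data.frame(
  Happiness  = c(0, 0, 35, 60, 80, 20, 95, 50),
  City       = factor(rep(c("Chicago", "New York"), 4)),
  Gender     = factor(c(0, 1, 0, 1, 1, 0, 0, 1)),
  Employment = c(0, 0, 1, 1, 1, 1, 1, 1),
  Worktype   = factor(
    c("Unemployed", "Unemployed", "Bluecolor", "Bluecolor",
      "Whitecolor", "Whitecolor", "Bluecolor", "Whitecolor"),
    levels = c("Unemployed", "Bluecolor", "Whitecolor")),
  Holiday    = factor(
    c("Unemployed", "Unemployed", "1 day a week", "2 day a week",
      "1 day a week", "2 day a week", "2 day a week", "1 day a week"),
    levels = c("Unemployed", "1 day a week", "2 day a week"))
)

# Same formula as in the question, fitted by OLS.
ols <- lm(Happiness ~ City + Gender + Employment:Worktype + Employment:Holiday,
          data = d)

# These coefficients are what an OLS-based start vector would contain.
start_ols <- coef(ols)

# Names of the coefficients that OLS could not estimate.
na_terms <- names(start_ols)[is.na(start_ols)]
print(na_terms)
```

Any NA in such a vector is not a valid starting point for a numerical optimizer, which is consistent with the "Initial value out of range" return code above.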
To get around this, you can set the start parameter yourself:
tobitreg <- censReg(formula = Happiness ~ City + Gender + Employment:Worktype +
Employment:Holiday, start = rep(0,9) )
summary(tobitreg)
Call:
censReg(formula = Happiness ~ City + Gender + Employment:Worktype +
Employment:Holiday, start = rep(0, 9))
Observations:
         Total  Left-censored     Uncensored Right-censored
             8              2              6              0
Coefficients:
                               Estimate Std. error t value Pr(> t)
(Intercept)                      38.666        Inf       0       1
CityNew York                    -50.669        Inf       0       1
Gender1                        -360.633        Inf       0       1
Employment:WorktypeUnemployed     0.000        Inf       0       1
Employment:WorktypeBluecolor    345.674        Inf       0       1
Employment:WorktypeWhitecolor    56.210        Inf       0       1
Employment:Holiday1 day a week  346.091        Inf       0       1
Employment:Holiday2 day a week   55.793        Inf       0       1
logSigma                          1.794        Inf       0       1
Newton-Raphson maximisation, 141 iterations
Return code 1: gradient close to zero
Log-likelihood: -19.35431 on 9 Df
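Even though the error message is gone, the results are not reliable: every standard error is Inf.
An NA coefficient in the OLS regression means that the corresponding column is linearly dependent on the others, so you need to drop some of them to obtain a unique solution.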
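As you suspected, the cause is that Worktype = Unemployed only occurs when Employment = 0. The interaction column Employment:WorktypeUnemployed is therefore zero for every observation, and the model cannot estimate its coefficient. The Employment:Holiday coefficients have the same problem.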
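The linear dependence can be made visible by inspecting the design matrix directly. The sketch below again uses a made-up toy data frame d (an assumption, not the asker's data) with the same structure: unemployed rows have Employment = 0 and the level Unemployed in both factors.

```r
# Hypothetical toy data (NOT the data from the question).
d <- data.frame(
  City       = factor(rep(c("Chicago", "New York"), 4)),
  Gender     = factor(c(0, 1, 0, 1, 1, 0, 0, 1)),
  Employment = c(0, 0, 1, 1, 1, 1, 1, 1),
  Worktype   = factor(
    c("Unemployed", "Unemployed", "Bluecolor", "Bluecolor",
      "Whitecolor", "Whitecolor", "Bluecolor", "Whitecolor"),
    levels = c("Unemployed", "Bluecolor", "Whitecolor")),
  Holiday    = factor(
    c("Unemployed", "Unemployed", "1 day a week", "2 day a week",
      "1 day a week", "2 day a week", "2 day a week", "1 day a week"),
    levels = c("Unemployed", "1 day a week", "2 day a week"))
)

# Design matrix for the right-hand side of the model in the question.
X <- model.matrix(~ City + Gender + Employment:Worktype + Employment:Holiday,
                  data = d)

# Columns that are zero in every row carry no information at all.
zero_cols <- colnames(X)[colSums(abs(X)) == 0]

# Numerical rank of the design matrix versus its number of columns.
design_rank <- qr(X)$rank
cat("columns:", ncol(X), " rank:", design_rank, "\n")
print(zero_cols)

# Second dependence: both pairs of interaction columns add up to Employment,
# so one Holiday column is a combination of the other columns.
holiday_sum  <- X[, "Employment:Holiday1 day a week"] +
                X[, "Employment:Holiday2 day a week"]
worktype_sum <- X[, "Employment:WorktypeBluecolor"] +
                X[, "Employment:WorktypeWhitecolor"]
same_sum <- all(holiday_sum == worktype_sum)
```

With this toy data the matrix has two more columns than its rank, which matches the two NA coefficients in the OLS output.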
So I am afraid the regression model you are trying to estimate has no unique optimal solution.
If you drop the interacted variables, it works:
tobitreg <- censReg(formula = Happiness ~ City + Gender + Employment )
summary(tobitreg)
Call:
censReg(formula = Happiness ~ City + Gender + Employment)
Observations:
         Total  Left-censored     Uncensored Right-censored
             8              2              6              0
Coefficients:
             Estimate Std. error t value  Pr(> t)
(Intercept)   38.6141     5.7188   6.752 1.46e-11 ***
CityNew York -50.1813     6.4885  -7.734 1.04e-14 ***
Gender1      -70.3859     8.2943  -8.486  < 2e-16 ***
Employment   111.5672    10.0927  11.054  < 2e-16 ***
logSigma       1.7930     0.2837   6.320 2.61e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Newton-Raphson maximisation, 8 iterations
Return code 1: gradient close to zero
Log-likelihood: -19.36113 on 5 Df
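If you would rather keep the interaction terms that can be estimated, one possible approach is to drop only the aliased columns from the design matrix. The following is a sketch under assumptions: d is a made-up toy data frame (not the asker's data), and X_red is a hypothetical name for the reduced matrix. It only checks that the reduced design has full column rank and yields finite OLS coefficients; whether censReg then converges on real data is not verified here.

```r
# Hypothetical toy data (NOT the data from the question).
d <- data.frame(
  Happiness  = c(0, 0, 35, 60, 80, 20, 95, 50),
  City       = factor(rep(c("Chicago", "New York"), 4)),
  Gender     = factor(c(0, 1, 0, 1, 1, 0, 0, 1)),
  Employment = c(0, 0, 1, 1, 1, 1, 1, 1),
  Worktype   = factor(
    c("Unemployed", "Unemployed", "Bluecolor", "Bluecolor",
      "Whitecolor", "Whitecolor", "Bluecolor", "Whitecolor"),
    levels = c("Unemployed", "Bluecolor", "Whitecolor")),
  Holiday    = factor(
    c("Unemployed", "Unemployed", "1 day a week", "2 day a week",
      "1 day a week", "2 day a week", "2 day a week", "1 day a week"),
    levels = c("Unemployed", "1 day a week", "2 day a week"))
)

f <- Happiness ~ City + Gender + Employment:Worktype + Employment:Holiday
X <- model.matrix(f, data = d)

# OLS on the full design marks aliased columns with NA coefficients.
keep <- !is.na(lm.fit(X, d$Happiness)$coefficients)

# Reduced design: only the columns OLS could estimate.
X_red <- X[, keep, drop = FALSE]

# The reduced design should have full column rank and no NA coefficients.
full_rank <- qr(X_red)$rank == ncol(X_red)
ols_red   <- lm.fit(X_red, d$Happiness)
print(colnames(X_red))
print(ols_red$coefficients)
```

A model built from the kept columns would need a start vector with one entry per kept column plus one for logSigma, in the same way that start = rep(0, 9) above covers eight coefficients plus logSigma.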