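I am attempting a direct replication of a Heckman model. The original data and model were run in Stata, but I use R, so I translated the code to R. I made no changes to the replication data as provided, and I converted the Heckman model line from the .do file into the format used by the sampleSelection package in R. Below are the line from the .do file (top) and the R code I used (bottom).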
heckprob recip3 polity2_s lntpop_t regime territory vetoplayers_t military_t allybalance powerbalance contig, select(demand=polity2_s lntpop_t vetoplayers_t military_t powerbalance allybalance cinc_s syscon contig peaceyrs _prefail _spline1 _spline2 _spline3)
orig.rep <- selection(selection=demand ~ polity2_s + lntpop_t + vetoplayers_t + military_t + powerbalance + allybalance + cinc_s + syscon + contig + peaceyrs + prefail + spline1 + spline2 + spline3, outcome=recip3 ~ polity2_s + lntpop_t + regime + territory + vetoplayers_t + military_t + allybalance + powerbalance + contig, data=mafordham, method="2step")
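One difference between the two calls that may be worth checking: Stata's heckprob fits a probit outcome equation with sample selection, whereas selection() with method="2step" fits a linear outcome equation (the R output header reads "Tobit 2 model"). Below is a minimal sketch, on simulated data rather than the replication file, of how a binary-outcome selection model might be requested in sampleSelection. It assumes, from my reading of the package documentation, that a two-valued outcome is fitted as a binary-outcome model under method = "ml"; that should be verified against ?selection for the installed version. The names sim, sel, y, x1, x2 and z are hypothetical.

```r
# Sketch only: simulated data stand in for the replication file.
# All names here (sim, sel, y, x1, x2, z) are hypothetical.
set.seed(1)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
z  <- rnorm(n)                               # enters the selection equation only
u  <- rnorm(n)                               # selection-equation error
e  <- -0.5 * u + sqrt(1 - 0.5^2) * rnorm(n)  # outcome error, corr(u, e) = -0.5

sim <- data.frame(
  sel = as.integer(0.5 * x1 + z + u > 0),    # 1 = outcome is observed
  y   = as.integer(0.3 + 0.8 * x2 + e > 0),  # binary (0/1) outcome
  x1  = x1,
  x2  = x2,
  z   = z
)

# Assumption: with a two-valued outcome and method = "ml", selection()
# fits a probit outcome equation. Check ?selection before relying on this.
if (requireNamespace("sampleSelection", quietly = TRUE)) {
  fit <- sampleSelection::selection(
    selection = sel ~ x1 + z,
    outcome   = y ~ x2,
    data      = sim,
    method    = "ml"
  )
  print(summary(fit))
}
```

If that assumption holds, the analogous change to the call above would be to keep the same two formulas and switch method to "ml", then check that the summary header no longer reports a Tobit 2 model.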
The data are the same, unchanged, with the same number of observations, yet the model produces the following results:
Tobit 2 model (sample selection model)
2-step Heckman / heckit estimation
51363 observations (50937 censored and 426 observed)
28 free parameters (df = 51336)
Probit selection equation:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.423e+00 Inf 0 1
polity2_s -4.841e-03 Inf 0 1
lntpop_t 9.461e-02 Inf 0 1
vetoplayers_t -3.550e-01 Inf 0 1
military_t -1.163e-01 Inf 0 1
powerbalance 2.393e-01 Inf 0 1
allybalance 1.180e-01 Inf 0 1
cinc_s 2.175e+00 Inf 0 1
syscon 9.013e-01 Inf 0 1
contig 5.577e-01 Inf 0 1
peaceyrs -6.177e-02 Inf 0 1
prefail 6.056e-02 Inf 0 1
spline1 -1.611e-04 Inf 0 1
spline2 6.219e-05 Inf 0 1
spline3 -9.030e-07 Inf 0 1
Outcome equation:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.023518 NA NA NA
polity2_s -0.003224 NA NA NA
lntpop_t -0.017170 NA NA NA
regime 0.230042 NA NA NA
territory 0.302224 NA NA NA
vetoplayers_t -0.158959 NA NA NA
military_t -0.179308 NA NA NA
allybalance -0.073449 NA NA NA
powerbalance 0.262756 NA NA NA
contig -0.173194 NA NA NA
Multiple R-Squared:0.1509, Adjusted R-Squared:0.1304
Error terms:
Estimate Std. Error t value Pr(>|t|)
invMillsRatio -0.1746 NA NA NA
sigma 0.4821 NA NA NA
rho -0.3621 NA NA NA
--------------------------------------------
If I use method "ml" instead of "2step", I get the following:
Tobit 2 model (sample selection model)
2-step Heckman / heckit estimation
51363 observations (50937 censored and 426 observed)
28 free parameters (df = 51336)
Probit selection equation:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.423e+00 Inf 0 1
polity2_s -4.841e-03 Inf 0 1
lntpop_t 9.461e-02 Inf 0 1
vetoplayers_t -3.550e-01 Inf 0 1
military_t -1.163e-01 Inf 0 1
powerbalance 2.393e-01 Inf 0 1
allybalance 1.180e-01 Inf 0 1
cinc_s 2.175e+00 Inf 0 1
syscon 9.013e-01 Inf 0 1
contig 5.577e-01 Inf 0 1
peaceyrs -6.177e-02 Inf 0 1
prefail 6.056e-02 Inf 0 1
spline1 -1.611e-04 Inf 0 1
spline2 6.219e-05 Inf 0 1
spline3 -9.030e-07 Inf 0 1
Outcome equation:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.023518 Inf 0 1
polity2_s -0.003224 Inf 0 1
lntpop_t -0.017170 Inf 0 1
regime 0.230042 Inf 0 1
territory 0.302224 Inf 0 1
vetoplayers_t -0.158959 Inf 0 1
military_t -0.179308 Inf 0 1
allybalance -0.073449 Inf 0 1
powerbalance 0.262756 Inf 0 1
contig -0.173194 Inf 0 1
Multiple R-Squared:0.1509, Adjusted R-Squared:0.1304
Error terms:
Estimate Std. Error t value Pr(>|t|)
invMillsRatio -0.1746 Inf 0 1
sigma 0.4821 Inf 0 1
rho -0.3621 Inf 0 1
--------------------------------------------
What is going on here?
Stata output:
. use "MIDanalysis.dta"
. heckprob recip3 polity2_s lntpop_t regime territory vetoplayers_t military_t allybalance powerbalance contig, select(demand=polity2
> _s lntpop_t vetoplayers_t military_t powerbalance allybalance cinc_s syscon contig peaceyrs _prefail _spline1 _spline2 _spline3)
Fitting probit model:
Iteration 0: log likelihood = -288.82075
Iteration 1: log likelihood = -258.7276
Iteration 2: log likelihood = -258.6303
Iteration 3: log likelihood = -258.6303
Fitting selection model:
Iteration 0: log likelihood = -2465.7202
Iteration 1: log likelihood = -2136.8238
Iteration 2: log likelihood = -2021.4598
Iteration 3: log likelihood = -2020.9517
Iteration 4: log likelihood = -2020.9514
Iteration 5: log likelihood = -2020.9514
Comparison: log likelihood = -2279.5817
Fitting starting values:
Iteration 0: log likelihood = -295.2807
Iteration 1: log likelihood = -254.98637
Iteration 2: log likelihood = -254.88936
Iteration 3: log likelihood = -254.88936
Fitting full model:
Iteration 0: log likelihood = -2276.3311
Iteration 1: log likelihood = -2275.5899
Iteration 2: log likelihood = -2275.5896
Iteration 3: log likelihood = -2275.5896
Probit model with sample selection Number of obs = 51,363
Censored obs = 50,937
Uncensored obs = 426
Wald chi2(9) = 58.42
Log likelihood = -2275.59 Prob > chi2 = 0.0000
-------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
--------------+----------------------------------------------------------------
recip3 |
polity2_s | -.0086106 .0090562 -0.95 0.342 -.0263605 .0091392
lntpop_t | -.0583329 .0660009 -0.88 0.377 -.1876922 .0710265
regime | .5258624 .2178492 2.41 0.016 .0988857 .9528391
territory | .7307356 .1494662 4.89 0.000 .4377872 1.023684
vetoplayers_t | -.4393531 .3109102 -1.41 0.158 -1.048726 .1700197
military_t | -.4793108 .1577225 -3.04 0.002 -.7884412 -.1701804
allybalance | -.1647296 .2941523 -0.56 0.575 -.7412575 .4117984
powerbalance | .6641183 .5484297 1.21 0.226 -.4107841 1.739021
contig | -.4881928 .1589762 -3.07 0.002 -.7997805 -.1766052
_cons | 1.641254 .8420642 1.95 0.051 -.0091617 3.291669
--------------+----------------------------------------------------------------
demand |
polity2_s | -.0049243 .0027605 -1.78 0.074 -.0103347 .0004861
lntpop_t | .0950719 .0207098 4.59 0.000 .0544814 .1356624
vetoplayers_t | -.354703 .1021106 -3.47 0.001 -.5548362 -.1545699
military_t | -.1143112 .048599 -2.35 0.019 -.2095636 -.0190588
powerbalance | .2545916 .2113132 1.20 0.228 -.1595747 .6687579
allybalance | .1199024 .1016188 1.18 0.238 -.0792667 .3190715
cinc_s | 2.231212 .3110974 7.17 0.000 1.621472 2.840952
syscon | .8490102 .4979293 1.71 0.088 -.1269132 1.824934
contig | .5622448 .0494747 11.36 0.000 .465276 .6592135
peaceyrs | -.0621602 .0089328 -6.96 0.000 -.0796681 -.0446523
_prefail | .0587419 .0083808 7.01 0.000 .0423158 .075168
_spline1 | -.0001617 .0000436 -3.71 0.000 -.000247 -.0000763
_spline2 | .0000621 .0000236 2.63 0.009 .0000158 .0001085
_spline3 | -7.19e-07 4.33e-06 -0.17 0.868 -9.21e-06 7.77e-06
_cons | -3.417173 .2519663 -13.56 0.000 -3.911018 -2.923329
--------------+----------------------------------------------------------------
/athrho | -.547384 .1869418 -2.93 0.003 -.9137831 -.1809849
--------------+----------------------------------------------------------------
rho | -.498557 .1404757 -.7229431 -.1790343
-------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0): chi2(1) = 7.98 Prob > chi2 = 0.0047
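For reference when comparing the two programs: Stata estimates the error correlation on the transformed scale /athrho and reports rho as its hyperbolic tangent. A quick check in R, using only values printed in the outputs above (the variable names are mine, for illustration):

```r
# Values copied from the outputs above
athrho_stata <- -0.547384  # Stata: /athrho
rho_stata    <- -0.498557  # Stata: rho
rho_r        <- -0.3621    # R (sampleSelection): rho

# Stata's rho is the hyperbolic tangent of athrho
rho_from_athrho <- tanh(athrho_stata)
print(rho_from_athrho)

# Gap between the two programs' estimates of rho
print(rho_r - rho_stata)
```

The transformation reproduces the rho that Stata reports, so the gap to the R value comes from the two fits themselves rather than from the scale on which rho is reported.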