Question

I ran a basic logistic regression in both R and Julia. Even though I used the same data, I got different results. I used the following code:

R:

glm(Yi ~ welfare + married + college + agestar + smokernew + wprestar, 
    data=glm_data, family=binomial())

R Output:
Coefficients:
             Estimate Std. Error z value Pr(>|z|)  
(Intercept)  -2.44746    1.02790  -2.381   0.0173 *
welfare     -13.90825  554.61491  -0.025   0.9800  
married      -0.45701    0.37610  -1.215   0.2243  
college      -0.91454    0.54504  -1.678   0.0934 .
agestar       0.07857    0.13986   0.562   0.5743  
smokernew     0.78939    0.45357   1.740   0.0818 .
wprestar     -0.27257    0.11423  -2.386   0.0170 *

Julia:

glm(Yi ~ welfare + married + college + agestar + smokernew + wprestar,
    glm_data, Binomial(), LogitLink())

Julia Output:
Coefficients:
              Estimate Std.Error   z value Pr(>|z|)
(Intercept)   -2.44746    1.0279  -2.38104   0.0173
welfare       -9.90825   75.0597 -0.132005   0.8950
married      -0.457005  0.376097  -1.21513   0.2243
college      -0.914541  0.545042  -1.67793   0.0934
agestar      0.0785672  0.139856  0.561774   0.5743
smokernew     0.789386  0.453571   1.74038   0.0818
wprestar      -0.27257  0.114234  -2.38605   0.0170

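Why?

All coefficients are identical except the one for the welfare variable, where both the estimate and the standard error differ. I checked my data frames, and they are exactly the same.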

Answer 1

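Without seeing your data, I would guess that the response classes are completely, or almost completely, separated on your welfare variable. An estimate of about ±13 on the logit scale is essentially ±infinity, which corresponds to estimated probabilities of zero or one. Julia's estimate of -9.9 is essentially the same thing, except that it probably stopped iterating a little earlier and therefore returned a slightly smaller stand-in for infinity.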

This is called the Hauck-Donner phenomenon, and you can find questions about it on CrossValidated.com (the statistics/ML StackExchange site).
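
The early-stopping explanation can be illustrated with a minimal sketch. It is written in plain Python rather than R or Julia, and it is not the algorithm of R's glm or Julia's GLM package: the toy data, the function names (score_and_information, deviance, fit_logistic) and the tolerances are all assumptions made for illustration. The stopping rule is a relative change in deviance, loosely modeled on the kind of criterion such fitters use. Every observation with x = 1 has y = 0, so the slope has no finite maximum likelihood estimate.

```python
import math

def score_and_information(b0, b1, xs, ys):
    """Score vector and entries of X'WX for logit P(y=1) = b0 + b1*x."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11

def deviance(b0, b1, xs, ys):
    """Binomial deviance (-2 * log-likelihood) for 0/1 responses."""
    return -2.0 * sum(
        y * (b0 + b1 * x) - math.log1p(math.exp(b0 + b1 * x))
        for x, y in zip(xs, ys)
    )

def fit_logistic(xs, ys, tol, max_iter=100):
    """Newton-Raphson; stops when the relative deviance change is below tol."""
    b0 = b1 = 0.0
    dev_old = math.inf
    for iteration in range(1, max_iter + 1):
        g0, g1, h00, h01, h11 = score_and_information(b0, b1, xs, ys)
        det = h00 * h11 - h01 * h01
        # Newton step: beta += (X'WX)^-1 * score, with the 2x2 inverse written out.
        b0, b1 = (b0 + (h11 * g0 - h01 * g1) / det,
                  b1 + (h00 * g1 - h01 * g0) / det)
        dev = deviance(b0, b1, xs, ys)
        if abs(dev - dev_old) / (abs(dev) + 0.1) < tol:
            break
        dev_old = dev
    # Wald standard error of the slope from the information at the final estimate.
    _, _, h00, h01, h11 = score_and_information(b0, b1, xs, ys)
    se_slope = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return {"intercept": b0, "slope": b1,
            "se_slope": se_slope, "iterations": iteration}

# Hypothetical data: 10 observations with x = 0 (3 successes),
# 5 observations with x = 1 (no successes at all).
xs = [0] * 10 + [1] * 5
ys = [1] * 3 + [0] * 7 + [0] * 5

loose = fit_logistic(xs, ys, tol=1e-4)  # stops earlier
tight = fit_logistic(xs, ys, tol=1e-8)  # stops later
print(loose)
print(tight)
```

In this toy model both runs agree on the intercept, while the tighter tolerance simply lets the slope drift further toward minus infinity and inflates its standard error. The two fitted models give practically the same predicted probabilities, which is why differing estimates for a separated variable are not a sign that either program is wrong.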

Different GLM results (R vs. Julia)