statsmodels logit results are very different from R and Stata

Time: 2015-07-15 17:30:39

Tags: logistic-regression statsmodels

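I am fitting a logistic regression that uses SAT scores to predict a binary outcome; the bivariate correlation is 0.17. Stata and R (aod package) both give a logit coefficient of 0.004, but statsmodels (Python) gives -0.0013 (I have tried both MLE and IRLS). There is no missing data, and the number of observations is exactly the same on all three platforms; the same .csv file is used in each case.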

R:

Call:
glm(formula = df$outcome ~ df$sat, family = "binomial", data = df)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.7527  -0.5911  -0.4778  -0.3406   3.0509  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept) -7.758e+00  6.274e-02  -123.7   <2e-16 ***
df$sat       4.151e-03  4.351e-05    95.4   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 257024  on 334357  degrees of freedom
Residual deviance: 245878  on 334356  degrees of freedom
AIC: 245882

Number of Fisher Scoring iterations: 5

Stata:

. logit outcome sat

Iteration 0:   log likelihood = -128512.03  
Iteration 1:   log likelihood = -123233.13  
Iteration 2:   log likelihood = -122939.88  
Iteration 3:   log likelihood =  -122939.1  
Iteration 4:   log likelihood =  -122939.1  

Logistic regression                             Number of obs     =    334,358
                                                LR chi2(1)        =   11145.86
                                                Prob > chi2       =     0.0000
Log likelihood =  -122939.1                     Pseudo R2         =     0.0434

------------------------------------------------------------------------------
     outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sat |   .0041509   .0000435    95.40   0.000     .0040656    .0042362
       _cons |   -7.75775   .0627402  -123.65   0.000    -7.880719   -7.634782

statsmodels:

Optimization terminated successfully.
         Current function value: 0.399258
         Iterations 5
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                outcome   No. Observations:               334358
Model:                          Logit   Df Residuals:                   334357
Method:                           MLE   Df Model:                            0
Date:                Wed, 15 Jul 2015   Pseudo R-squ.:                -0.03878
Time:                        13:09:47   Log-Likelihood:            -1.3350e+05
converged:                       True   LL-Null:                   -1.2851e+05
                                        LLR p-value:                     1.000
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
sat           -0.0013   3.69e-06   -363.460      0.000        -0.001    -0.001


==============================================================================
                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:                outcome   No. Observations:               334358
Model:                            GLM   Df Residuals:                   334357
Model Family:                Binomial   Df Model:                            0
Link Function:                  logit   Scale:                             1.0
Method:                          IRLS   Log-Likelihood:            -1.3350e+05
Date:                Wed, 15 Jul 2015   Deviance:                   2.6699e+05
Time:                        13:09:48   Pearson chi2:                 3.50e+05
No. Iterations:                     7                                         
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
sat           -0.0013   3.69e-06   -363.460      0.000        -0.001    -0.001
==============================================================================

0 answers:

No answers