Reproducing Stata code in R: binomial GLM

Time: 2013-11-20 10:59:58

Tags: r stata glm

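I am not a Stata user, so I am trying to reproduce in R some Stata results that were given to me. I want to fit a GLM with a complementary log-log link. My Stata code is: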

  

glm c IndA fia, family(binomial s) link(cloglog) offset(offset)

The R code is:

glmt <- glm(data=dataset, c ~ IndA + fia, offset = offset, 
            family = binomial(link = cloglog))

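This produces results that differ from the Stata output. I think the difference is the variable s, which is included in Stata's binomial family specification, family(binomial s). How do I incorporate this variable into the R code?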

My dataset looks like this:

IndA s c itot fia offset

1 23 0 61 0.442622951 -0.494296322
1 25 0 58 0.431034483 -0.544727175
1 27 0 59 0.389830508 -0.527632742
1 31 3 51 0.37254902 -0.673344553
1 28 2 53 0.41509434 -0.634878272
1 26 0 55 0.436363636 -0.597837001

...

Here IndA is a dummy variable that becomes 0 later in the dataset, and c is the difference in s (n - (n + 1)).

The R output looks like this:

    Call:
        glm(formula = c ~ IndA + fia, family = binomial(link = cloglog), 
        data = dataset, offset = offset)

    Deviance Residuals: 
        Min       1Q   Median       3Q      Max  
     -1.2697  -0.9707  -0.8304   1.3688   1.6390  

     Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
     (Intercept)   1.9633     1.9185   1.023    0.306
     IndA         -0.3174     0.3357  -0.945    0.344
     fia          -5.1155     4.8163  -1.062    0.288

    (Dispersion parameter for binomial family taken to be 1)

    Null deviance: 136.81  on 101  degrees of freedom
    Residual deviance: 134.71  on  99  degrees of freedom
    AIC: 140.71

    Number of Fisher Scoring iterations: 5

The Stata output is a bit messy, but it looks like this:

    . glm c IndA fia, family(binomial s ) link(cloglog) offset(offset)

    Iteration 0: log likelihood = -144.17967
    Iteration 1: log likelihood = -133.66053
    Iteration 2: log likelihood = -133.58996
    Iteration 3: log likelihood = -133.58992
    Iteration 4: log likelihood = -133.58992

    Generalized linear models No. of obs = 102
    Optimization : ML Residual df = 99
    Scale parameter = 1
    Deviance = 179.1806126 (1/df) Deviance = 1.809905
    Pearson = 203.965157 (1/df) Pearson = 2.060254
    Variance function: V(u) = u*(1-u/s) [Binomial]
    Link function : g(u) = ln(-ln(1-u/s)) [Complementary log-log]
    AIC = 2.678234
    Log likelihood = -133.5899239 BIC = -278.6917
    OIM

    c Coef. Std. Err. z P>|z| [95% Conf. Interval]
    IndA -.7284992 .2308676 -3.16 0.002 -1.180991 -.2760071
    fia -7.147842 3.185532 -2.24 0.025 -13.39137 -.9043128
    _cons .4404201 1.265651 0.35 0.728 -2.040211 2.921051
    offset (offset)

This output is for the whole dataset:

IndA    s   c   itot    fia offset  
1   23  0   61  0.442622951 -0.494296322    
1   25  0   58  0.431034483 -0.544727175    
1   27  0   59  0.389830508 -0.527632742    
1   31  3   51  0.37254902  -0.673344553    
1   28  2   53  0.41509434  -0.634878272    
1   26  0   55  0.436363636 -0.597837001    
1   26  0   52  0.461538462 -0.653926467    
1   27  0   53  0.433962264 -0.634878272    
1   29  1   50  0.42    -0.693147181    
1   28  0   52  0.423076923 -0.653926467    
1   28  0   56  0.392857143 -0.579818495    
1   30  4   50  0.4 -0.693147181    
1   26  0   57  0.421052632 -0.562118918    
1   26  1   56  0.428571429 -0.579818495    
1   25  0   58  0.431034483 -0.544727175    
1   26  0   56  0.428571429 -0.579818495    
1   29  3   54  0.388888889 -0.616186139    
1   26  3   58  0.413793103 -0.544727175    
1   23  0   62  0.435483871 -0.478035801    
1   23  0   62  0.435483871 -0.478035801    
1   25  0   59  0.423728814 -0.527632742    
1   27  3   54  0.425925926 -0.616186139    
1   24  0   60  0.433333333 -0.510825624    
1   25  0   60  0.416666667 -0.510825624    
1   25  0   60  0.416666667 -0.510825624    
1   26  0   57  0.421052632 -0.562118918    
1   27  0   55  0.418181818 -0.597837001    
1   27  0   53  0.433962264 -0.634878272    
1   27  0   55  0.418181818 -0.597837001    
1   29  0   56  0.375   -0.579818495    
1   31  0   53  0.358490566 -0.634878272    
1   31  0   52  0.365384615 -0.653926467    
1   34  0   50  0.32    -0.693147181    
1   34  1   51  0.31372549  -0.673344553    
1   33  5   55  0.309090909 -0.597837001    
1   28  0   60  0.366666667 -0.510825624    
1   28  1   58  0.379310345 -0.544727175    
1   27  0   58  0.396551724 -0.544727175    
1   28  0   58  0.379310345 -0.544727175    
1   28  1   58  0.379310345 -0.544727175    
1   27  0   59  0.389830508 -0.527632742    
1   27  0   59  0.389830508 -0.527632742    
1   27  0   57  0.403508772 -0.562118918    
1   29  1   53  0.396226415 -0.634878272    
1   28  0   55  0.4 -0.597837001    
1   30  1   54  0.37037037  -0.616186139    
1   29  0   54  0.388888889 -0.616186139    
1   31  1   50  0.38    -0.693147181    
1   30  0   57  0.350877193 -0.562118918    
1   30  4   57  0.350877193 -0.562118918    
1   26  0   61  0.393442623 -0.494296322    
0   16  0   61  0.442622951 -0.494296322    
0   17  3   58  0.431034483 -0.544727175    
0   14  0   59  0.389830508 -0.527632742    
0   18  0   51  0.37254902  -0.673344553    
0   19  0   53  0.41509434  -0.634878272    
0   19  0   55  0.436363636 -0.597837001    
0   22  2   52  0.461538462 -0.653926467    
0   20  0   53  0.433962264 -0.634878272    
0   21  1   50  0.42    -0.693147181    
0   20  4   52  0.423076923 -0.653926467    
0   16  0   56  0.392857143 -0.579818495    
0   20  3   50  0.4 -0.693147181    
0   17  0   57  0.421052632 -0.562118918    
0   18  1   56  0.428571429 -0.579818495    
0   17  0   58  0.431034483 -0.544727175    
0   18  1   56  0.428571429 -0.579818495    
0   17  1   54  0.388888889 -0.616186139    
0   16  1   58  0.413793103 -0.544727175    
0   15  0   62  0.435483871 -0.478035801    
0   15  0   62  0.435483871 -0.478035801    
0   16  0   59  0.423728814 -0.527632742    
0   19  3   54  0.425925926 -0.616186139    
0   16  1   60  0.433333333 -0.510825624    
0   15  0   60  0.416666667 -0.510825624    
0   15  0   60  0.416666667 -0.510825624    
0   17  0   57  0.421052632 -0.562118918    
0   18  0   55  0.418181818 -0.597837001    
0   20  2   53  0.433962264 -0.634878272    
0   18  3   55  0.418181818 -0.597837001    
0   15  0   56  0.375   -0.579818495    
0   16  0   53  0.358490566 -0.634878272    
0   17  1   52  0.365384615 -0.653926467    
0   16  1   50  0.32    -0.693147181    
0   15  3   51  0.31372549  -0.673344553    
0   12  0   55  0.309090909 -0.597837001    
0   12  0   60  0.366666667 -0.510825624    
0   14  0   58  0.379310345 -0.544727175    
0   15  1   58  0.396551724 -0.544727175    
0   14  0   58  0.379310345 -0.544727175    
0   14  0   58  0.379310345 -0.544727175    
0   14  0   59  0.389830508 -0.527632742    
0   14  0   59  0.389830508 -0.527632742    
0   16  0   57  0.403508772 -0.562118918    
0   18  1   53  0.396226415 -0.634878272    
0   17  1   55  0.4 -0.597837001    
0   16  0   54  0.37037037  -0.616186139    
0   17  0   54  0.388888889 -0.616186139    
0   19  6   50  0.38    -0.693147181    
0   13  0   57  0.350877193 -0.562118918    
0   13  0   57  0.350877193 -0.562118918    
0   13  1   61  0.393442623 -0.494296322     

I hope this helps.
Thanks in advance!

1 answer:

Answer 0 (score: 2)

Stata family name

familyname               Description
-------------------------------------------------------------------------
gaussian                 Gaussian (normal)
igaussian                inverse Gaussian
binomial[varnameN|#N]    Bernoulli/binomial
poisson                  Poisson
nbinomial[#k|ml]         negative binomial
gamma                    gamma

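It looks like s is one of the variables in your dataset. Without seeing how your data are structured, it is hard to say where that variable belongs in R's glm.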

For Stata glm examples, see the Stata glm manual, which describes the binomial family as follows:

  

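The binomial distribution can be specified as 1) family(binomial), 2) family(binomial #N), or 3) family(binomial varnameN). In case 2, #N is the value of the binomial denominator N, the number of trials. Specifying family(binomial 1) is the same as specifying family(binomial). In case 3, varnameN is the variable containing the binomial denominator, allowing the number of trials to vary across observations.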
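Given that description, family(binomial s) treats s as the number of trials and c as the number of successes in each row. In R, one way to express a binomial response with a varying denominator is a two-column matrix of successes and failures, cbind(c, s - c). The sketch below is only an illustration under that assumption: it uses twelve rows copied from the dataset in the question so that it runs on its own, and the object names fit_counts and fit_prop are made up for this example. Whether the fit on the full 102 rows reproduces the Stata coefficients would need to be checked against the Stata output above.

```r
# Twelve rows copied from the question's dataset, just to make the example
# self-contained. Replace this with the full data frame in practice.
dataset <- read.table(header = TRUE, text = "
IndA s  c itot fia         offset
1    31 3 51   0.37254902  -0.673344553
1    28 2 53   0.41509434  -0.634878272
1    26 0 55   0.436363636 -0.597837001
1    30 4 50   0.4         -0.693147181
1    26 3 58   0.413793103 -0.544727175
1    33 5 55   0.309090909 -0.597837001
0    17 3 58   0.431034483 -0.544727175
0    14 0 59   0.389830508 -0.527632742
0    22 2 52   0.461538462 -0.653926467
0    20 4 52   0.423076923 -0.653926467
0    15 3 51   0.31372549  -0.673344553
0    19 6 50   0.38        -0.693147181
")

# Form 1: response as (successes, failures), so s enters as the denominator.
fit_counts <- glm(cbind(c, s - c) ~ IndA + fia + offset(offset),
                  family = binomial(link = "cloglog"),
                  data = dataset)

# Form 2: response as a proportion, with the number of trials as weights.
# This describes the same model as form 1.
fit_prop <- glm(c / s ~ IndA + fia + offset(offset),
                weights = s,
                family = binomial(link = "cloglog"),
                data = dataset)

print(coef(fit_counts))
print(coef(fit_prop))
```

Both forms describe the same likelihood, so their coefficients should agree with each other. Running summary(fit_counts) on the full dataset gives the table to compare with the Stata results.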