How to regress on categorical variables in Stata

Asked: 2015-01-09 00:13:12

Tags: stata

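I am trying to fit a multinomial logit, and my independent variables are categorical. I have two of them: edu1 for people with a high school diploma and edu2 for people with a college degree. Both are dummy variables (edu1=1 for someone with a high school diploma, edu1=0 otherwise). I want output that lets me compare against people with a college degree. However, when I run mlogit edu*, Stata automatically includes edu1 in the model and not edu2. Is there a way to reverse this, so that edu2 is included and edu1 is not?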

1 answer:

Answer 0 (score: 0)

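You cannot include both in the model unless you drop the constant: if every observation has exactly one of the two dummies equal to 1, the dummies sum to the constant and are perfectly collinear with it. Google "dummy variable trap" to see why. Here is an example: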

. webuse sysdsn1, clear
(Health insurance data)

. recode male (0=1) (1=0), gen(female)
(644 differences between male and female)

. mlogit insure male female, nocons nolog

Multinomial logistic regression                   Number of obs   =        616
                                                  Wald chi2(4)    =     149.44
Log likelihood = -553.40712                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
      insure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity    |  (base outcome)
-------------+----------------------------------------------------------------
Prepaid      |
        male |   .3001046   .1703301     1.76   0.078    -.0337363    .6339455
      female |  -.1772065   .0968274    -1.83   0.067    -.3669847    .0125718
-------------+----------------------------------------------------------------
Uninsure     |
        male |  -1.529395   .3059244    -5.00   0.000    -2.128996   -.9297944
      female |  -1.989585   .1884768   -10.56   0.000    -2.358993   -1.620177
------------------------------------------------------------------------------

. mlogit insure male, nolog

Multinomial logistic regression                   Number of obs   =        616
                                                  LR chi2(2)      =       6.38
                                                  Prob > chi2     =     0.0413
Log likelihood = -553.40712                       Pseudo R2       =     0.0057

------------------------------------------------------------------------------
      insure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity    |  (base outcome)
-------------+----------------------------------------------------------------
Prepaid      |
        male |    .477311   .1959283     2.44   0.015     .0932987    .8613234
       _cons |  -.1772065   .0968274    -1.83   0.067    -.3669847    .0125718
-------------+----------------------------------------------------------------
Uninsure     |
        male |     .46019   .3593233     1.28   0.200    -.2440708    1.164451
       _cons |  -1.989585   .1884768   -10.56   0.000    -2.358993   -1.620177
------------------------------------------------------------------------------

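Note that in the second specification, the constant is the female effect, and the male effect becomes the constant plus the male coefficient (for Prepaid, -.1772065 + .477311 = .3001045). This matches the no-constant specification above.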
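One way to check this relationship without refitting the no-constant model is lincom, which reports a linear combination of the estimated coefficients. The following is a minimal sketch, assuming the same sysdsn1 data and the equation names shown in the output above (Prepaid and Uninsure):

```stata
webuse sysdsn1, clear
mlogit insure male, nolog

* Male effect = constant + male coefficient, one line per outcome equation
lincom [Prepaid]male + [Prepaid]_cons
lincom [Uninsure]male + [Uninsure]_cons
```

The two estimates should reproduce the male rows of the no-constant fit, about .300 for Prepaid and -1.529 for Uninsure.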

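If the model contains other dummies, things get a bit more complicated. The constant will then correspond to the combination of the omitted categories, one from each set of dummies.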
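To choose which group serves as the reference, as the question asks, you can list the dummy you want explicitly instead of using a wildcard, or let factor-variable notation set the base level. The sketch below uses sysdsn1 again. The names outcome and edu2 in the final comment are hypothetical stand-ins, since the question does not show the full command.

```stata
webuse sysdsn1, clear
recode male (0=1) (1=0), gen(female)

* Name the dummy explicitly; male is then the omitted (reference) group
mlogit insure female, nolog

* Factor-variable notation: ib#. sets the base level of each variable.
* Here the constant describes male==1 and site==3 jointly.
mlogit insure ib1.male ib3.site, nolog

* Hypothetical analogue for the question (variable names assumed):
*   mlogit outcome edu2
```

With more than one set of dummies, the ib#. prefix makes the omitted category of each set explicit, which makes it easier to see what the constant refers to.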