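I'm running a binary logistic regression in R, and some of the independent variables represent ordinal data. I just want to make sure I'm doing it correctly. In the example below, I created sample data and ran glm() under the assumption that the independent variable "I" represents continuous data. Then I ran it again using ordered(I). The results came out a little different, so it looks like a successful test. My question is whether it is doing what I think it is doing: that is, it sees the integer data, coerces it to ordinal data based on the integer values, and runs glm() with a different formula to account for the idea that the distances between "1", "2", "3", etc. may not be the same, which would make it "correct" if this represents ordinal data. Is that right?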
> str(gorilla)
'data.frame': 14 obs. of 2 variables:
$ I: int 1 1 1 2 2 2 3 3 4 4 ...
$ D: int 0 0 1 0 0 1 1 1 0 1 ...
> glm.out = glm(D ~ I, family=binomial(logit), data=gorilla)
> summary(glm.out)
...and trying again with ordered():
> glm.out = glm(D ~ ordered(I), family=binomial(logit), data=gorilla)
> summary(glm.out)
PS: In case it helps, here is the full output from these tests (one thing I notice is the very large standard errors):
Call:
glm(formula = D ~ I, family = binomial(logit), data = gorilla)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.7067  -1.0651   0.7285   1.0137   1.4458
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.0624     1.2598  -0.843    0.399
I             0.4507     0.3846   1.172    0.241
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 19.121 on 13 degrees of freedom
Residual deviance: 17.621 on 12 degrees of freedom
AIC: 21.621
Number of Fisher Scoring iterations: 4
> glm.out = glm(D ~ ordered(I), family=binomial(logit), data=gorilla)
> summary(glm.out)
Call:
glm(formula = D ~ ordered(I), family = binomial(logit), data = gorilla)
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.66511  -0.90052   0.00013   0.75853   1.48230
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)     3.6557   922.4405   0.004    0.997
ordered(I).L    1.3524     1.2179   1.110    0.267
ordered(I).Q   -9.5220  2465.3259  -0.004    0.997
ordered(I).C    0.1282     1.2974   0.099    0.921
ordered(I)^4   13.6943  3307.5816   0.004    0.997
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 19.121 on 13 degrees of freedom
Residual deviance: 14.909 on 9 degrees of freedom
AIC: 24.909
Number of Fisher Scoring iterations: 17
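One way to check what ordered(I) actually hands to glm() is to look at the contrast matrix attached to the ordered factor. The sketch below is only a suggestion, not a definitive answer: the vector I is retyped from the str() output and the data listing, and I_ord is a name made up for illustration. As far as I understand it, an ordered factor is still treated as categorical, just coded with orthogonal polynomial contrasts (contr.poly) that by default assume equally spaced levels, which is where the .L, .Q, .C and ^4 terms in the summary come from.

```r
# Predictor values retyped from the question's data (five distinct levels)
I <- c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 5L)

# I_ord is a hypothetical name used only in this sketch
I_ord <- ordered(I)

# The levels follow the sorted integer values
levels(I_ord)

# Contrast matrix used in a model formula; with R's default options this
# is the orthogonal polynomial coding produced by contr.poly(5)
contrasts(I_ord)
contr.poly(5)

# Column names of the design matrix: intercept plus linear, quadratic,
# cubic and 4th-degree terms, matching the coefficient names in summary()
colnames(model.matrix(~ I_ord))
```

If that reading is right, the model with ordered(I) spends four degrees of freedom on the predictor instead of one, which matches the residual degrees of freedom dropping from 12 to 9 in the output.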
Data used:
I,D
1,0
1,0
1,1
2,0
2,0
2,1
3,1
3,1
4,0
4,1
5,0
5,1
5,1
5,1
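For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the data frame from the listing above and adds two checks. The names fit_ord and fit_fac are made up for illustration, and the factor(I) fit is not part of the original tests; it is included only as a comparison. The cross-tabulation is there because both rows with I = 3 have D = 1, and a level with only one outcome is a common reason for very large standard errors in a logistic regression, so that may be worth ruling in or out.

```r
# Rebuild the 14-row sample data from the CSV listing
gorilla <- read.csv(text = "I,D
1,0
1,0
1,1
2,0
2,0
2,1
3,1
3,1
4,0
4,1
5,0
5,1
5,1
5,1")
str(gorilla)

# Cross-tabulate predictor level (rows) against outcome (columns);
# look for levels where one outcome never occurs
table(gorilla$I, gorilla$D)

# Refit with an ordered factor and, for comparison, a plain factor.
# glm() may warn about fitted probabilities numerically 0 or 1 here.
fit_ord <- glm(D ~ ordered(I), family = binomial(logit), data = gorilla)
fit_fac <- glm(D ~ factor(I), family = binomial(logit), data = gorilla)

# Both codings span the same set of per-level fits, so the residual
# deviances should agree; only the coefficient parameterization differs
c(ordered = deviance(fit_ord), factor = deviance(fit_fac))
```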