Question

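I tried using scale in a logistic regression model, but I don't see any change in the results compared with the original model. Is this a mistake on my part? Here is some example code: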

 dat <- read.table(text = " cats birds    wolfs     snakes
           0        3        9         7
           1        3        8         4
           1        1        2         8
           0        1        2         3
           0        1        8         3
           1        6        1         2
           0        6        7         1
           1        6        1         5
           0        5        9         7
           1        3        8         7
           1        4        2         7
           0        1        2         3
           0        7        6         3
           1        6        1         1
           0        6        3         9
           1        6        1         1   ",header = TRUE)

Original regression:

 dat_glm <- glm(cats ~ birds + wolfs + snakes, data = dat, family = binomial(link = "logit"))
 dat$dat_glm_pred_response<-ifelse(predict(dat_glm,newdata=dat,type='response')>0.5,1,0)
 m<-xtabs(~cats+dat_glm_pred_response,data=dat);m;prop.table(m,2);prop.table(m,1)

Original regression output:

   dat_glm_pred_response
cats 0 1
   0 5 3
   1 2 6
    dat_glm_pred_response
cats         0         1
   0 0.7142857 0.3333333
   1 0.2857143 0.6666667
    dat_glm_pred_response
cats     0     1
   0 0.625 0.375
   1 0.250 0.750

I used the scale function to see whether it would help achieve higher accuracy:

 dat_glm_scale <- glm(cats ~ scale(birds) + scale(wolfs) + scale(snakes), data = dat, family = binomial(link = "logit"))

But I got the same results:

 dat$dat_glm_pred_response1<-ifelse(predict(dat_glm_scale,newdata=dat,type='response')>0.5,1,0)
 m<-xtabs(~cats+dat_glm_pred_response1,data=dat);m;prop.table(m,2);prop.table(m,1)

Scaled data results:

   dat_glm_pred_response1
cats 0 1
   0 5 3
   1 2 6
    dat_glm_pred_response1
cats         0         1
   0 0.7142857 0.3333333
   1 0.2857143 0.6666667
    dat_glm_pred_response1
cats     0     1
   0 0.625 0.375
   1 0.250 0.750

Why are the two results identical? Any ideas?

Answer 1

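Scaling/centering in this way changes the model's estimated coefficients and standard errors, which is indeed what happens in your example. However, as long as your model contains no interaction terms, you would not expect the predictions to change.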
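As a sketch of why this holds, write the standardized predictor as z_j = (x_j - m_j) / s_j, where m_j and s_j are the sample mean and standard deviation that scale() uses. The symbols beta, gamma, m_j and s_j are notation introduced here for illustration, not taken from the question. The linear predictor can then be rewritten as:

```latex
\begin{aligned}
\beta_0 + \sum_j \beta_j x_j
  &= \beta_0 + \sum_j \beta_j \,(m_j + s_j z_j) \\
  &= \underbrace{\Bigl(\beta_0 + \sum_j \beta_j m_j\Bigr)}_{\gamma_0}
     + \sum_j \underbrace{(s_j \beta_j)}_{\gamma_j}\, z_j
\end{aligned}
```

The scaled model is therefore the same model under a different parametrization: each slope is multiplied by s_j (and so is its standard error), the intercept absorbs the means, and the linear predictor, and hence every fitted probability, is unchanged.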

You can see the change in coefficients and standard errors by comparing the full summary output of the two models:

 summary(dat_glm)
 summary(dat_glm_scale)

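To answer your main question: there is nothing wrong with your code or your scaling, but you should not expect the predictions to change.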
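A minimal sketch of how one might check this numerically, refitting both models from the question on the same data. The helper names vars, sds, means, b_raw and b_scaled are introduced here for illustration only, and the tolerance of 1e-6 is an arbitrary choice to absorb small numerical differences between the two fits.

```r
# Recreate the data from the question
dat <- read.table(text = "cats birds wolfs snakes
0 3 9 7
1 3 8 4
1 1 2 8
0 1 2 3
0 1 8 3
1 6 1 2
0 6 7 1
1 6 1 5
0 5 9 7
1 3 8 7
1 4 2 7
0 1 2 3
0 7 6 3
1 6 1 1
0 6 3 9
1 6 1 1", header = TRUE)

# Fit the raw and the scaled model
dat_glm <- glm(cats ~ birds + wolfs + snakes,
               data = dat, family = binomial(link = "logit"))
dat_glm_scale <- glm(cats ~ scale(birds) + scale(wolfs) + scale(snakes),
                     data = dat, family = binomial(link = "logit"))

vars  <- c("birds", "wolfs", "snakes")
sds   <- sapply(dat[vars], sd)    # same n - 1 denominator that scale() uses
means <- sapply(dat[vars], mean)

b_raw    <- coef(dat_glm)
b_scaled <- coef(dat_glm_scale)

# Slopes of the scaled model should equal raw slopes times the predictor SDs
print(all.equal(unname(b_scaled[-1]), unname(b_raw[vars] * sds),
                tolerance = 1e-6))

# The scaled intercept should equal the raw intercept plus sum(slope * mean)
print(all.equal(unname(b_scaled[1]),
                unname(b_raw[1] + sum(b_raw[vars] * means)),
                tolerance = 1e-6))

# Fitted probabilities should agree between the two models
print(all.equal(fitted(dat_glm), fitted(dat_glm_scale), tolerance = 1e-6))
```

If the three comparisons agree, the two fits differ only in how the coefficients are expressed, which is why thresholding the predicted probabilities at 0.5 gives the same confusion table for both models.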

Edit: The following Stack Exchange questions provide more detail on this topic: https://stats.stackexchange.com/questions/65898/why-could-centering-independent-variables-change-the-main-effects-with-moderatio

https://stats.stackexchange.com/questions/29781/when-should-you-center-your-data-when-should-you-standardize

How to correctly use scale in logistic regression
