Please find the model summary below and help me understand how to drop variables based on AIC and p-values.
Call:
glm(formula = TARGET ~ duration + cons.price.idx + cons.conf.idx +
emp.var.rate + poutcome + contact + job, family = binomial(link = "logit"),
data = Training)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.9536  -0.3148  -0.1601  -0.1029   3.5807
Coefficients:
                      Estimate Std. Error z value Pr(>|z|)
(Intercept)         -1.331e+02  7.048e+00 -18.886  < 2e-16 ***
duration             8.333e-03  1.782e-04  46.764  < 2e-16 ***
cons.price.idx       1.396e+00  7.582e-02  18.408  < 2e-16 ***
cons.conf.idx        6.505e-02  5.673e-03  11.466  < 2e-16 ***
emp.var.rate        -9.507e-01  3.015e-02 -31.538  < 2e-16 ***
poutcomenonexistent  4.426e-01  8.802e-02   5.029 4.94e-07 ***
poutcomesuccess      2.054e+00  1.263e-01  16.267  < 2e-16 ***
contacttelephone    -9.380e-01  8.402e-02 -11.164  < 2e-16 ***
jobblue-collar      -4.662e-01  8.778e-02  -5.311 1.09e-07 ***
jobentrepreneur     -1.287e-01  1.623e-01  -0.793   0.4279
jobhousemaid        -2.577e-01  2.018e-01  -1.277   0.2015
jobmanagement       -2.383e-01  1.176e-01  -2.026   0.0428 *
jobretired           2.729e-01  1.222e-01   2.234   0.0255 *
jobself-employed    -2.554e-02  1.562e-01  -0.164   0.8701
jobservices         -2.343e-01  1.086e-01  -2.156   0.0311 *
jobstudent           3.178e-01  1.486e-01   2.139   0.0324 *
jobtechnician       -2.138e-02  8.854e-02  -0.241   0.8092
jobunemployed       -9.136e-02  1.890e-01  -0.483   0.6288
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 14499 on 20592 degrees of freedom
Residual deviance: 8568 on 20575 degrees of freedom
AIC: 8604
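Number of Fisher Scoring iterations: 6
My questions are: when can I say that an AIC value is good, and on what basis should I drop variables so that the model keeps as few as possible?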
Answer 0 (score: 1)
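The approach you describe is known as stepwise logistic regression, and it has drawn a lot of criticism for its poor performance. Rather than relying on p-values, consider AIC or SC (the Schwarz criterion, also known as BIC). A good explanation with many examples in SAS is given in http://www2.sas.com/proceedings/sugi26/p222-26.pdf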
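Since the question uses R rather than SAS, here is a minimal sketch of an AIC-based comparison in R. An AIC value is not good or bad on its own; it is only meaningful when compared across models fitted to the same data, where lower is better. For a 0/1 response the AIC equals the residual deviance plus twice the number of estimated coefficients, which matches the summary in the question: 8568 + 2 × 18 = 8604. The data frame below is simulated and only a stand-in for the asker's `Training` data, and the `noise` column is a hypothetical predictor added for illustration. Note that `drop1` and `step` drop a factor such as `job` as a whole term, not one level at a time.

```r
# Hypothetical stand-in for the asker's `Training` data: simulated values,
# NOT the real data set. `noise` is an invented predictor unrelated to TARGET.
set.seed(1)
n <- 2000
Training <- data.frame(
  duration     = rexp(n, rate = 1 / 250),
  emp.var.rate = rnorm(n),
  noise        = rnorm(n),
  job          = factor(sample(c("admin.", "blue-collar", "student"),
                               n, replace = TRUE))
)
eta <- -3 + 0.004 * Training$duration - 0.9 * Training$emp.var.rate
Training$TARGET <- rbinom(n, size = 1, prob = plogis(eta))

# Full model, fitted the same way as in the question
full <- glm(TARGET ~ duration + emp.var.rate + noise + job,
            family = binomial(link = "logit"), data = Training)

# For a 0/1 response: AIC = residual deviance + 2 * number of coefficients
stopifnot(isTRUE(all.equal(AIC(full),
                           deviance(full) + 2 * length(coef(full)))))

# AIC of the model after removing each term in turn, with a
# likelihood-ratio test for the whole term
print(drop1(full, test = "Chisq"))

# Backward elimination driven by AIC instead of per-coefficient p-values
reduced <- step(full, direction = "backward", trace = 0)
print(formula(reduced))
print(AIC(full, reduced))
```

Passing `k = log(nrow(Training))` to `step` switches the penalty from AIC to the Schwarz criterion mentioned above. Automated selection of this kind is still a stepwise procedure, so the criticism of stepwise methods applies to it as well; treat the result as a candidate model to check, not a final answer.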