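I ran a logistic regression on US census income data. Most of the columns contain categorical values, so I created dummy variables for them. However, when I fit a model on all of the features using the statsmodels API, every p-value comes out as 0, which would suggest that all of the features are significant. But when I used RFE to select the best 13 variables and fit the model again with statsmodels, some of those features got non-zero p-values. I have attached the summary output for both cases. Please help!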
Generalized Linear Model Regression Results
==============================================================================
Dep. Variable: income No. Observations: 19881
Model: GLM Df Residuals: 19786
Model Family: Binomial Df Model: 94
Link Function: logit Scale: 1.0000
Method: IRLS Log-Likelihood: nan
Date: Tue, 12 Feb 2019 Deviance: nan
Time: 12:02:32 Pearson chi2: 1.86e+19
No. Iterations: 100 Covariance Type: nonrobust
=============================================================================================================
coef std err z P>|z| [0.025 0.975]
-------------------------------------------------------------------------------------------------------------
const -3.655e+15 1.87e+07 -1.95e+08 0.000 -3.66e+15 -3.66e+15
age 1.432e+13 4.72e+04 3.03e+08 0.000 1.43e+13 1.43e+13
fnlwgt 5.905e+08 4.623 1.28e+08 0.000 5.9e+08 5.9e+08
education_num 1.22e+14 4.04e+05 3.02e+08 0.000 1.22e+14 1.22e+14
sex 2.884e+14 1.44e+06 2e+08 0.000 2.88e+14 2.88e+14
capital_gain 1.251e+11 65.690 1.9e+09 0.000 1.25e+11 1.25e+11
capital_loss -4.138e+12 1.45e+04 -2.86e+08 0.000 -4.14e+12 -4.14e+12
hours_per_week 1.823e+13 4.91e+04 3.71e+08 0.000 1.82e+13 1.82e+13
workclass_Local-gov -2.911e+14 3.31e+06 -8.78e+07 0.000 -2.91e+14 -2.91e+14
workclass_Private -1.904e+14 2.8e+06 -6.8e+07 0.000 -1.9e+14 -1.9e+14
workclass_Self-emp-inc 8.989e+13 3.83e+06 2.34e+07 0.000 8.99e+13 8.99e+13
workclass_Self-emp-not-inc -4.628e+14 3.27e+06 -1.41e+08 0.000 -4.63e+14 -4.63e+14
workclass_State-gov -3.777e+14 3.59e+06 -1.05e+08 0.000 -3.78e+14 -3.78e+14
workclass_Without-pay -3.108e+15 2.05e+07 -1.52e+08 0.000 -3.11e+15 -3.11e+15
education_11th 2.813e+12 3.7e+06 7.6e+05 0.000 2.81e+12 2.81e+12
education_12th -1.136e+14 4.95e+06 -2.3e+07 0.000 -1.14e+14 -1.14e+14
education_1st-4th 7.402e+14 7.98e+06 9.28e+07 0.000 7.4e+14 7.4e+14
education_5th-6th 4.503e+14 6.34e+06 7.11e+07 0.000 4.5e+14 4.5e+14
education_7th-8th 6.089e+14 4.84e+06 1.26e+08 0.000 6.09e+14 6.09e+14
education_9th 2.991e+14 4.84e+06 6.18e+07 0.000 2.99e+14 2.99e+14
education_Assoc-acdm -1.031e+14 3.54e+06 -2.91e+07 0.000 -1.03e+14 -1.03e+14
education_Assoc-voc 4.53e+13 3.28e+06 1.38e+07 0.000 4.53e+13 4.53e+13
education_Bachelors 2.333e+14 2.91e+06 8.02e+07 0.000 2.33e+14 2.33e+14
education_Doctorate 5.761e+14 4.83e+06 1.19e+08 0.000 5.76e+14 5.76e+14
education_HS-grad -1.975e+14 2.62e+06 -7.52e+07 0.000 -1.97e+14 -1.97e+14
education_Masters 2.612e+14 3.41e+06 7.65e+07 0.000 2.61e+14 2.61e+14
education_Preschool -2.918e+15 1.28e+07 -2.27e+08 0.000 -2.92e+15 -2.92e+15
education_Prof-school 5.53e+14 4.42e+06 1.25e+08 0.000 5.53e+14 5.53e+14
education_Some-college 9.488e+12 2.66e+06 3.56e+06 0.000 9.49e+12 9.49e+12
marital_status_Married-AF-spouse 1.462e+15 1.88e+07 7.79e+07 0.000 1.46e+15 1.46e+15
marital_status_Married-civ-spouse 2.446e+14 6.09e+06 4.01e+07 0.000 2.45e+14 2.45e+14
marital_status_Married-spouse-absent -2.076e+14 4.51e+06 -4.6e+07 0.000 -2.08e+14 -2.08e+14
marital_status_Never-married -1.339e+14 1.77e+06 -7.59e+07 0.000 -1.34e+14 -1.34e+14
marital_status_Separated -4.297e+13 2.99e+06 -1.43e+07 0.000 -4.3e+13 -4.3e+13
marital_status_Widowed 8.662e+12 3.24e+06 2.68e+06 0.000 8.66e+12 8.66e+12
occupation_Armed-Forces 2.362e+14 2.76e+07 8.56e+06 0.000 2.36e+14 2.36e+14
occupation_Craft-repair -2.512e+14 2.06e+06 -1.22e+08 0.000 -2.51e+14 -2.51e+14
occupation_Exec-managerial 6.448e+14 2.01e+06 3.21e+08 0.000 6.45e+14 6.45e+14
occupation_Farming-fishing -1.977e+14 3.2e+06 -6.18e+07 0.000 -1.98e+14 -1.98e+14
occupation_Handlers-cleaners -3.774e+14 2.74e+06 -1.38e+08 0.000 -3.77e+14 -3.77e+14
occupation_Machine-op-inspct -3.826e+14 2.41e+06 -1.59e+08 0.000 -3.83e+14 -3.83e+14
occupation_Other-service -3.052e+14 2.04e+06 -1.49e+08 0.000 -3.05e+14 -3.05e+14
occupation_Priv-house-serv -1.913e+15 7.25e+06 -2.64e+08 0.000 -1.91e+15 -1.91e+15
occupation_Prof-specialty 3.754e+14 2.14e+06 1.75e+08 0.000 3.75e+14 3.75e+14
occupation_Protective-serv 2.135e+14 3.66e+06 5.84e+07 0.000 2.14e+14 2.14e+14
occupation_Sales 1.643e+14 2.01e+06 8.17e+07 0.000 1.64e+14 1.64e+14
occupation_Tech-support 5.046e+14 3.12e+06 1.62e+08 0.000 5.05e+14 5.05e+14
occupation_Transport-moving -3.408e+14 2.65e+06 -1.29e+08 0.000 -3.41e+14 -3.41e+14
relationship_Not-in-family 1.121e+15 6.07e+06 1.85e+08 0.000 1.12e+15 1.12e+15
relationship_Other-relative 8.547e+14 5.97e+06 1.43e+08 0.000 8.55e+14 8.55e+14
relationship_Own-child 9.944e+14 6.05e+06 1.64e+08 0.000 9.94e+14 9.94e+14
relationship_Unmarried 1.177e+15 6.26e+06 1.88e+08 0.000 1.18e+15 1.18e+15
relationship_Wife 6.359e+14 2.74e+06 2.32e+08 0.000 6.36e+14 6.36e+14
race_Asian-Pac-Islander 2.714e+14 6.53e+06 4.16e+07 0.000 2.71e+14 2.71e+14
race_Black 7.709e+13 5.16e+06 1.49e+07 0.000 7.71e+13 7.71e+13
race_Other 6.827e+13 7.4e+06 9.23e+06 0.000 6.83e+13 6.83e+13
race_White 1.389e+14 4.93e+06 2.82e+07 0.000 1.39e+14 1.39e+14
native_country_Canada -9.805e+14 1.94e+07 -5.05e+07 0.000 -9.8e+14 -9.8e+14
native_country_China -1.33e+15 1.93e+07 -6.9e+07 0.000 -1.33e+15 -1.33e+15
native_country_Columbia -1.916e+15 2.01e+07 -9.54e+07 0.000 -1.92e+15 -1.92e+15
native_country_Cuba -1.241e+15 1.93e+07 -6.43e+07 0.000 -1.24e+15 -1.24e+15
native_country_Dominican-Republic -3.701e+15 2.01e+07 -1.85e+08 0.000 -3.7e+15 -3.7e+15
native_country_Ecuador -1.311e+15 2.43e+07 -5.4e+07 0.000 -1.31e+15 -1.31e+15
native_country_El-Salvador -1.28e+15 1.93e+07 -6.64e+07 0.000 -1.28e+15 -1.28e+15
native_country_England -1.118e+15 1.94e+07 -5.77e+07 0.000 -1.12e+15 -1.12e+15
native_country_France -1.644e+15 2.33e+07 -7.07e+07 0.000 -1.64e+15 -1.64e+15
native_country_Germany -9.648e+14 1.87e+07 -5.15e+07 0.000 -9.65e+14 -9.65e+14
native_country_Greece -1.902e+15 2.39e+07 -7.97e+07 0.000 -1.9e+15 -1.9e+15
native_country_Guatemala -1.132e+15 2.04e+07 -5.56e+07 0.000 -1.13e+15 -1.13e+15
native_country_Haiti -9.144e+14 2.09e+07 -4.38e+07 0.000 -9.14e+14 -9.14e+14
native_country_Honduras -3.053e+15 3.79e+07 -8.06e+07 0.000 -3.05e+15 -3.05e+15
native_country_Hong -6.677e+14 2.71e+07 -2.46e+07 0.000 -6.68e+14 -6.68e+14
native_country_Hungary -5.425e+14 2.75e+07 -1.98e+07 0.000 -5.43e+14 -5.43e+14
native_country_India -1.698e+15 1.88e+07 -9.05e+07 0.000 -1.7e+15 -1.7e+15
native_country_Iran -1.11e+15 2.16e+07 -5.14e+07 0.000 -1.11e+15 -1.11e+15
native_country_Ireland -9.832e+14 2.45e+07 -4.01e+07 0.000 -9.83e+14 -9.83e+14
native_country_Italy -9.097e+14 2.04e+07 -4.46e+07 0.000 -9.1e+14 -9.1e+14
native_country_Jamaica -1.099e+15 1.95e+07 -5.63e+07 0.000 -1.1e+15 -1.1e+15
native_country_Japan -1.382e+15 2.02e+07 -6.84e+07 0.000 -1.38e+15 -1.38e+15
native_country_Laos -1.011e+15 2.71e+07 -3.73e+07 0.000 -1.01e+15 -1.01e+15
native_country_Mexico -9.333e+14 1.77e+07 -5.26e+07 0.000 -9.33e+14 -9.33e+14
native_country_Nicaragua -1.367e+15 2.21e+07 -6.17e+07 0.000 -1.37e+15 -1.37e+15
native_country_Outlying-US(Guam-USVI-etc) -4.612e+15 2.75e+07 -1.68e+08 0.000 -4.61e+15 -4.61e+15
native_country_Peru -2.044e+15 2.28e+07 -8.98e+07 0.000 -2.04e+15 -2.04e+15
native_country_Philippines -9.297e+14 1.79e+07 -5.19e+07 0.000 -9.3e+14 -9.3e+14
native_country_Poland -1.01e+15 2.06e+07 -4.91e+07 0.000 -1.01e+15 -1.01e+15
native_country_Portugal -1.147e+15 2.21e+07 -5.18e+07 0.000 -1.15e+15 -1.15e+15
native_country_Puerto-Rico -1.173e+15 1.9e+07 -6.17e+07 0.000 -1.17e+15 -1.17e+15
native_country_Scotland -1.76e+15 2.94e+07 -5.98e+07 0.000 -1.76e+15 -1.76e+15
native_country_South -1.758e+15 2.02e+07 -8.7e+07 0.000 -1.76e+15 -1.76e+15
native_country_Taiwan -1.977e+15 2.09e+07 -9.48e+07 0.000 -1.98e+15 -1.98e+15
native_country_Thailand -2.047e+15 2.8e+07 -7.3e+07 0.000 -2.05e+15 -2.05e+15
native_country_Trinadad&Tobago -7.358e+14 2.6e+07 -2.83e+07 0.000 -7.36e+14 -7.36e+14
native_country_United-States -1.12e+15 1.74e+07 -6.44e+07 0.000 -1.12e+15 -1.12e+15
native_country_Vietnam -1.745e+15 1.95e+07 -8.97e+07 0.000 -1.75e+15 -1.75e+15
native_country_Yugoslavia -1.369e+15 2.5e+07 -5.47e+07 0.000 -1.37e+15 -1.37e+15
=============================================================================================================
With only the 13 selected features:
Generalized Linear Model Regression Results
==============================================================================
Dep. Variable: income No. Observations: 19881
Model: GLM Df Residuals: 19867
Model Family: Binomial Df Model: 13
Link Function: logit Scale: 1.0000
Method: IRLS Log-Likelihood: -7396.5
Date: Tue, 12 Feb 2019 Deviance: 14793.
Time: 12:47:13 Pearson chi2: 1.79e+05
No. Iterations: 23 Covariance Type: nonrobust
=====================================================================================================
coef std err z P>|z| [0.025 0.975]
-----------------------------------------------------------------------------------------------------
const -2.4781 0.047 -52.595 0.000 -2.570 -2.386
education_1st-4th -1.9341 0.602 -3.214 0.001 -3.114 -0.755
education_Preschool -35.4880 1.71e+05 -0.000 1.000 -3.36e+05 3.36e+05
marital_status_Married-AF-spouse 2.9904 0.608 4.918 0.000 1.799 4.182
marital_status_Married-civ-spouse 2.1612 0.052 41.962 0.000 2.060 2.262
occupation_Farming-fishing -1.5661 0.151 -10.387 0.000 -1.862 -1.271
occupation_Handlers-cleaners -1.6452 0.161 -10.211 0.000 -1.961 -1.329
occupation_Other-service -1.5677 0.119 -13.174 0.000 -1.801 -1.334
occupation_Priv-house-serv -4.2584 1.892 -2.250 0.024 -7.968 -0.549
relationship_Other-relative -1.5509 0.268 -5.789 0.000 -2.076 -1.026
relationship_Own-child -1.8791 0.175 -10.711 0.000 -2.223 -1.535
native_country_Columbia -2.6487 1.038 -2.551 0.011 -4.684 -0.614
native_country_Dominican-Republic -22.6701 1.63e+04 -0.001 0.999 -3.2e+04 3.19e+04
capital_gain 0.0003 1.17e-05 28.530 0.000 0.000 0.000
=====================================================================================================