How to exclude a variable (column) in R

Time: 2018-06-15 08:38:24

Tags: r dataframe

This is the R code for a logistic regression model:

> hrlogis1 <- glm(Attrition~. -Age -DailyRate -Department -Education
>                 -EducationField -HourlyRate -JobLevel
>                 -JobRole -MonthlyIncome -MonthlyRate
>                 -PercentSalaryHike -PerformanceRating
>                 -StandardHours -StockOptionLevel
>                 , family=binomial(link = "logit"),data=hrtrain)

where Attrition is the dependent variable and all the remaining columns are the independent variables.
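As a minimal sketch of how the `~ . - x` formula syntax behaves, the following uses a small made-up data frame in place of `hrtrain`. The data frame `toy` and its values are assumptions for illustration only; the column names are borrowed from the question.

```r
# Toy data standing in for hrtrain (hypothetical values, not the real data set)
set.seed(1)
toy <- data.frame(
  Attrition        = factor(sample(c("No", "Yes"), 200, replace = TRUE)),
  Age              = rnorm(200, mean = 37, sd = 9),
  OverTime         = factor(sample(c("No", "Yes"), 200, replace = TRUE)),
  DistanceFromHome = sample(1:29, 200, replace = TRUE)
)

# "." expands to every column except the response; "- Age" then removes Age
fit_minus <- glm(Attrition ~ . - Age,
                 family = binomial(link = "logit"), data = toy)

# Equivalent model: name the kept predictors explicitly
fit_named <- glm(Attrition ~ OverTime + DistanceFromHome,
                 family = binomial(link = "logit"), data = toy)

# Both fits contain the same terms
names(coef(fit_minus))
```

Subtracting a term in the formula removes the whole column, with all of its levels, from the model; it does not remove a single level of a factor.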

Here is the summary of the model:

Coefficients:

                                Estimate Std. Error z value Pr(>|z|)    
(Intercept)                      1.25573    0.84329   1.489 0.136464    
BusinessTravelTravel_Frequently  1.86022    0.47410   3.924 8.72e-05 ***
BusinessTravelTravel_Rarely      1.28273    0.44368   2.891 0.003839 ** 
DistanceFromHome                 0.03869    0.01138   3.400 0.000673 ***
EnvironmentSatisfaction         -0.36484    0.08714  -4.187 2.83e-05 ***
GenderMale                       0.52556    0.19656   2.674 0.007499 ** 
JobInvolvement                  -0.59407    0.13259  -4.480 7.45e-06 ***
JobSatisfaction                 -0.37315    0.08671  -4.303 1.68e-05 ***
MaritalStatusMarried             0.23408    0.26993   0.867 0.385848    
MaritalStatusSingle              1.37647    0.27511   5.003 5.63e-07 ***
NumCompaniesWorked               0.16439    0.04034   4.075 4.59e-05 ***
OverTimeYes                      1.67531    0.20054   8.354  < 2e-16 ***
RelationshipSatisfaction        -0.23865    0.08726  -2.735 0.006240 ** 
TotalWorkingYears               -0.12385    0.02360  -5.249 1.53e-07 ***
TrainingTimesLastYear           -0.15522    0.07447  -2.084 0.037124 *  
WorkLifeBalance                 -0.30969    0.13025  -2.378 0.017427 *  
YearsAtCompany                   0.06887    0.04169   1.652 0.098513 .  
YearsInCurrentRole              -0.10812    0.04880  -2.216 0.026713 *  
YearsSinceLastPromotion          0.14006    0.04452   3.146 0.001657 ** 
YearsWithCurrManager            -0.09343    0.04984  -1.875 0.060834 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Now I want to drop the terms that are not significant; in this case, "MaritalStatusMarried" is not significant. MaritalStatus is a variable (column) with two levels, "Married" and "Single".
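Under R's default treatment contrasts, a factor with k levels produces k - 1 coefficients, so the two MaritalStatus rows in the summary suggest that a third level is acting as the baseline. A small sketch of that coding follows; the vector `ms` and the level name "Divorced" are assumptions, not taken from the question.

```r
# Hypothetical three-level factor; "Divorced" is an assumed level name
ms <- factor(c("Divorced", "Married", "Single", "Married", "Single", "Divorced"))

# The first level is the reference under the default treatment contrasts
levels(ms)

# One dummy column per non-reference level, plus the intercept
colnames(model.matrix(~ ms))
```

Running `levels(hrtrain$MaritalStatus)` on the real data would show which level, if any, plays the baseline role there.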

1 Answer:

Answer 0: (score: 0)

How about:

data$MaritalStatus[data[, num] == "Married"] <- NA

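(where num = the number of the MaritalStatus column in data)

The "Married" values will be replaced with NA, and then you can run the glm model again.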
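A rough sketch of what this does in practice, on made-up data: with R's default `na.action` (`na.omit`), `glm` drops every row that contains an NA, so the married observations leave the fit entirely. For comparison, the sketch also shows one possible alternative, merging "Married" into the baseline level so that no rows are lost. The data frame `toy`, the level name "Divorced", and the merged label "NotSingle" are all assumptions for illustration.

```r
# Toy data (hypothetical values); "Divorced" is an assumed third level
set.seed(42)
n <- 300
toy <- data.frame(
  Attrition        = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  MaritalStatus    = factor(sample(c("Divorced", "Married", "Single"),
                                   n, replace = TRUE)),
  DistanceFromHome = sample(1:29, n, replace = TRUE)
)

# Approach from the answer: set "Married" to NA, then refit
toy_na <- toy
toy_na$MaritalStatus[toy_na$MaritalStatus == "Married"] <- NA
fit_na <- glm(Attrition ~ ., family = binomial(link = "logit"), data = toy_na)
nobs(fit_na)   # fewer than nrow(toy): rows with NA were dropped

# Alternative sketch: merge "Married" into the baseline level, keeping all rows
toy_merged <- toy
lv <- levels(toy_merged$MaritalStatus)
levels(toy_merged$MaritalStatus)[lv %in% c("Divorced", "Married")] <- "NotSingle"
fit_merged <- glm(Attrition ~ ., family = binomial(link = "logit"),
                  data = toy_merged)
nobs(fit_merged)          # equals nrow(toy)
names(coef(fit_merged))   # only MaritalStatusSingle remains for this factor
```

Both fits end up without a MaritalStatusMarried coefficient, but they answer different questions: the first is estimated only on non-married rows, while the second treats married rows as part of the baseline group.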