This is the R code for my logistic regression model:
> hrlogis1 <- glm(Attrition~. -Age -DailyRate -Department -Education
> -EducationField -HourlyRate -JobLevel
> -JobRole -MonthlyIncome -MonthlyRate
> -PercentSalaryHike -PerformanceRating
> -StandardHours -StockOptionLevel
> , family=binomial(link = "logit"),data=hrtrain)
where Attrition is the dependent variable and the rest are the independent variables.
Here is the summary of the model:
Coefficients:
                                Estimate Std. Error z value Pr(>|z|)
(Intercept)                      1.25573    0.84329   1.489 0.136464
BusinessTravelTravel_Frequently  1.86022    0.47410   3.924 8.72e-05 ***
BusinessTravelTravel_Rarely      1.28273    0.44368   2.891 0.003839 **
DistanceFromHome                 0.03869    0.01138   3.400 0.000673 ***
EnvironmentSatisfaction         -0.36484    0.08714  -4.187 2.83e-05 ***
GenderMale                       0.52556    0.19656   2.674 0.007499 **
JobInvolvement                  -0.59407    0.13259  -4.480 7.45e-06 ***
JobSatisfaction                 -0.37315    0.08671  -4.303 1.68e-05 ***
MaritalStatusMarried             0.23408    0.26993   0.867 0.385848
MaritalStatusSingle              1.37647    0.27511   5.003 5.63e-07 ***
NumCompaniesWorked               0.16439    0.04034   4.075 4.59e-05 ***
OverTimeYes                      1.67531    0.20054   8.354  < 2e-16 ***
RelationshipSatisfaction        -0.23865    0.08726  -2.735 0.006240 **
TotalWorkingYears               -0.12385    0.02360  -5.249 1.53e-07 ***
TrainingTimesLastYear           -0.15522    0.07447  -2.084 0.037124 *
WorkLifeBalance                 -0.30969    0.13025  -2.378 0.017427 *
YearsAtCompany                   0.06887    0.04169   1.652 0.098513 .
YearsInCurrentRole              -0.10812    0.04880  -2.216 0.026713 *
YearsSinceLastPromotion          0.14006    0.04452   3.146 0.001657 **
YearsWithCurrManager            -0.09343    0.04984  -1.875 0.060834 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
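Now I want to remove the terms that are not significant; in this case, "MaritalStatusMarried" is not significant. MaritalStatus is a single variable (column) with two levels, "Married" and "Single".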
Answer 0 (score: 0)
How about:
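data$MaritalStatus[data[, num] == "Married"] <- NA
(where num = the number of the MaritalStatus column in data)
The "Married" values will be replaced with NA, and then you can run the glm model again.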
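As a minimal sketch of that suggestion, assuming a made-up data frame in place of hrtrain (the toy values, the third level "Divorced", and the choice to select the column by name instead of by num are all assumptions for illustration):

```r
# Hypothetical toy data standing in for hrtrain; all values are made up.
set.seed(1)
n <- 300
data <- data.frame(
  Attrition     = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  MaritalStatus = factor(sample(c("Divorced", "Married", "Single"), n, replace = TRUE)),
  OverTime      = factor(sample(c("No", "Yes"), n, replace = TRUE))
)

# Replace the "Married" values with NA (column selected by name here)
data$MaritalStatus[data$MaritalStatus == "Married"] <- NA

# Drop the now-unused "Married" level from the factor
data$MaritalStatus <- droplevels(data$MaritalStatus)

# Refit; with R's default na.action (na.omit), rows containing NA are excluded
fit <- glm(Attrition ~ MaritalStatus + OverTime,
           family = binomial(link = "logit"), data = data)

names(coef(fit))  # coefficient names of the refitted model
nobs(fit)         # number of rows actually used in the fit
```

With the default settings, the rows set to NA are left out of the fit altogether, so comparing nobs(fit) with nrow(data) shows how many observations this costs.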