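I am building some logistic regression models and have been using the varImp() function from the caret package. The function is useful, but I would like the variable importance returned sorted from most important to least important.
Here is a reproducible example: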
library(caret)
data("GermanCredit")
Train <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)
training <- GermanCredit[ Train, ]
testing <- GermanCredit[ -Train, ]
mod_fit <- glm(Class ~ Age + ForeignWorker + Property.RealEstate +Housing.Own + CreditHistory.Critical, data=training, family=binomial(link = 'logit'))
When I run:
varImp(mod_fit)
it returns:
                        Overall
Age                    1.747346
ForeignWorker          1.612483
Property.RealEstate    2.715444
Housing.Own            2.066314
CreditHistory.Critical 3.944768
I would like to sort by the "Overall" column:
sort(varImp(mod_fit)$Overall)
but that returns only the values, in ascending order and without the variable names:
[1] 1.612483 1.747346 2.066314 2.715444 3.944768
Is there a way to sort the variable names and their importance values together, in descending order?
Thanks in advance.
Answer 0 (score: 1)
library(caret)
data("GermanCredit")
Train <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)
training <- GermanCredit[ Train, ]
testing <- GermanCredit[ -Train, ]
mod_fit <- glm(Class ~ Age + ForeignWorker + Property.RealEstate +Housing.Own + CreditHistory.Critical, data=training, family=binomial(link = 'logit'))
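# varImp() stores the variable names as row names of the "Overall" column
imp <- as.data.frame(varImp(mod_fit))
# Copy the row names into a regular column so they stay attached when sorting
imp <- data.frame(overall = imp$Overall,
                  names = rownames(imp))
# Sort rows from most to least important
imp[order(imp$overall, decreasing = TRUE), ]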
    overall                  names
  3.9234999 CreditHistory.Critical
  3.1402835            Housing.Own
  2.1955440                    Age
  1.3042088          ForeignWorker
  0.4878837    Property.RealEstate
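As a possible alternative, the rows can probably be reordered in place without building a second data frame, since order() returns row positions and the variable names travel with the rows. The sketch below is only an illustration: to run without caret, it uses a hand-built data frame as a stand-in for varImp(mod_fit), with the values copied from the question's output. The names imp and imp_sorted are illustrative.

```r
# Stand-in for varImp(mod_fit): values copied from the output shown in the question.
# (Assumption: the real object has the same shape, one "Overall" column with
# the variable names stored as row names.)
imp <- data.frame(
  Overall = c(1.747346, 1.612483, 2.715444, 2.066314, 3.944768),
  row.names = c("Age", "ForeignWorker", "Property.RealEstate",
                "Housing.Own", "CreditHistory.Critical")
)

# order() gives the row positions from largest to smallest importance.
# drop = FALSE keeps the one-column result a data frame, so row names survive.
imp_sorted <- imp[order(imp$Overall, decreasing = TRUE), , drop = FALSE]
print(imp_sorted)
```

Without drop = FALSE, subsetting a one-column data frame collapses it to a plain numeric vector and the names are lost, which is the same symptom as with sort() in the question. Note also that createDataPartition() samples at random, so importance values differ between runs unless set.seed() is called first.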