varImp adds unusual suffixes to variable names

Time: 2018-12-06 15:21:14

Tags: r

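When I run varImp on a random forest, it appends suffixes such as .Q, .L, .C, and ^4 to the variable names. Does anyone know what these refer to, or what I am doing wrong?

I am using the caret package on a dataset that contains both ordered and categorical variables.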

student <- read.csv("http://cdn-files.soa.org/web/student-success-data-file.csv")

str(student)

# Making certain fields ordered factors
ordered.cat.vars <- c("Medu", "Fedu", "traveltime", "studytime", "famrel", "freetime", "goout", "Dalc", "Walc", "health")
student[,ordered.cat.vars] <- lapply(student[,ordered.cat.vars], factor)
student[,ordered.cat.vars] <- lapply(student[,ordered.cat.vars], ordered)

# Removing certain fields and creating target variable, G3.passflag
library(dplyr)
student <- student %>% select(-one_of(c("absences","G1","G2"))) %>% mutate(G3.passflag = ifelse(G3 >= 10,"pass","fail")) %>% select(-one_of("G3"))

# Running random forest
library(caret)
trctrl <- trainControl(method = "cv", number = 5)
grid <- expand.grid(mtry = seq(1,15,1))

rf_1 <- train(
  form = G3.passflag ~ .,
  data = student,
  method = "rf",
  metric = "Accuracy",
  trControl = trctrl,
  tuneGrid = grid,
  importance = TRUE
)

varImp(rf_1)

varImp gives me the following result:

  only 20 most important variables shown (out of 66)

           Importance
failures       100.00
goout.L         74.31
Medu.L          71.76
Fedu.Q          71.04
Medu.Q          69.52
famsupyes       69.04
goout^4         66.85
Fedu.C          63.28
Fedu.L          61.21
Fedu^4          60.66
Medu.C          60.53
Walc^4          60.25
Walc.C          60.23
Walc.L          58.16
freetime.C      57.65
age             57.39
goout.C         56.50
health.Q        55.75
freetime.Q      55.36
Mjobother       54.63
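
A note on where names like these can come from, as a minimal sketch rather than a confirmed diagnosis of the run above: when a model is fitted through a formula, R expands factors with model.matrix(), and ordered factors use polynomial contrasts (contr.poly) by default. Those contrast columns are labelled .L (linear), .Q (quadratic), .C (cubic), and ^4, ^5, and so on for higher degrees. Unordered factors get treatment contrasts instead, which gives names such as famsupyes. The toy vectors below are hypothetical stand-ins for the goout and famsup columns and use only base R.

```r
# Hypothetical toy data standing in for two columns of the student data set
goout  <- factor(c(1, 2, 3, 4, 5, 3), levels = 1:5, ordered = TRUE)  # ordered, 5 levels
famsup <- factor(c("no", "yes", "yes", "no", "yes", "no"))           # unordered, 2 levels

# Default contrasts: first entry applies to unordered factors, second to ordered ones
getOption("contrasts")

# Expanding an ordered factor yields one column per polynomial degree
ordered_cols <- colnames(model.matrix(~ goout))
print(ordered_cols)

# Expanding an unordered factor yields one dummy column per non-reference level
unordered_cols <- colnames(model.matrix(~ famsup))
print(unordered_cols)
```

If that is what is happening here, each importance row refers to one contrast column rather than to the original variable, which would also explain why 66 variables are reported.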

Thanks for any help you can provide!

Thanks, Alex

0 answers:

No answers