Extracting variable names from a decision tree

Time: 2019-05-21 15:31:04

Tags: r

I built a decision tree in R with the tree package, and running summary() on that tree gives me:

Classification tree:
tree(formula = High temperature ~ ., data = summer.train)
Variables actually used in tree construction:
[1] "Humidity" "Cloudy"   "Airy"     "Dry"      "Windy"
Number of terminal nodes:  12
Residual mean deviance:  0.3874 = 377.7 / 975 
Misclassification error rate: 0.08909 = 89 / 999 

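I want to get the variables used in tree construction ("Airy", "Dry", and so on) out of the summary output above, as a character vector. Is there a way to do this?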

1 answer:

Answer 0 (score: 0)

This is covered by the linked question:

Used Variables in Tree

The solution given there works for me. I tested it on the well-known spam dataset:

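library(kernlab)  # provides the spam dataset
library(tree)

data(spam)

# Fit a classification tree predicting type from all other columns
spam_tree_def <- tree(type ~ ., data = spam)
summary(spam_tree_def)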

Summary output:

Classification tree:
tree(formula = type ~ ., data = spam)
Variables actually used in tree construction:
 [1] "charDollar"      "remove"          "charExclamation" "hp"              "capitalLong"     "our"            
 [7] "capitalAve"      "free"            "george"          "edu"            
Number of terminal nodes:  13 
Residual mean deviance:  0.4879 = 2238 / 4588 
Misclassification error rate: 0.08259 = 380 / 4601 

To extract the variable names, take the used component of the summary object:

as.character(summary(spam_tree_def)$used)

 [1] "charDollar"      "remove"          "charExclamation" "hp"              "capitalLong"     "our"            
 [7] "capitalAve"      "free"            "george"          "edu"
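
As an alternative to going through summary(), the same names can probably be read straight from the fitted object. This is a minimal sketch, not the method from the linked question. It assumes the tree object keeps one row per node in its frame component, with the split variable in the var column and "<leaf>" marking terminal nodes. The built-in iris data and the names fit, split_vars and used_vars are only illustrative.

```r
library(tree)

# Built-in iris data stands in for the asker's data (illustrative assumption)
fit <- tree(Species ~ ., data = iris)

# Each row of fit$frame is a node; var holds the split variable,
# or "<leaf>" for terminal nodes
split_vars <- as.character(fit$frame$var)

# Drop the leaves and keep each split variable once
used_vars <- unique(split_vars[split_vars != "<leaf>"])

print(used_vars)
```

Applied to the answer's model, the same two lines would use spam_tree_def$frame$var in place of fit$frame$var. Reading the frame directly avoids depending on what the summary object chooses to store.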