Decision tree in R: predicting a single class

Time: 2017-03-28 23:14:11

Tags: r dataset classification

I am working through a tutorial on decision trees in R using the iris dataset. Here is the code from my basic tutorial.

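library(rpart)
# install.packages('rpart.plot')  # run once if the package is not installed
library(rpart.plot)

# Draw 100 of the 150 row indices at random for the training set
s = sample(150,100)

iris_train = iris[s,]
iris_test = iris[-s,]

# Fit a classification tree using all four measurements as predictors
dtm = rpart(Species~.,iris_train, method="class")

rpart.plot(dtm, type=4, extra=101)

# Predict on the held-out rows and cross-tabulate actual (rows) vs. predicted (columns)
p = predict(dtm,iris_test,type="class")
table(iris_test[,5],p)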

The table gives me:

               setosa versicolor virginica
    setosa         12          0         0
    versicolor      0         18         0
    virginica       0          3        17

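What should I do if I am only interested in the predictions for virginica? Is it possible to merge the remaining values so that I get a binary classification of virginica vs. versicolor + setosa?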
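
As a rough sketch of what "merging the remaining values" means for the table above, the posted counts can be collapsed by hand into a two-class table. The matrix below is copied from that table; the label `other` and the use of base R's `rowsum` are illustrative choices, not part of the original code.

```r
# Confusion matrix copied from the table above (rows: actual, columns: predicted)
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(12,  0,  0,
                0, 18,  0,
                0,  3, 17),
             nrow = 3, byrow = TRUE,
             dimnames = list(classes, classes))

# Map each class to a two-level label ("other" is a hypothetical name)
group <- ifelse(classes == "virginica", "virginica", "other")

# Sum the rows within each group, then do the same for the columns
cm2 <- rowsum(cm, group)          # collapse the actual classes
cm2 <- t(rowsum(t(cm2), group))   # collapse the predicted classes
print(cm2)
```

With these counts, the 30 setosa and versicolor test rows all stay out of virginica, while 17 of the 20 virginica rows are predicted correctly and 3 are missed.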

1 Answer:

Answer 0 (score: 0)

You can do what you want as follows:

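library(rpart)
# install.packages('rpart.plot')  # run once if the package is not installed
library(rpart.plot)

s = sample(150,100)
# Row indices of the two classes to merge (renamed from `class`, which shadows a base function)
other_rows <- which(iris$Species %in% c("versicolor","setosa"))

####################################
# Start with every row labelled "virginica", then relabel the merged classes
new_species = rep("virginica",nrow(iris))

new_species[other_rows] <- "vers_seto"

# Store the label as a factor so the response is an explicit two-class variable
iris$new_species <- factor(new_species)
####################################
# -5 drops the old Species column (column 5) so it cannot be used as a predictor
iris_train = iris[s,-5]
iris_test = iris[-s,-5]

dtm = rpart(new_species~.,iris_train, method="class")

rpart.plot(dtm, type=4, extra=101)

p = predict(dtm,iris_test,type="class")
# Rows: actual class, columns: predicted class
table(iris_test$new_species,p)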
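
Once the two-class table exists, the virginica row and column are enough to summarise how well that one class is predicted. The following is a minimal sketch, assuming the same `vers_seto` relabelling as above; the seed value, the `iris2` copy and the variable names are illustrative assumptions, and the numbers depend on the random split.

```r
library(rpart)

set.seed(42)  # assumption: fixed seed so the split is reproducible; any value works
s <- sample(150, 100)

# Two-class label: virginica vs. everything else
iris2 <- iris
iris2$new_species <- factor(ifelse(iris2$Species == "virginica",
                                   "virginica", "vers_seto"))
iris2$Species <- NULL  # drop the original label so it is not used as a predictor

fit <- rpart(new_species ~ ., data = iris2[s, ], method = "class")
pred <- predict(fit, iris2[-s, ], type = "class")
actual <- iris2$new_species[-s]

# Rows: actual class, columns: predicted class
cm <- table(actual = actual, predicted = pred)
print(cm)

# Recall: share of true virginica rows that were predicted as virginica
# Precision: share of virginica predictions that were actually virginica
tp <- cm["virginica", "virginica"]
recall <- tp / sum(cm["virginica", ])
precision <- tp / sum(cm[, "virginica"])
cat("recall:", recall, " precision:", precision, "\n")
```

Indexing the table by name rather than by position keeps the calculation correct regardless of the order in which the factor levels are sorted.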