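I need help with R. I have to answer this question: (a) Create a supervised classifier based on decision trees. (b) Randomly split the data into training and test sets to determine the predictive quality of the classifier.
I wrote the code below, but I get the same result for every class. Can anyone help?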
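library(tree)  # was misspelled as "libery"
quality = as.numeric(winequality.red$quality)
range(quality)  # inspect the range of the quality scores
# The response must be a factor, otherwise tree() does not fit a classification tree
High = factor(ifelse(quality >= 5, "Yes", "No"))
table(High)  # if nearly every row is "Yes", the tree may predict a single class
winequality.red2 = data.frame(winequality.red, High)
winequality.red2$quality = NULL  # drop quality, since High is derived from it
# Divide the data into training and testing sets
set.seed(2)
train = sample(1:nrow(winequality.red2), nrow(winequality.red2)/2)  # half for training and half for testing
test = -train
training_data = winequality.red2[train, ]
testing_data = winequality.red2[test, ]
testing_Test = High[test]
# Model the label High, not the index vector `test`
tree_model = tree(High ~ ., training_data)
plot(tree_model)
text(tree_model, pretty = 0)
# type = "class" returns predicted labels instead of class probabilities
tree_Pred = predict(tree_model, testing_data, type = "class")
mean(tree_Pred != testing_Test)  # misclassification rate on the test set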
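A quick sanity check for part (b) is to compare the test accuracy with what you would get by always guessing the most common class. The sketch below uses only base R; `evaluate_predictions` is a hypothetical helper and the labels are made up for illustration, not taken from the wine data.

```r
# Hypothetical helper (not from the original post): compare a classifier's
# accuracy with the accuracy of always guessing the most common class.
evaluate_predictions <- function(actual, predicted) {
  actual <- factor(actual)
  # Force the same levels so that classes that are never predicted still
  # appear in the confusion matrix.
  predicted <- factor(predicted, levels = levels(actual))
  confusion <- table(actual = actual, predicted = predicted)
  accuracy <- mean(predicted == actual)
  baseline <- max(table(actual)) / length(actual)
  list(confusion = confusion, accuracy = accuracy, baseline = baseline)
}

# Made-up labels purely for illustration: 9 "Yes" and 1 "No".
actual <- c(rep("Yes", 9), "No")
predicted <- rep("Yes", 10)  # a model that predicts "Yes" for every row
result <- evaluate_predictions(actual, predicted)
print(result$confusion)
result$accuracy  # 0.9
result$baseline  # 0.9
```

If the accuracy is no better than the baseline, as in this toy case, the model is effectively predicting one class for everything, which usually points to a heavily unbalanced label.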
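Answer 0 (score: 0)
I find rpart better than tree. It does cross-validation internally, if that is what you mean. Be sure to use rpart.plot::prp to plot the trees nicely. The package is well documented.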
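For reference, a minimal sketch of the rpart route might look like the following. `fit_rpart_split` is a hypothetical helper, and the `demo` data frame is simulated stand-in data (the column names only mimic the wine data), so treat it as an illustration rather than the asker's actual setup.

```r
library(rpart)

# Hypothetical helper: split `data` at random, fit a classification tree with
# rpart, and report the accuracy on the held-out rows.
fit_rpart_split <- function(data, target = "High", train_fraction = 0.5, seed = 2) {
  set.seed(seed)
  data[[target]] <- factor(data[[target]])
  train_idx <- sample(nrow(data), floor(train_fraction * nrow(data)))
  train_set <- data[train_idx, ]
  test_set <- data[-train_idx, ]
  model_formula <- reformulate(".", response = target)  # e.g. High ~ .
  model <- rpart(model_formula, data = train_set, method = "class")
  pred <- predict(model, newdata = test_set, type = "class")
  list(model = model,
       predictions = pred,
       accuracy = mean(pred == test_set[[target]]))
}

# Simulated stand-in data (NOT the real wine data).
set.seed(1)
n <- 200
demo <- data.frame(alcohol = runif(n, 8, 15), sulphates = runif(n, 0.3, 2))
demo$High <- ifelse(demo$alcohol + rnorm(n) > 11, "Yes", "No")

fit <- fit_rpart_split(demo)
fit$accuracy         # share of held-out rows classified correctly
printcp(fit$model)   # the xerror column is the internal cross-validated error

# Plot only if rpart.plot is installed.
if (requireNamespace("rpart.plot", quietly = TRUE)) rpart.plot::prp(fit$model)
```

With the real data you would pass the data frame that holds the `High` column (and no `quality` column) instead of `demo`.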
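Answer 1 (score: 0)
But what I am doing has to be done with tree, and whenever I change the numbers or the variables I always get the same result. After that I have to compare the results with a random forest. The tasks my code has to cover are: 1) (a) Create a supervised classifier based on decision trees. (b) Randomly split the data into training and test sets to determine the predictive quality of the classifier.
I want to make sure whether I am doing it right...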
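For the random forest comparison mentioned above, a rough sketch with the randomForest package could look like this. `fit_forest_split` is a hypothetical helper and `demo` is simulated stand-in data; the point is only that using the same seed and split as for the tree makes the two test accuracies comparable.

```r
library(randomForest)

# Hypothetical helper: same random split idea as for the single tree, but
# fitting a random forest on the training half.
fit_forest_split <- function(data, target = "High", train_fraction = 0.5, seed = 2) {
  set.seed(seed)
  data[[target]] <- factor(data[[target]])
  train_idx <- sample(nrow(data), floor(train_fraction * nrow(data)))
  train_set <- data[train_idx, ]
  test_set <- data[-train_idx, ]
  model_formula <- reformulate(".", response = target)  # e.g. High ~ .
  forest <- randomForest(model_formula, data = train_set, ntree = 500)
  pred <- predict(forest, newdata = test_set)  # predicted class labels
  list(forest = forest,
       predictions = pred,
       confusion = table(actual = test_set[[target]], predicted = pred),
       accuracy = mean(pred == test_set[[target]]))
}

# Simulated stand-in data (NOT the real wine data).
set.seed(1)
n <- 200
demo <- data.frame(alcohol = runif(n, 8, 15), sulphates = runif(n, 0.3, 2))
demo$High <- ifelse(demo$alcohol + rnorm(n) > 11, "Yes", "No")

rf <- fit_forest_split(demo)
print(rf$confusion)
rf$accuracy  # compare this with the single tree's test accuracy
```

Looking at the confusion matrix as well as the accuracy shows whether either model ever predicts the rarer class.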