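I am trying to compute the residuals from a random forest cross-validation. The response variable in this dataset is "Sales". I want to feed those residuals into a support vector machine. I am using the Carseats dataset in R. Here is my code so far: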
set.seed(1)
library(ISLR)
data(Carseats)
head(Carseats)
Sales CompPrice Income Advertising Population Price ShelveLoc
1 9.50 138 73 11 276 120 Bad
2 11.22 111 48 16 260 83 Good
3 10.06 113 35 10 269 80 Medium
4 7.40 117 100 4 466 97 Medium
5 4.15 141 64 3 340 128 Bad
6 10.81 124 113 13 501 72 Bad
Age Education Urban US sales
1 42 17 Yes Yes Yes
2 65 10 Yes Yes Yes
3 59 12 Yes Yes Yes
4 55 14 Yes Yes Yes
5 38 13 Yes No Yes
6 78 16 No Yes Yes
##Random forest
#cross validation to pick best mtry from 3,5,10
library(randomForest)
cv.carseats = rfcv(trainx=Carseats[,-1],trainy=Carseats[,1],cv.fold=5,step=0.9)
cv.carseats
with(cv.carseats,plot(n.var,error.cv,type="o"))
#from the graph it would appear mtry=5 produces the lowest error
##SVM
library(e1071)
#cross validation to pick best gamma
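#the candidate values must be passed through ranges= so that tune() searches over them
tune.out=tune(svm,Sales~.,data=Carseats,ranges=list(gamma=c(0.01,0.1,1,10)),
tunecontrol = tune.control(cross=5))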
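I want to replace "Sales" in the SVM with the residuals from the random forest cross-validation. I am having trouble computing those residuals. Any help is greatly appreciated! Thanks!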
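For reference, here is a minimal sketch of one way the residuals might be obtained. It assumes the stock ISLR Carseats data and relies on the `predicted` component that `rfcv` returns, which holds the out-of-fold predictions for each number of predictors tried. The names `best`, `cv.pred`, `rf.resid` and `resid.data` are made up for illustration, and choosing the fit with the lowest `error.cv` is only one possible rule.

```r
library(ISLR)
library(randomForest)
library(e1071)

set.seed(1)
data(Carseats)

# 5-fold CV; rfcv keeps the out-of-fold predictions for every n.var in $predicted
cv.carseats <- rfcv(trainx = Carseats[, -1], trainy = Carseats[, 1],
                    cv.fold = 5, step = 0.9)

# position of the n.var with the lowest cross-validated error (illustrative choice)
best <- which.min(cv.carseats$error.cv)

# cross-validated predictions for that n.var, one per row of Carseats
cv.pred <- cv.carseats$predicted[[best]]

# residual = observed Sales minus cross-validated prediction
rf.resid <- Carseats$Sales - cv.pred

# swap Sales for the residuals and tune the SVM on the new response
resid.data <- Carseats[, -1]
resid.data$rf.resid <- rf.resid
tune.out <- tune(svm, rf.resid ~ ., data = resid.data,
                 ranges = list(gamma = c(0.01, 0.1, 1, 10)),
                 tunecontrol = tune.control(cross = 5))
summary(tune.out)
```

Because the predictions are out-of-fold, each residual comes from a forest that never saw that observation. If a single fitted forest is enough, the `predicted` component of a `randomForest` object gives out-of-bag predictions that could be used in the same way.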