Trying to ... in R

Time: 2016-11-13 23:26:43

Tags: r tree classification

As the title says, I have a txt file containing one y variable and 4 x variables:

  playtennis  outlook temperature humidity   wind
1         no    sunny         hot     high   weak
2         no    sunny         hot     high strong
3        yes overcast         hot     high   weak
4        yes     rain        mild     high   weak
5        yes     rain        cool   normal   weak
6         no     rain        cool   normal strong

The goal is to predict the y variable (playtennis) with a classification tree.

So I decided to split the data into a training set and a test set:

SamSize <- floor(0.25*nrow(input.dat))
train_ind <- sample(seq_len(nrow(input.dat)), size = SamSize)
train <- input.dat[train_ind, ]
test <- input.dat[-train_ind, ]

Then I used rpart to build the classification tree:

tree1 = rpart(playtennis ~ outlook  + temperature   + humidity  + wind, data = test, subset = train, method = "class",cp=0.001,xval=20)

But I got this error:

Error in `[.default`(xj, i) : invalid subscript type 'list'

I can't figure out what is going wrong.

Do I need to convert the data.frame to something else? I tried

as.matrix(train)
as.matrix(test)

but that did not solve the problem (I thought perhaps rpart could not recognize the input).
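A possible reading of the error, offered as a sketch rather than a confirmed diagnosis: the message mentions a subscript of type 'list', and a data frame is a list, so the likely culprit is `subset = train` rather than the type of `data`. The `subset` argument of `rpart` expects row indices or a logical vector. The example below rebuilds the 14 rows from the dput() output by hand (with the stray capitalised values in the last row lowercased, which is an assumption), uses a fixed seed and a training size of 10 instead of 25% (both assumptions for illustration), and shows two ways of fitting on the training rows only.

```r
library(rpart)

# Data retyped from the dput() output in the question. Assumption: the
# capitalised values in row 14 ("No", "Rain", "Mild", "High") are typos,
# so they are lowercased here to avoid spurious factor levels.
input.dat <- data.frame(
  playtennis  = factor(c("no", "no", "yes", "yes", "yes", "no", "yes",
                         "no", "yes", "yes", "yes", "yes", "yes", "no")),
  outlook     = factor(c("sunny", "sunny", "overcast", "rain", "rain", "rain",
                         "overcast", "sunny", "sunny", "rain", "sunny",
                         "overcast", "overcast", "rain")),
  temperature = factor(c("hot", "hot", "hot", "mild", "cool", "cool", "mild",
                         "mild", "cool", "mild", "mild", "hot", "cool", "mild")),
  humidity    = factor(c("high", "high", "high", "high", "normal", "normal",
                         "high", "high", "normal", "normal", "normal",
                         "normal", "normal", "high")),
  wind        = factor(c("weak", "strong", "weak", "weak", "weak", "strong",
                         "strong", "weak", "weak", "weak", "strong", "weak",
                         "strong", "strong"))
)

set.seed(1)  # assumption: fixed seed so the split is repeatable
train_ind <- sample(seq_len(nrow(input.dat)), size = 10)
train <- input.dat[train_ind, ]
test  <- input.dat[-train_ind, ]

# The call from the question passes a data frame as `subset`; this should
# reproduce the error, which is captured here instead of stopping the script.
err <- tryCatch(
  rpart(playtennis ~ outlook + temperature + humidity + wind,
        data = test, subset = train, method = "class"),
  error = function(e) conditionMessage(e)
)

# Option 1: fit directly on the training data frame.
tree_a <- rpart(playtennis ~ outlook + temperature + humidity + wind,
                data = train, method = "class")

# Option 2: keep the full data and pass the row indices to `subset`.
tree_b <- rpart(playtennis ~ outlook + temperature + humidity + wind,
                data = input.dat, subset = train_ind, method = "class")

# Predict the held-out rows with the fitted tree.
pred <- predict(tree_a, newdata = test, type = "class")
```

With this few rows, rpart's default `minsplit = 20` leaves only a root node, so every prediction is the majority class. Lowering `minsplit` through `rpart.control` is a separate tuning choice.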

Thanks for any suggestions!

Edit: here is the dput() output, in case it helps with this problem.

structure(list(playtennis = structure(c(1L, 1L, 3L, 3L, 3L, 1L, 
3L, 1L, 3L, 3L, 3L, 3L, 3L, 2L), .Label = c("no", "No", "yes"
), class = "factor"), outlook = structure(c(4L, 4L, 1L, 2L, 2L, 
2L, 1L, 4L, 4L, 2L, 4L, 1L, 1L, 3L), .Label = c("overcast", "rain", 
"Rain", "sunny"), class = "factor"), temperature = structure(c(2L, 
2L, 2L, 3L, 1L, 1L, 3L, 3L, 1L, 3L, 3L, 2L, 1L, 4L), .Label = c("cool", 
"hot", "mild", "Mild"), class = "factor"), humidity = structure(c(1L, 
1L, 1L, 1L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 2L), .Label = c("high", 
"High", "normal"), class = "factor"), wind = structure(c(2L, 
1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L), .Label = c("strong", 
"weak"), class = "factor")

0 answers:

No answers yet