Decision tree error in trafo and mismatched factor levels

Time: 2018-09-20 08:30:51

Tags: r decision-tree

I have this code:

mydata <- read.csv("/home/file.csv", stringsAsFactors = FALSE)

sapply(mydata, class)

That returns:

       chr        start         stop       strand   num_probes segment_mean     is_nocnv 
"character"    "integer"    "integer"  "character"    "integer"    "numeric"  "character"

I create my decision tree:

library(party)  # assumed: ctree() here is party::ctree, judging by the trafo() error below
set.seed(1234)
ind <- sample(2,nrow(mydata),replace=TRUE, prob= c(0.7,0.3))
trainData <- mydata[ind==1,]
testData <- mydata[ind==2,]


myFormula <- is_nocnv ~ chr + start + stop + strand + num_probes + segment_mean
albero <- ctree(myFormula, data=trainData)

table(predict(albero),trainData$is_nocnv)

I get my first error:

Error in trafo(data = data, numeric_trafo = numeric_trafo, factor_trafo = factor_trafo,  : 
  data class “character” is not supported
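
Going by the message, the likely cause is that chr, strand and is_nocnv are plain character columns (the file was read with stringsAsFactors=F), and ctree expects categorical variables to be factors. A minimal sketch of the conversion follows. The data frame is a hypothetical stand-in with made-up values, since the real file.csv is not shown; reading the file with stringsAsFactors = TRUE should have a similar effect.

```r
# Toy stand-in for mydata (hypothetical values, same column names and types).
mydata <- data.frame(
  chr          = c("chr1", "chr19", "chr1", "chr2"),
  start        = c(100L, 284018L, 5000L, 42L),
  stop         = c(900L, 58878226L, 9000L, 77L),
  strand       = c("*", "*", "*", "*"),
  num_probes   = c(10L, 23929L, 7L, 3L),
  segment_mean = c(0.12, -0.0142, -0.5, 0.3),
  is_nocnv     = c("yes", "no", "no", "yes"),
  stringsAsFactors = FALSE
)

# Find the character columns and turn each one into a factor;
# integer and numeric columns are left untouched.
is_chr <- vapply(mydata, is.character, logical(1))
mydata[is_chr] <- lapply(mydata[is_chr], factor)

# Re-check the classes before fitting the tree.
classes <- sapply(mydata, class)
print(classes)
```

With the columns converted before the train/test split, both subsets inherit the same factor levels, which also matters for the prediction step.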

Then I run this code:

chr1 <- as.character("chr19")
start1 <- as.integer(284018)
stop1 <- as.integer(58878226)
strand1 <- as.character("*")
num_probes1 <- as.integer(23929)
segment_mean1 <- as.numeric(-0.0142)
testData <- data.frame(chr=chr1,start=start1,stop=stop1,strand=strand1,num_probes=num_probes1,segment_mean=segment_mean1,is_nocnv=as.character(""))

testPred <- predict(albero,newdata= testData)
table(testPred,testData$is_nocnv)

Here I get the second error:

Error in checkData(oldData, RET) : 
Levels in factors of new data do not match original data

Error in table(testPred, testData$is_nocnv) : 
  all arguments must have the same length
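
The checkData message suggests that the factors in the new data frame do not carry the same levels as the training data: a one-row data.frame built from "chr19" has at most the single level "chr19", while the training column has one level per chromosome. The sketch below shows one way to build a new observation whose factors reuse the training levels. The training data is hypothetical, and it assumes the character columns were already converted to factors. The table() error looks like a follow-on: testPred was presumably left over from an earlier run with a different length, because the failed predict() call did not overwrite it.

```r
# Hypothetical training data whose categorical columns are already factors.
trainData <- data.frame(
  chr          = factor(c("chr1", "chr19", "chr2")),
  start        = c(100L, 284018L, 42L),
  stop         = c(900L, 58878226L, 77L),
  strand       = factor(c("*", "*", "*")),
  num_probes   = c(10L, 23929L, 3L),
  segment_mean = c(0.12, -0.0142, 0.3)
)

# New observation: each factor is declared with the training levels,
# so the level sets match even though only one value is present.
newData <- data.frame(
  chr          = factor("chr19", levels = levels(trainData$chr)),
  start        = 284018L,
  stop         = 58878226L,
  strand       = factor("*", levels = levels(trainData$strand)),
  num_probes   = 23929L,
  segment_mean = -0.0142
)

# Sanity check: the level sets should be identical.
same_levels <- identical(levels(newData$chr), levels(trainData$chr)) &&
  identical(levels(newData$strand), levels(trainData$strand))
print(same_levels)
```

An empty-string is_nocnv is not one of the training levels either, so if that column is kept in the new data it would presumably need the same treatment. Tabulating a single prediction against an unknown label is not informative in any case.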

0 answers:

No answers yet.