I have this code:
mydata= read.csv("/home/file.csv",stringsAsFactors=F)
sapply(mydata, class)
which returns:
chr start stop strand num_probes segment_mean is_nocnv
"character" "integer" "integer" "character" "integer" "numeric" "character"
Then I build my decision tree:
library(party)  # assumption: ctree() comes from the party package, judging by the error messages
set.seed(1234)
ind <- sample(2,nrow(mydata),replace=TRUE, prob= c(0.7,0.3))
trainData <- mydata[ind==1,]
testData <- mydata[ind==2,]
myFormula <- is_nocnv ~ chr + start + stop + strand + num_probes + segment_mean
albero <- ctree(myFormula, data=trainData)
table(predict(albero),trainData$is_nocnv)
This gives me my first error:
Error in trafo(data = data, numeric_trafo = numeric_trafo, factor_trafo = factor_trafo, :
data class “character” is not supported
Then I run this code:
chr1 <- as.character("chr19")
start1 <- as.integer(284018)
stop1 <- as.integer(58878226)
strand1 <- as.character("*")
num_probes1 <- as.integer(23929)
segment_mean1 <- as.numeric(-0.0142)
testData <- data.frame(chr=chr1, start=start1, stop=stop1, strand=strand1,
                       num_probes=num_probes1, segment_mean=segment_mean1,
                       is_nocnv=as.character(""))
testPred <- predict(albero,newdata= testData)
table(testPred,testData$is_nocnv)
Here I get a second error:
Error in checkData(oldData, RET) :
Levels in factors of new data do not match original data
and
Error in table(testPred, testData$is_nocnv) :
all arguments must have the same length