I have this dataset:
head(mydata)
chr start stop strand num_probes segment_mean is_nocnv
chr18 52502759 52502887 * 2 -2.3870 YES
chr18 52508963 68598272 * 9546 -0.3843 YES
chrX 17018571 63154896 * 18479 -0.0448 YES
chrX 63161754 63812965 * 265 -0.5375 YES
chrX 63816350 66632343 * 1071 0.1047 YES
chrX 66632547 67941468 * 558 -0.5452 YES
sapply(mydata, class)
chr start stop strand num_probes segment_mean is_nocnv
"factor" "integer" "integer" "factor" "integer" "numeric" "factor"
My decision tree is meant to predict whether is_nocnv is YES or NO for a new sample.
I built the prediction tree and then passed it this new example:
chr1 <- "chr18"
start1 <- as.integer(52502759)
stop1 <- as.integer(52502887)
strand1 <- "*"
num_probes1 <- as.integer(2)
segment_mean1 <- as.numeric(-2.387)
testData <- data.frame(chr = chr1, start = start1, stop = stop1, strand = strand1,
num_probes = num_probes1, segment_mean = segment_mean1, is_nocnv = "")
testPred <- predict(albero, newdata = testData)
table(testPred, testData$is_nocnv)
I get this error:
> testPred <- predict(albero, newdata = testData)
Error in checkData(oldData, RET) :
Levels in factors of new data do not match original data
> table(testPred,testData$is_nocnv)
Error in table(testPred, testData$is_nocnv) :
all arguments must have the same length
I then changed my code to this line:
testData <- data.frame(chr=chr1,start=start1,stop=stop1,strand=strand1,num_probes=num_probes1,segment_mean=segment_mean1,is_nocnv=levels=2)
This does not give the expected result either; it returns the following error:
Errore: unexpected '=' in "testData <- data.frame(chr=chr1,start=start1,stop=stop1,strand=strand1,num_probes=num_probes1,segment_mean=segment_mean1,is_nocnv=levels="
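For reference, here is a minimal sketch of one way the new sample could be built so that its factor columns carry the same levels as the training data, which is what the "Levels in factors of new data do not match original data" message is about. The small mydata below is a hypothetical stand-in for the real dataset (including one invented NO row so that is_nocnv has two levels), and the sketch assumes albero was fitted on mydata; it does not fit or call the tree itself.

```r
# Hypothetical stand-in for the training data; the real mydata has many more rows.
mydata <- data.frame(
  chr          = factor(c("chr18", "chr18", "chrX", "chrX")),
  start        = c(52502759L, 52508963L, 17018571L, 63161754L),
  stop         = c(52502887L, 68598272L, 63154896L, 63812965L),
  strand       = factor(c("*", "*", "*", "*")),
  num_probes   = c(2L, 9546L, 18479L, 265L),
  segment_mean = c(-2.3870, -0.3843, -0.0448, -0.5375),
  # The NO value is invented for illustration only.
  is_nocnv     = factor(c("YES", "YES", "NO", "YES"), levels = c("NO", "YES"))
)

# Build the new sample so that every factor reuses the training levels.
testData <- data.frame(
  chr          = factor("chr18", levels = levels(mydata$chr)),
  start        = 52502759L,
  stop         = 52502887L,
  strand       = factor("*", levels = levels(mydata$strand)),
  num_probes   = 2L,
  segment_mean = -2.387,
  # The response is unknown, so use NA while keeping the training levels.
  is_nocnv     = factor(NA, levels = levels(mydata$is_nocnv))
)

# Verify that each factor column in testData matches the training levels.
factor_cols <- names(mydata)[sapply(mydata, is.factor)]
levels_match <- sapply(factor_cols, function(col) {
  identical(levels(testData[[col]]), levels(mydata[[col]]))
})
print(levels_match)
```

With a testData built this way, predict(albero, newdata = testData) should no longer fail on the level check, provided albero really was trained on mydata. Note that is_nocnv=levels=2 is not valid R syntax, which explains the unexpected '=' message, and that table(testPred, testData$is_nocnv) is only informative when the true labels of the new samples are known.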