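I created a NaiveBayes model that I believe effectively does not exist (it could not be fitted properly), so it cannot be used for prediction. I would like to know why this happens and how I should handle it at prediction time (how can I identify such a model?).
This is the model I get after running NaiveBayes: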
> NB_TRAIN_model[[24,1]]
Naive Bayes Classifier for Discrete Predictors
Call:
naiveBayes.default(x = X, y = Y, laplace = laplace)
A-priori probabilities:
Y
F I M
1 0 0
Conditional probabilities:
V2
Y [,1] [,2]
F 0.545 NA
I NA NA
M NA NA
V3
Y [,1] [,2]
F 0.445 NA
I NA NA
M NA NA
V4
Y [,1] [,2]
F 0.15 NA
I NA NA
M NA NA
V5
Y [,1] [,2]
F 0.8 NA
I NA NA
M NA NA
V6
Y [,1] [,2]
F 0.3535 NA
I NA NA
M NA NA
V7
Y [,1] [,2]
F 0.163 NA
I NA NA
M NA NA
V8
Y [,1] [,2]
F 0.207 NA
I NA NA
M NA NA
V9
Y [,1] [,2]
F 9 NA
I NA NA
M NA NA
When I try to predict on new data, I get the following:
Error in a[i, j] <- predict(NB_TRAIN_model[[transnames[[i, j]], j]], test[i, :
replacement has length zero
How can I check whether a model exists (is usable) before predicting with it?
This is how I got the model:
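To make the question concrete, here is a minimal sketch of the kind of check I have in mind. It only inspects the `tables` component that an e1071 `naiveBayes` object carries and treats a model as usable when at least one class has non-NA estimates for every predictor. The helper name `nb_is_usable` and that criterion are my own assumptions, not part of e1071, and the two mock objects are hand-built lists that only mimic the printed structure above.

```r
# Hypothetical helper (not part of e1071): TRUE when at least one class
# has complete, non-NA estimates in every predictor table of the model.
nb_is_usable <- function(model) {
  # Covers NULL cells of the model matrix and objects without tables
  if (is.null(model) || length(model$tables) == 0) return(FALSE)
  ok <- rep(TRUE, nrow(model$tables[[1]]))
  for (tab in model$tables) {
    # A class (row) stays OK only if its row has no NA in this table
    ok <- ok & (rowSums(is.na(tab)) == 0)
  }
  any(ok)
}

# Mock objects imitating the structure of a fitted model (made-up values)
classes <- c("F", "I", "M")
bad_model <- list(tables = list(
  V2 = matrix(c(0.545, NA, NA, NA, NA, NA), nrow = 3,
              dimnames = list(classes, NULL))
))
good_model <- list(tables = list(
  V2 = matrix(c(0.5, 0.4, NA, 0.1, 0.2, NA), nrow = 3,
              dimnames = list(classes, NULL))
))

print(nb_is_usable(bad_model))   # class F has a mean but no sd
print(nb_is_usable(good_model))  # classes F and I are complete
print(nb_is_usable(NULL))        # empty cell in the model matrix
```

If something along these lines is reasonable, the prediction loop could call it first and store `NA` for rows whose model fails the check, but I am not sure this is the right criterion.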
require(party)
require(data.table)
require(e1071)
require(randomForest)
dat1 <- fread('https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data', stringsAsFactors = TRUE)
## split data to train and test
set.seed(123)
dat1 <- subset(dat1, !is.na(V1))
smp_size<-100
train_ind <- sample(seq_len(nrow(dat1)), size = smp_size)
train <- dat1[train_ind, ]
test <- dat1[-train_ind, ]
#dat1$V1<-as.factor(dat1$V1)
rf <- randomForest(V1 ~ ., data = train, ntree = 10, keep.inbag = TRUE)
rf_train<-predict(rf,train[,V2:V9], nodes=TRUE)
train_nodes<-attr(rf_train,"nodes")
rf_test<-predict(rf,test[,V2:V9], nodes=TRUE)
test_nodes<-attr(rf_test,"nodes")
### each b[i] holds the rows for each terminal node for tree no. i
b<-list()
max_no_terminalnodes<- length (split(train, train_nodes[,1]) )
for (i in 1:(rf$ntree))
{
b[[i]]<-split(train, train_nodes[,i])
if ( max_no_terminalnodes < length (split(train, train_nodes[,i]) ) ) {
max_no_terminalnodes <- length (split(train, train_nodes[,i]) ) }
}
### Holds the naiveBayes models per terminal node per tree
#NB_TRAIN_model<-list()
NB_TRAIN_model<-matrix(list(), nrow=max_no_terminalnodes, ncol=rf$ntree)
NAMETRANS<-matrix(list(), nrow=max_no_terminalnodes, ncol=rf$ntree)
NAMETRANS[] <- 0L
k=0
for (i in 1:(rf$ntree))
{
for (j in as.numeric (names( b[[i]]))){
k=k+1
NB_TRAIN_model[[k,i]]<- naiveBayes(as.factor(V1) ~ ., data = b[[i]][[k]] )
NAMETRANS[k,i]<-j
}
k=0
}
### Find for each row and tree what NaiveBayes model to run (based on the terminal node NB model per tree)
# one row per test observation, since the loops below index test_nodes[i, j] by test row
transnames<-matrix(list(), nrow=nrow(test), ncol=rf$ntree)
for(j in 1:ncol(transnames)){
for(i in 1:nrow(transnames)){
tryCatch(transnames[i,j] <- which(NAMETRANS[,j]==test_nodes[i,j]),
error = function(e) return(NA)) }
}
a<-matrix(list(), nrow=nrow(test), ncol=rf$ntree)
for(j in 1:ncol(transnames)){
for(i in 1:nrow(transnames)){
a[i,j]<- predict (NB_TRAIN_model[[transnames[[i, j]],j]],test[i,V2:V9],type = "class")
}
}
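A related point about the lookup step above: `which()` returns a zero-length result when a test row falls into a terminal node that no training row reached, so the `tryCatch` leaves that cell empty. The sketch below shows one possible alternative using `match()`, which returns `NA` for a missing node. The node ids and the `paste0()` stand-in for `predict()` are made up purely for illustration.

```r
# Made-up terminal-node ids for illustration only
trained_node_ids <- c(3, 7, 12)  # nodes that received a model in training
test_node_ids <- c(7, 5, 12)     # node 5 was never seen in training

# which() gives a zero-length result for an unseen node ...
length(which(trained_node_ids == 5))

# ... whereas match() gives NA, which is easy to test for
idx <- match(test_node_ids, trained_node_ids)

# Pre-fill with NA and only fill in rows that have a model.
# paste0() is a stand-in for the real predict() call.
predictions <- rep(NA_character_, length(idx))
has_model <- !is.na(idx)
predictions[has_model] <- paste0("model_", idx[has_model])
print(predictions)
```

With this shape, rows without a matching model end up as `NA` in the result instead of stopping the loop with an error.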