Question

I am trying to build a randomForest model whose features include both numeric and factor fields.

The field model$forest$xlevels contains the levels of every factor field used, but it also contains the numeric fields, each with the value 0.
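To make the observation concrete, here is a minimal base-R sketch of how a per-column levels list with a 0 placeholder for numeric columns could come about. The helper `levels_or_zero` and the toy data frame `train` are hypothetical; they only mimic the shape described above and are not taken from the randomForest source.

```r
# Hypothetical helper (an assumption, not the package's code):
# factor columns yield their levels, every other column yields the placeholder 0.
levels_or_zero <- function(col) if (is.factor(col)) levels(col) else 0

# Toy training data with one numeric and one factor feature.
train <- data.frame(
  age   = c(23, 35, 41, 52),
  color = factor(c("red", "blue", "red", "green"))
)

# One entry per column, named after the column.
xlevels <- lapply(train, levels_or_zero)
str(xlevels)
```

One possible reading is that such a list keeps one entry per predictor so that positions line up with the columns, with 0 marking "not a factor". That is an assumption; the question itself does not confirm it.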

Why are numeric fields included in xlevels when "levels" have no meaning for them?

In addition, the following code is from predict.randomForest():

 # Only unordered factors are treated as categorical
 isFactor <- function(x) is.factor(x) & ! is.ordered(x)
 xfactor <- which(sapply(x, isFactor))
 if (length(xfactor) > 0 && "xlevels" %in% names(object$forest)) {
   for (i in xfactor) {
     # i is a column position in the new data x
     if (any(! levels(x[[i]]) %in% object$forest$xlevels[[i]]))
       stop("New factor levels not present in the training data")
   }
 }

predict.randomForest() contains this code to check the factor levels seen when the model was built (the training data) against those in the test data.

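This code does not take into account that numeric fields are also present in forest$xlevels, so it fails, because it matches the entries index to index.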
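As an illustration of the alternative, the check could be sketched with matching by column name instead of position, skipping xlevels entries that belong to non-factor columns. The helper `check_new_levels` and the toy `xlevels` list are hypothetical; this is a base-R sketch under those assumptions, not the package's implementation.

```r
# Hypothetical helper: compare factor levels by column NAME, not position,
# and only look at unordered factor columns of the new data.
check_new_levels <- function(newdata, xlevels) {
  is_unordered_factor <- function(col) is.factor(col) && !is.ordered(col)
  factor_cols <- names(newdata)[vapply(newdata, is_unordered_factor, logical(1))]
  for (nm in intersect(factor_cols, names(xlevels))) {
    unseen <- setdiff(levels(newdata[[nm]]), xlevels[[nm]])
    if (length(unseen) > 0) {
      stop("New factor levels in '", nm, "': ", paste(unseen, collapse = ", "))
    }
  }
  invisible(TRUE)
}

# Toy xlevels in the shape described above: 0 for the numeric column.
xlevels <- list(age = 0, color = c("blue", "green", "red"))

# Columns in a different order than in xlevels; all levels are known.
ok <- data.frame(color = factor(c("red", "blue")), age = c(30, 44))
check_new_levels(ok, xlevels)

# A level never seen in training triggers the error.
bad <- data.frame(color = factor(c("red", "purple")), age = c(30, 44))
res <- tryCatch(check_new_levels(bad, xlevels),
                error = function(e) conditionMessage(e))
print(res)
```

Looking entries up by name means the 0 placeholders for numeric columns are never compared against factor levels, regardless of column order in the new data.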

Understanding forest$xlevels in a randomForest model

0 answers: