Question

I am using randomForest to analyze a training set with 600 rows and 21 variables.

# Construct Random Forest Model
rfmodel <- randomForest(default ~ .,
                    data = train.df,
                    ntree = 500,
                    mtry = 4,
                    importance = TRUE,
                    LocalImp = TRUE,
                    replace = FALSE)
print(rfmodel)

This produces the following output:

> rfmodel <- randomForest(default ~ .,
+ data = train.df,
+ ntree = 500,
+ mtry = 4,
+ importance = TRUE,
+ LocalImp = TRUE,
+ replace = FALSE)

> Warning message:
> In randomForest.default(m, y, ...) :
> The response has five or fewer unique values. Are you sure you want to do 
> regression?

> print(rfmodel)

Call:
 randomForest(formula = default ~ ., data = train.df, ntree = 500, mtry = 4, importance = TRUE, LocalImp = TRUE, replace = FALSE)
           Type of random forest: regression
                 Number of trees: 500
No. of variables tried at each split: 4

      Mean of squared residuals: 0.1577596
                % Var explained: 23.89

For some reason, the confusion matrix is missing. When I try to get err.rate, it gives me this:

> head(rfmodel$err.rate)
NULL

Answer 1

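I think you want to do classification, but `default` is being treated as a numeric variable, so randomForest fits a regression forest (hence the warning, and no confusion matrix or err.rate). Try class(train.df$default). If it really is a numeric variable, you need to convert it to a factor before running the RF.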
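
As a rough sketch of that fix: the data frame below is a made-up stand-in for train.df (the column names x1 to x4 and the simulated values are assumptions, not the asker's data), and mtry is lowered to 2 only because the stand-in has four predictors. The point is the as.factor() conversion before the call.

```r
library(randomForest)

# Hypothetical stand-in for train.df: 600 rows with a 0/1 response
# stored as a number, which is what triggers the regression forest.
set.seed(42)
train.df <- data.frame(
  default = rbinom(600, 1, 0.3),
  x1 = rnorm(600),
  x2 = rnorm(600),
  x3 = runif(600),
  x4 = rnorm(600)
)

# Check how the response is stored; anything other than "factor"
# makes randomForest() fit a regression.
class(train.df$default)

# Convert the response to a factor so a classification forest is grown.
train.df$default <- as.factor(train.df$default)

# Note: the documented argument name is localImp (lowercase l);
# R argument names are case-sensitive.
rfmodel <- randomForest(default ~ .,
                        data = train.df,
                        ntree = 500,
                        mtry = 2,
                        importance = TRUE,
                        localImp = TRUE,
                        replace = FALSE)

print(rfmodel)            # should now report a classification forest
head(rfmodel$err.rate)    # OOB and per-class error rates, one row per tree
rfmodel$confusion         # confusion matrix with class.error column
```

If the response is stored as character labels rather than numbers, the same as.factor() call should apply.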
