I am using randomForest to analyze a training set with 600 rows and 21 variables.
# Construct Random Forest Model
rfmodel <- randomForest(default ~ .,
data = train.df,
ntree = 500,
mtry = 4,
importance = TRUE,
LocalImp = TRUE,
replace = FALSE)
print(rfmodel)
This produces the following output:
> rfmodel <- randomForest(default ~ .,
+ data = train.df,
+ ntree = 500,
+ mtry = 4,
+ importance = TRUE,
+ LocalImp = TRUE,
+ replace = FALSE)
Warning message:
In randomForest.default(m, y, ...) :
  The response has five or fewer unique values.  Are you sure you want to do regression?
> print(rfmodel)
Call:
randomForest(formula = default ~ ., data = train.df, ntree = 500, mtry = 4, importance = TRUE, LocalImp = TRUE, replace = FALSE)
Type of random forest: regression
Number of trees: 500
No. of variables tried at each split: 4
Mean of squared residuals: 0.1577596
% Var explained: 23.89
For some reason, the confusion matrix is missing. When I try to get err.rate, this is what I see:
> head(rfmodel$err.rate)
NULL
Answer 0 (score: 0)
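I think you want to do classification, but default is being treated as a numeric variable, so randomForest fits a regression forest instead. The warning and the "Type of random forest: regression" line in your output both point to this, and the confusion matrix and err.rate are only produced for classification forests. Try class(train.df$default). If it really is a numeric variable, you need to convert it to a factor before running RF.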
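As a rough sketch of that fix: the data frame below is a small made-up stand-in for your train.df (the x1 and x2 columns and the 0/1 coding of default are assumptions, since I can't see your data), and mtry is lowered to 2 only because the toy data has two predictors. The conversion step is the part that matters.

```r
# Toy stand-in for train.df (hypothetical data; the real set has 600 rows
# and 21 variables). The response is coded as numeric 0/1 on purpose.
set.seed(1)
train.df <- data.frame(
  default = rbinom(100, 1, 0.3),
  x1 = rnorm(100),
  x2 = rnorm(100)
)

# A numeric response makes randomForest fit a regression forest
class(train.df$default)

# Convert the response to a factor so a classification forest is fitted
train.df$default <- as.factor(train.df$default)
class(train.df$default)

# Refit only if the randomForest package is installed
if (requireNamespace("randomForest", quietly = TRUE)) {
  rfmodel <- randomForest::randomForest(default ~ .,
                                        data = train.df,
                                        ntree = 500,
                                        mtry = 2,
                                        importance = TRUE,
                                        replace = FALSE)
  print(rfmodel)            # should now include a confusion matrix
  head(rfmodel$err.rate)    # OOB and per-class error rates instead of NULL
}
```

With your real data, the only change needed should be the as.factor() line before your original randomForest() call.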