Question

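I have a dataset with 8 binary (yes/no) predictors and a numeric outcome. I want to find out which combination of predictors best predicts the outcome, but R's randomForest does not seem to like binary predictors: I get a negative percentage of variance explained, and an error when I try to rank the predictors with importance().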

My code:

library(randomForest)
#binary predictors
print_size <- c(0,0,0,0,0,1,0) 
mid_ridge <- c(1,1,0,0,1,0,0)
classification <- c(1,1,1,1,1,1,0)
ridge_thickness <- c(1,1,1,1,1,1,1)
delta_center_distance <- c(1,0,1,1,1,1,1)
double_loop_size <- c(0,0,0,0,0,0,1)
whorl_length <- c(0,0,0,0,0,0,1)
loop_angle <- c(0,0,0,1,0,0,1)
#numeric result
LR <- c(44,42,34,20,19,11,9)
pred <- cbind(print_size, mid_ridge, classification, ridge_thickness,
              delta_center_distance, double_loop_size, 
              whorl_length, loop_angle, LR)
output.forest <- randomForest(LR ~ ., ntree=1000,data = pred, importance=TRUE)
print(importance(output.forest,type = 1))

Result:

Mean of squared residuals: 210.327
% Var explained: -18.57
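
The negative figure can be reproduced by hand. This is a minimal sketch, assuming the percentage is computed as 100 * (1 - MSE / variance of LR), with the variance taken with divisor n. That formula is an assumption, though it matches the numbers printed above. The names mse_oob and var_n are illustrative only.

```r
# Outcome values copied from the question
LR <- c(44, 42, 34, 20, 19, 11, 9)

# Mean of squared residuals, copied from the printed output above
mse_oob <- 210.327

# Variance of LR with divisor n (base R's var() divides by n - 1 instead)
var_n <- mean((LR - mean(LR))^2)

# Percentage of variance explained under the assumed formula;
# it is negative whenever mse_oob is larger than var_n
pct_var_explained <- 100 * (1 - mse_oob / var_n)
print(round(pct_var_explained, 2))  # -18.57
```

Under that assumption, a negative value only means the forest's squared error is larger than the error from predicting the mean of LR for every row, which is plausible with just 7 observations.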

Error:

Error in UseMethod("importance") : no applicable method for 'importance' applied to an object of class "c('standardGeneric', 'genericFunction', 'function', 'OptionalFunction', 'PossibleMethod', 'optionalMethod')"

randomForest regression with binary predictors

0 answers: