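library(randomForest)
rfModel2 = randomForest(formula = Purchased ~ ., data = Network, ntree = 50, importance = TRUE, replace = TRUE)
Error in eval(predvars, data, env) : object 'User ID' not found
User ID is unique and not needed. How do I make the function ignore it and work with the 1/0 Purchased column?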
Answer 0 (score: 0)
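Is Purchased a numeric value or a factor?
Assuming it is a factor (for instance 0: not purchased, 1: purchased), you would use the following code for classification: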
df$Purchased<-as.factor(as.character(df$Purchased))
df$Gender<-as.factor(as.character(df$Gender))
rfModel2<-randomForest(Purchased~.,data=df[,-1])
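Here df[,-1] drops the first column, so User ID (assumed to be the first column) is not included in the computation.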
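Dropping the column by position only works if User ID really is column 1. A minimal sketch of dropping it by name instead, using only base R; the toy data frame and its Gender and Age columns are hypothetical stand-ins for the real data, and `predictors` is just an illustrative name:

```r
# Hypothetical toy data; check.names = FALSE keeps the space in "User ID"
df <- data.frame(
  `User ID`   = c(101, 102, 103, 104),
  Gender      = c("Male", "Female", "Female", "Male"),
  Age         = c(19, 35, 26, 27),
  Purchased   = c(0, 0, 1, 1),
  check.names = FALSE
)

# Drop the identifier by name, wherever it sits in the data frame
predictors <- df[, setdiff(names(df), "User ID"), drop = FALSE]

# Make any remaining column names syntactically valid for use in a formula
names(predictors) <- make.names(names(predictors))

# Convert the response and the categorical predictor to factors
predictors$Purchased <- as.factor(predictors$Purchased)
predictors$Gender    <- as.factor(predictors$Gender)

str(predictors)

# With the randomForest package loaded, the fit would then be:
# rfModel2 <- randomForest(Purchased ~ ., data = predictors)
```

The make.names() step is a precaution: column names containing spaces are a likely cause of the "object 'User ID' not found" error when a formula with `.` is expanded, though that diagnosis is an assumption here.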
Now, if Purchased is a number, i.e. the number of items purchased, then you should leave it numeric and use regression:
df$Gender<-as.factor(df$Gender)
rfModel2<-randomForest(Purchased~.,data=df[,-1])
And since your response variable looks like a two-level factor (0/1) rather than a count, the regression call gives you a warning:
> df$Gender<-as.factor(df$Gender)
> rfModel2<-randomForest(Purchased~.,data=df[,-1])
Warning message:
In randomForest.default(m, y, ...) :
The response has five or fewer unique values. Are you sure you want to do regression?
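Which of the two modes you get depends only on the class of the response: as far as the randomForest documentation goes, a factor response means classification and anything else means regression. A small base-R sketch for checking this before fitting; `rf_mode` and the toy vector are hypothetical and not part of the package:

```r
# Hypothetical helper: reports which mode randomForest would be expected
# to choose, based only on whether the response is a factor
rf_mode <- function(y) {
  if (is.factor(y)) "classification" else "regression"
}

# Hypothetical 0/1 response stored as numbers
purchased_raw <- c(0, 1, 0, 1, 1)

rf_mode(purchased_raw)             # numeric response
rf_mode(as.factor(purchased_raw))  # factor response

# Number of distinct values; a numeric response with five or fewer
# is what the warning above is about
length(unique(purchased_raw))
```

If the first call reports regression for a 0/1 column, converting the column with as.factor() before fitting is the usual fix.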
Hope this helps.