Question

I want to do binary classification, where one level is "top" and the other is "bottom". I am using gbm from the h2o package and I get "bottom" as the positive class and "top" as the negative class. Here is my code:

fit <- h2o.gbm(x = regr.var, y = max.var,
             training_frame = ddd, 
             nfolds = 10, 
             distribution = 'multinomial',
             balance_classes = TRUE)
pred <- as.data.frame(h2o.predict(fit, newdata = eee))
threshold <- 0.5
pred1 <- factor( ifelse(pred[, 'top'] > threshold, 'top', 'bottom') )
err.res<-confusionMatrix(pred1 , hh$score_class)
err.res

The results are as follows:

Confusion Matrix and Statistics
          Reference
Prediction bottom top
    bottom    420 123
    top         1   6
Accuracy : 0.7745          
95% CI : (0.7373, 0.8088)
No Information Rate : 0.7655          
P-Value [Acc > NIR] : 0.3279          

Kappa : 0.0657          
Mcnemar's Test P-Value : <2e-16          

Sensitivity : 0.99762         
Specificity : 0.04651         
Pos Pred Value : 0.77348         
Neg Pred Value : 0.85714         
Prevalence : 0.76545         
Detection Rate : 0.76364         
Detection Prevalence : 0.98727         
Balanced Accuracy : 0.52207         

'Positive' Class : bottom

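But I want to predict more "top" cases correctly. I tried changing the threshold to 0.3, and it performed better. However, should I change the fitting process so that more "top" cases are predicted, for example by using a metric like "ROC"? Should I flip "top" to be the positive class and "bottom" to be the negative class, and how do I change that?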

Answer 1

I think you want to add the 'positive' argument to your function call:

err.res <- confusionMatrix(pred1, hh$score_class, positive="top")
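
To see what the positive argument changes, here is a minimal sketch. It assumes the caret package is installed, and it rebuilds the predictions from the counts in the question's confusion matrix: the vectors reference and prediction are reconstructed for illustration and are not the asker's actual data.

```r
library(caret)

# Rebuild the 550 test cases from the counts in the question's confusion matrix.
# (Illustrative reconstruction; only the cell counts come from the question.)
reference <- factor(rep(c("bottom", "top"), times = c(421, 129)),
                    levels = c("bottom", "top"))
prediction <- factor(c(rep(c("bottom", "top"), times = c(420, 1)),   # truly "bottom"
                       rep(c("bottom", "top"), times = c(123, 6))),  # truly "top"
                     levels = c("bottom", "top"))

# Default: caret treats the first factor level ("bottom") as the positive class.
cm_default <- confusionMatrix(prediction, reference)

# Explicitly declare "top" as the positive class.
cm_top <- confusionMatrix(prediction, reference, positive = "top")

cm_default$positive
cm_top$positive
cm_top$byClass[c("Sensitivity", "Specificity")]
```

With positive = "top", sensitivity becomes 6/129 (about 0.047) and specificity becomes 420/421 (about 0.998): the same two numbers as in the question, swapped. The model is unchanged; only the reporting is.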

Answer 2

I suggest using h2o.confusionMatrix to create confusion matrices at different thresholds.

Example: h2o.confusionMatrix(object = fit, threshold = 0.3)
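
Independently of h2o, the effect of moving the threshold can be sketched in base R. The data below are synthetic (the names truth, p_top and rates_at are made up for illustration, and the beta distributions are an arbitrary choice), so only the direction of the trade-off is meaningful, not the numbers.

```r
set.seed(1)

# Synthetic stand-in for a test set: true classes and predicted P("top").
n <- 550
truth <- factor(sample(c("bottom", "top"), n, replace = TRUE, prob = c(0.77, 0.23)),
                levels = c("bottom", "top"))
p_top <- ifelse(truth == "top", rbeta(n, 2, 3), rbeta(n, 1, 4))

# Sensitivity and specificity with "top" as the positive class at one threshold.
rates_at <- function(threshold) {
  pred <- ifelse(p_top > threshold, "top", "bottom")
  c(threshold   = threshold,
    sensitivity = mean(pred[truth == "top"] == "top"),
    specificity = mean(pred[truth == "bottom"] == "bottom"))
}

# One row per threshold, from strict to lenient.
sweep <- t(sapply(c(0.5, 0.4, 0.3, 0.2), rates_at))
sweep
```

Lowering the threshold can only keep or increase the share of "top" cases that are caught, at the cost of more "bottom" cases being labelled "top". Which point on that curve is acceptable depends on the relative cost of the two kinds of error.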

Answer 3

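If you want to declare a positive class directly in h2o, so that the metrics come out right (with h2o.confusionMatrix, h2o.performance, etc.), you can use the function h2o.relevel. For example, in your case you would add the following before training the model: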

ddd[max.var] <- h2o.relevel(ddd[max.var],'bottom')

(By default, I believe h2o decides the positive class by alphabetical order, so in your example the h2o metric functions should work right away.)
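
The level ordering that this answer relies on can be inspected in plain R. The sketch below uses base R's relevel as an analogue of h2o.relevel on an ordinary factor; score_class here is a made-up toy vector, not the asker's data, and no h2o cluster is involved.

```r
# A toy factor with the two class labels from the question.
score_class <- factor(c("top", "bottom", "bottom", "top", "bottom"))

# By default, factor levels are sorted alphabetically.
levels(score_class)    # "bottom" "top"

# relevel() moves the chosen reference level to the front;
# the data values themselves are unchanged.
releveled <- relevel(score_class, ref = "top")
levels(releveled)      # "top" "bottom"
```

Which position counts as "positive" depends on the tool: caret's confusionMatrix takes the first level unless positive is given, so it is worth checking the levels before reading any sensitivity or specificity figure.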
