H2O: Is there a way to fix the threshold used for H2ORandomForestEstimator performance metrics during training and testing?

Time: 2019-07-03 03:01:39

Tags: performance random-forest h2o threshold

I built a model with H2ORandomForestEstimator, and the results are shown below.

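The threshold keeps changing (0.5 on the training data, 0.313725489027 on the validation data). I would like to fix the threshold in H2ORandomForestEstimator so that I can compare results while fine-tuning. Is there a way to set the threshold?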

The documentation at http://h2o-release.s3.amazonaws.com/h2o/master/3484/docs-website/h2o-py/docs/modeling.html#h2orandomforestestimator lists no such parameter.

If this value cannot be set, how do we know which threshold our model is based on?

rf_v1
** Reported on train data. **

MSE:    2.75013548238e-05  
RMSE:   0.00524417341664  
LogLoss:0.000494320913199  
Mean Per-Class Error: 0.0188802936476  
AUC: 0.974221763605  
Gini: 0.948443527211  
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.5:
       0       1    Error    Rate
-----  ------  ---  -------  --------------  
0      161692  1    0        (1.0/161693.0)  
1      3       50   0.0566   (3.0/53.0)  
Total  161695 51   0        (4.0/161746.0)  
Maximum Metrics: Maximum metrics at their respective thresholds

metric                       threshold    value     idx
---------------------------  -----------  --------  -----  
max f1                       0.5          0.961538  19  
max f2                       0.25         0.955056  21  
max f0point5                 0.571429     0.983936  18  
max accuracy                 0.571429     0.999975  18  
max precision                1            1         0  
max recall                   0            1         69  
max specificity              1            1         0  
max absolute_mcc             0.5          0.961704  19  
max min_per_class_accuracy   0.25         0.962264  21  
max mean_per_class_accuracy  0.25         0.98112   21  
Gains/Lift Table: Avg response rate:  0.03 %

** Reported on validation data. **

MSE:      1.00535766226e-05  
RMSE:     0.00317073755183  
LogLoss:  4.53885183426e-05  
Mean Per-Class Error: 0.0  
AUC: 1.0  
Gini: 1.0  
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.313725489027:
       0      1    Error    Rate
-----  -----  ---  -------  -------------  
0      53715  0    0        (0.0/53715.0)  
1      0      16   0        (0.0/16.0)  
Total  53715  16   0        (0.0/53731.0)  
Maximum Metrics: Maximum metrics at their respective thresholds

metric                       threshold    value    idx
---------------------------  -----------  -------  -----  
max f1                       0.313725     1        5  
max f2                       0.313725     1        5  
max f0point5                 0.313725     1        5  
max accuracy                 0.313725     1        5  
max precision                1            1        0  
max recall                   0.313725     1        5  
max specificity              1            1        0  
max absolute_mcc             0.313725     1        5  
max min_per_class_accuracy   0.313725     1        5  
max mean_per_class_accuracy  0.313725     1        5

1 Answer:

Answer 0 (score: 2)

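The reported threshold is the max-F1 threshold, that is, the one that maximizes F1 on the dataset being scored. Because it is computed separately for each dataset, it differs between the training and validation reports.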
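To illustrate why a max-F1 threshold moves from one dataset to another, here is a minimal sketch in plain Python. It is not H2O's implementation: the toy scores and labels are made up, the helper names `f1_at` and `max_f1_threshold` are hypothetical, and it assumes a row is labeled 1 when its positive-class probability is at or above the threshold. H2O's own metrics code may choose candidate thresholds differently.

```python
def f1_at(threshold, probs, labels):
    """F1 of the positive class when predicting 1 for p >= threshold (assumed rule)."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def max_f1_threshold(probs, labels):
    """Try every distinct score as a threshold; return (threshold, f1) with the best F1."""
    candidates = sorted(set(probs), reverse=True)
    scored = ((t, f1_at(t, probs, labels)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


# Made-up scores for two datasets scored by the same hypothetical model.
train_probs, train_labels = [0.9, 0.6, 0.5, 0.2, 0.1], [1, 1, 1, 0, 0]
valid_probs, valid_labels = [0.8, 0.31, 0.05, 0.02], [1, 1, 0, 0]

print(max_f1_threshold(train_probs, train_labels))  # (0.5, 1.0)
print(max_f1_threshold(valid_probs, valid_labels))  # (0.31, 1.0)
```

The same scoring rule yields a different best threshold on each dataset, which is the behavior seen in the two reports above.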

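If you want to apply your own threshold, you have to take the predicted probability of the positive class and compare it against that threshold yourself to produce the labels you want.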
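The manual comparison described above can be sketched as follows. This is an illustration under assumptions, not the definitive way to do it: `apply_threshold` and `confusion_counts` are hypothetical helpers, `valid_frame` in the comments is a placeholder for your own H2OFrame, and the commented lines assume the binomial prediction frame exposes the positive-class probability in a column named `p1`.

```python
def apply_threshold(p1_values, threshold):
    """Label a row 1 when its positive-class probability is >= threshold, else 0."""
    return [1 if p >= threshold else 0 for p in p1_values]


def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for pred, act in zip(predicted, actual):
        if pred == 1 and act == 1:
            counts["tp"] += 1
        elif pred == 1 and act == 0:
            counts["fp"] += 1
        elif pred == 0 and act == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# With H2O, the probabilities might be obtained roughly like this (assumed, not run here):
#   preds = rf_v1.predict(valid_frame)                      # columns: predict, p0, p1
#   p1_values = preds["p1"].as_data_frame()["p1"].tolist()  # needs pandas installed
p1_values = [0.9, 0.4, 0.5, 0.1]   # made-up probabilities for illustration
actual = [1, 1, 0, 0]              # made-up true labels

FIXED_THRESHOLD = 0.5              # the same cutoff reused for every dataset
labels = apply_threshold(p1_values, FIXED_THRESHOLD)
print(labels)
print(confusion_counts(labels, actual))
```

Reusing one `FIXED_THRESHOLD` for both training and validation data keeps the confusion matrices comparable across tuning runs. H2O's Python API also appears to accept a `thresholds` argument on `confusion_matrix` (for example `confusion_matrix(thresholds=[0.5])` on a performance object); check the documentation for your version before relying on it.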

If you connect with a web browser to the H2O Flow web UI built into H2O-3, you can hover over the ROC curve and visually explore the confusion matrix at each threshold.