When I try to get the recall score with, for example,
rf_model.recall()
I get this error:
h2o ValueError: No metric tpr
I can get other metrics such as accuracy, AUC, precision and F1, but not recall... This could be a bug.
If I run:
from h2o.model.metrics_base import H2OBinomialModelMetrics as bmm
reporter = bmm(rf_model.metric)
rf_model.metric('recall')
I get:
Could not find exact threshold 0.0; using closest threshold found 0.0.
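What is going on?
I am running h2o version 'h2o-3.15.0.3990'.
I followed the h2o tutorial:
and with my own dataset I get the error above.
Any help?
Also, how can I plot a precision/recall curve with h2o?
Thanks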
Answer 0: (score: 1)
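Starting with the second question: Flow has a precision/recall curve (and it is interactive). Flow always runs on port 54321 of every node, so if you are running h2o locally it is at http://127.0.0.1:54321.
I think there is something interesting about your data or model, and it will become clear when you look at the precision/recall curve.
In R, if you run str(m) (where m is your model), you will see all of the model's data. m@training_metrics@metrics$thresholds_and_metric_scores$recall holds the recall value for each threshold.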
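To make "recall for each threshold" concrete, here is a minimal pure-Python sketch of how such a table can be computed from predicted scores and true labels. It does not use h2o at all; the function names and the toy scores are made up for illustration, and I am assuming the usual convention that a row is predicted positive when its score is at or above the threshold.

```python
def precision_recall_at(scores, labels, threshold):
    """Return (precision, recall), predicting positive when score >= threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    # Conventions for empty denominators (an assumption of this sketch).
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pr_curve(scores, labels, thresholds):
    """Return a list of (threshold, precision, recall) tuples."""
    return [(t,) + precision_recall_at(scores, labels, t) for t in thresholds]


# Made-up scores for a model that separates the two classes perfectly.
scores = [0.95, 0.90, 0.80, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 0, 0]

curve = pr_curve(scores, labels, [0.25, 0.5, 0.85])
for threshold, precision, recall in curve:
    print(threshold, precision, recall)
```

Plotting recall on one axis against precision on the other for each tuple gives a precision/recall curve. With perfectly separated scores like these, recall stays at 1.0 until the threshold climbs above the lowest-scoring positive.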
I could not figure out how to inspect the Python object, but your call is correct. In my quick test (the iris data set with a 2-class enum column added):
m.metric("recall")
gave:
[[0.8160852636726422, 1.0]]
If I wanted all the values, it would be something like this:
mDL.metric("recall",thresholds=[x/100.0 for x in range(1,100)])
which gives:
Could not find exact threshold 0.01; using closest threshold found 0.010396965719556233.
Could not find exact threshold 0.02; using closest threshold found 0.016617060110009896.
...
Could not find exact threshold 0.92; using closest threshold found 0.9469528904679438.
Could not find exact threshold 0.93; using closest threshold found 0.9469528904679438.
Could not find exact threshold 0.94; using closest threshold found 0.9469528904679438.
Could not find exact threshold 0.95; using closest threshold found 0.9469528904679438.
Could not find exact threshold 0.96; using closest threshold found 0.9469528904679438.
Could not find exact threshold 0.97; using closest threshold found 0.9760293572153097.
Could not find exact threshold 0.98; using closest threshold found 0.9787491606489236.
Could not find exact threshold 0.99; using closest threshold found 0.9909817370067531.
[[0.01, 1.0],
[0.02, 1.0],
[0.03, 1.0],
...
[0.87, 1.0],
[0.88, 1.0],
[0.89, 0.9850746268656716],
[0.9, 0.9850746268656716],
[0.91, 0.9850746268656716],
[0.92, 0.9850746268656716],
[0.93, 0.9850746268656716],
[0.94, 0.9850746268656716],
[0.95, 0.9850746268656716],
[0.96, 0.9850746268656716],
[0.97, 0.9701492537313433],
[0.98, 0.9552238805970149],
[0.99, 0.8955223880597015]]
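(I got such unusual output because the model learned my data set perfectly. I suspect that is what happened to you too? I stupidly made my binary column a direct function of one of the input columns, with no noise!)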
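The "Could not find exact threshold" messages look like a nearest-match lookup: metrics are stored only for the thresholds the model actually produced, and a requested threshold is snapped to the closest stored one. The sketch below is my guess at that behaviour, not h2o's actual code; closest_threshold is a hypothetical helper, and the stored thresholds are the four values that appear in the log output above.

```python
def closest_threshold(available, requested):
    """Return the stored threshold nearest to the requested one."""
    return min(available, key=lambda t: abs(t - requested))


# Stored thresholds copied from the warning messages in the output above.
available = [
    0.9469528904679438,
    0.9760293572153097,
    0.9787491606489236,
    0.9909817370067531,
]

for requested in (0.92, 0.96, 0.97):
    print(requested, "->", closest_threshold(available, requested))
```

If that is what is happening, it would explain why several requested thresholds in a row report the same recall: they all resolve to the same stored threshold.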