Classification scores differ between H2O4GPU and Scikit-Learn

Asked: 2018-11-13 22:42:25

Tags: python scikit-learn random-forest h2o h2o4gpu

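I have started evaluating a random forest classifier using precision and recall. However, even though the training and test sets are identical for the CPU and GPU implementations of the classifier, the evaluation scores they return differ. Is this a known bug in the library?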

The two code samples below are provided for reference.

Scikit-Learn (CPU)

from sklearn.metrics import recall_score, precision_score
from sklearn.ensemble import RandomForestClassifier

rf_cpu = RandomForestClassifier(n_estimators=5000, n_jobs=-1)
rf_cpu.fit(X_train, y_train)
rf_cpu_pred = rf_cpu.predict(X_test)  # predict with the fitted CPU model (was the undefined name `clf`)

recall_score(rf_cpu_pred, y_test)
precision_score(rf_cpu_pred, y_test)

CPU Recall: 0.807186
CPU Precision: 0.82095

H2O4GPU (GPU)

from h2o4gpu.metrics import recall_score, precision_score
from h2o4gpu import RandomForestClassifier

rf_gpu = RandomForestClassifier(n_estimators=5000, n_gpus=1)
rf_gpu.fit(X_train, y_train)
rf_gpu_pred = rf_gpu.predict(X_test)  # predict with the fitted GPU model (was the undefined name `clf`)

recall_score(rf_gpu_pred, y_test)
precision_score(rf_gpu_pred, y_test)

GPU Recall: 0.714286
GPU Precision: 0.809988

1 answer:

Answer 0 (score: 0)

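Correction: I found that the arguments to the precision and recall functions were passed in the wrong order. According to the Scikit-Learn documentation, the order is always (y_true, y_pred).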

Corrected evaluation code

recall_score(y_test, rf_gpu_pred)
precision_score(y_test, rf_gpu_pred)
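To illustrate why the argument order matters, here is a minimal pure-Python sketch. The helper `precision_recall` and the toy labels are hypothetical and not part of Scikit-Learn or H2O4GPU; the helper only mirrors the (y_true, y_pred) convention for binary labels. Swapping the arguments turns false positives into false negatives and vice versa, so the two scores trade places.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive class.

    Hypothetical helper; arguments follow the (y_true, y_pred) order.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: 4 true positives in the ground truth, 3 predicted positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

correct = precision_recall(y_true, y_pred)  # intended order
swapped = precision_recall(y_pred, y_true)  # arguments reversed by mistake

print("correct (precision, recall):", correct)
print("swapped (precision, recall):", swapped)
```

With the arguments reversed, the value reported as precision is really the recall and the other way around, so scores computed that way should not be compared against correctly computed ones.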