How do I write a custom scoring function in Scikit-Learn?

Time: 2015-09-20 10:12:27

Tags: python-3.x scikit-learn

I wrote my own scoring function for Scikit-Learn in Python. I used make_scorer as explained in the User Guide section of the documentation, but it does not work.

My computer just stays "busy" and never returns any output.

My code is below. Can anyone help me? I have been stuck for weeks...

Thank you very much!

from sklearn.metrics import make_scorer
from sklearn.svm import SVR
from sklearn.grid_search import GridSearchCV
import numpy as np

def my_custom_loss_func(real, predictions):
    error = 0
    for i in range(0, len(real)):
        z = (real[i] - M)
        if predictions[i] > M and real[i] > M and (predictions[i] - real[i]) > 0:
            error_i = (abs(real[i] - predictions[i]))**(2*np.exp(z))
        if predictions[i] > M and real[i] > M and (predictions[i] - real[i]) < 0:
            error_i = -(abs((real[i] - predictions[i]))**(2*np.exp(z)))
        if predictions[i] > M and real[i] < M:
            error_i = -(abs(real[i] - predictions[i]))**(2*np.exp(-z))
        if predictions[i] < M and real[i] < M and (predictions[i] - real[i]) > 0:
            error_i = (abs(real[i] - predictions[i]))**(2*np.exp(z))
        if predictions[i] < M and real[i] < M and (predictions[i] - real[i]) < 0:
            error_i = -(abs((real[i] - predictions[i]))**(2*np.exp(z)))
        if predictions[i] < M and real[i] > M:
            error_i = -(abs((real[i] - predictions[i]))**(2*np.exp(-z)))
        error += error_i
    return error


M = 0.5 
loss = make_scorer(my_custom_loss_func, greater_is_better=False)
y_true = [[0.52], [0.54], [0.56], [0.48], [0.44], [0.46]]
y_pred = [0.54, 0.52, 0.56, 0.51, 0.42, 0.48]

C_range = np.logspace(-3.0, 3.0, 12, base=2.0)
gamma_range = np.logspace(-3, 3, 12, base=2.0)
epsilon_range = np.logspace(-3, 3, 12, base=2.0)
tuned_parameters = {'kernel':['rbf'], 'C':C_range, 'gamma':gamma_range, 
                    'epsilon':epsilon_range, 'cache_size':[3000]}

svm = SVR()
svm_regression = GridSearchCV(svm, tuned_parameters, 
                              scoring=my_custom_loss_func, n_jobs=-1, cv=3)
sv_r = svm_regression.fit(y_true, y_pred)
print(svm_regression.score(y_true, y_pred))

1 answer:

Answer 0: (score: 0)

I ran into the same problem when using a custom scoring function (Python 3.6.4, scikit-learn 0.19.1, Windows, Jupyter notebook). It looks like a bug in scikit-learn. See this still-unresolved issue: https://github.com/scikit-learn/scikit-learn/issues/2889
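Before attributing the hang to that issue, it may help to rule out the loss function itself by calling it directly in a single process, outside any grid search. The following is only a sketch: a vectorized NumPy rewrite of the question's my_custom_loss_func under the name custom_loss (a hypothetical name), with M passed as a parameter. One assumption is labeled in the code: samples where the prediction equals the real value, or where either one equals M exactly, contribute zero, because the original function leaves error_i undefined in those cases.

```python
import numpy as np


def custom_loss(real, predictions, M=0.5):
    """Vectorized sketch of the question's piecewise loss (hypothetical name)."""
    # Flatten inputs so both lists and column vectors are accepted.
    real = np.asarray(real, dtype=float).ravel()
    predictions = np.asarray(predictions, dtype=float).ravel()

    z = real - M
    diff = np.abs(real - predictions)

    # Positive product: prediction and real value lie on the same side of M.
    side = (predictions - M) * (real - M)

    # Same side: the sign follows (prediction - real), the exponent is 2*exp(z).
    same_term = np.sign(predictions - real) * diff ** (2 * np.exp(z))
    # Opposite sides: always negative, the exponent is 2*exp(-z).
    opposite_term = -(diff ** (2 * np.exp(-z)))

    # ASSUMPTION: boundary cases (side == 0) contribute zero.
    per_sample = np.where(side > 0, same_term,
                          np.where(side < 0, opposite_term, 0.0))
    return float(per_sample.sum())


# Direct call in the current process, with no GridSearchCV and no n_jobs.
print(custom_loss([0.52, 0.54, 0.56], [0.54, 0.52, 0.56]))
```

If this direct call returns promptly, the function itself is probably not the cause of the apparent hang.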

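Setting n_jobs=1 works with a custom scoring function, but n_jobs=4 fails. My CPU has 4 cores. When running xgboost, the computation time with n_jobs=1 and the custom scorer is about the same as with n_jobs=4 and a built-in scorer (e.g. scoring='neg_mean_absolute_error').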
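The n_jobs=1 setup might look roughly like the sketch below. It is a minimal example under stated assumptions, not a confirmed fix for the linked issue: the data is synthetic, abs_error_sum is a hypothetical stand-in for the custom loss, the parameter grid is shortened, and GridSearchCV is imported from sklearn.model_selection (available since scikit-learn 0.18) rather than the question's sklearn.grid_search. Note that the object returned by make_scorer is what gets passed to scoring. A plain callable passed to scoring is called as (estimator, X, y), not as (y_true, y_pred), so passing the raw loss function as the question's code does would not work as intended.

```python
import numpy as np
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR


def abs_error_sum(y_true, y_pred):
    # Hypothetical stand-in loss: any function (y_true, y_pred) -> float fits here.
    return float(np.sum(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


# greater_is_better=False makes the scorer return the negated loss,
# so that GridSearchCV can still maximize the score.
loss = make_scorer(abs_error_sum, greater_is_better=False)

# ASSUMPTION: synthetic data, one feature, values near 0.5 as in the question.
rng = np.random.RandomState(0)
X = rng.uniform(0.4, 0.6, size=(30, 1))
y = X.ravel() + rng.normal(scale=0.01, size=30)

param_grid = {
    "kernel": ["rbf"],
    "C": np.logspace(-3, 3, 4, base=2.0),
    "gamma": np.logspace(-3, 3, 4, base=2.0),
    "epsilon": [0.001, 0.01],
}

# Pass the scorer object (not the raw function) and stay in a single process.
search = GridSearchCV(SVR(), param_grid, scoring=loss, n_jobs=1, cv=3)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)  # negated loss, so it is never positive
```

With n_jobs=1 no worker processes are spawned, so the scorer does not need to be sent to other processes; the trade-off is that the search runs on a single core.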