Creating a scoring metric for IsolationForest with GridSearchCV

Time: 2017-11-07 08:03:25

Tags: python python-2.7 scikit-learn

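I am building an IsolationForest and want to tune its hyperparameters with GridSearchCV. I want the score to be the recall on the outliers, i.e. the samples with label = -1. However, I get an error when I run this code.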

recall_fraud = make_scorer(recall_score(pos_label=-1))
gs_params ={
        'max_samples': [300,500,1000],
        'contamination': [float(y_train.count(-1))/len(y_train)] ,
        'max_features': [1,3,7],
        'n_estimators':[1000],
        'random_state':[1]
    }

isof_gs = GridSearchCV(IsolationForest(), gs_params, n_jobs = 1, verbose = 1, cv = 5, scoring = recall_fraud) 

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-273-d1f260f73f29> in <module>()
----> 1 recall_fraud = make_scorer(recall_score(pos_label=-1))
      2 gs_params ={
      3     'max_samples': [300,500,1000],
      4     'contamination': [float(y_train.count(-1))/len(y_train)] ,
      5     'max_features': [1,3,7],

TypeError: recall_score() takes at least 2 arguments (1 given)

What am I doing wrong?

1 Answer:

Answer 0: (score: 0)

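When you use make_scorer, every keyword argument intended for the scoring function must be passed to make_scorer itself, not to the scoring function. Writing recall_score(pos_label=-1) calls recall_score immediately without y_true and y_pred, which is what raises the TypeError.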

See the **kwargs parameter in the make_scorer documentation:

**kwargs : additional arguments. Additional parameters to be passed to score_func.
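To make the difference concrete, here is a minimal sketch that contrasts the failing call with the form make_scorer expects. The toy arrays, the synthetic data, and helper names such as call_failed are illustrative assumptions rather than anything from the question, and the snippet assumes a reasonably recent scikit-learn.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import make_scorer, recall_score

# Calling recall_score with only a keyword argument fails right away,
# before make_scorer ever receives anything.
try:
    recall_score(pos_label=-1)
    call_failed = False
except TypeError:
    call_failed = True

# Called directly, recall_score needs y_true and y_pred.
y_true = np.array([1, 1, -1, -1, 1])
y_pred = np.array([1, -1, -1, 1, 1])
direct_recall = recall_score(y_true, y_pred, pos_label=-1)  # 0.5: one of two outliers found

# make_scorer takes the function object plus the keyword arguments to forward.
recall_fraud = make_scorer(recall_score, pos_label=-1)

# The resulting scorer is called as scorer(estimator, X, y).
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 1, size=(95, 2)),    # inliers
               rng.uniform(6, 8, size=(5, 2))])   # synthetic outliers
y = np.array([1] * 95 + [-1] * 5)
model = IsolationForest(n_estimators=50, contamination=0.05, random_state=1).fit(X)

scorer_recall = recall_fraud(model, X, y)
manual_recall = recall_score(y, model.predict(X), pos_label=-1)
```

The scorer calls model.predict(X) internally and passes pos_label=-1 along, so scorer_recall and manual_recall should agree.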

Change your code to:

# Updated recall_score() to recall_score
recall_fraud = make_scorer(recall_score, pos_label=-1)

With this change the TypeError no longer occurs.

When the grid search actually computes the recall, pos_label is forwarded automatically to recall_score.
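For completeness, a rough end-to-end sketch of the grid search with the corrected scorer might look like the following. The synthetic X_train and y_train, the smaller grid values, and the explicit StratifiedKFold are assumptions made for illustration; they are not part of the original question. The stratified splitter is one way to keep some outliers in every fold, since an integer cv on a non-classifier falls back to plain KFold.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for the asker's data: 380 inliers (label 1), 20 outliers (label -1).
rng = np.random.RandomState(1)
X_train = np.vstack([rng.normal(0, 1, size=(380, 7)),
                     rng.uniform(5, 8, size=(20, 7))])
y_train = np.array([1] * 380 + [-1] * 20)

# Keyword arguments for recall_score go to make_scorer.
recall_fraud = make_scorer(recall_score, pos_label=-1)

gs_params = {
    "max_samples": [100, 200],
    "contamination": [float(np.mean(y_train == -1))],  # observed outlier fraction
    "max_features": [1, 3, 7],
    "n_estimators": [50],  # kept small so the sketch runs quickly
    "random_state": [1],
}

# Stratify on the labels so each validation fold contains outliers.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

isof_gs = GridSearchCV(IsolationForest(), gs_params,
                       scoring=recall_fraud, cv=cv, n_jobs=1)
isof_gs.fit(X_train, y_train)  # y_train is used only for scoring and splitting

print(isof_gs.best_params_)
print(isof_gs.best_score_)
```

IsolationForest ignores y during fit, so the labels only affect how the folds are split and how each candidate is scored.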