How to pass the best parameters (found with GridSearchCV) from one pipeline to another pipeline

Time: 2017-11-19 11:03:23

Tags: python scikit-learn grid-search

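I have a custom pipeline, and I use scikit-learn's GridSearchCV to tune the parameters of the whole pipeline. The search gives me the best parameter combination, but I want to take that combination and pass it to another pipeline.

Here is the pipeline: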

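from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectPercentile, chi2

# CustomFeatures (a user-defined transformer) and clf (the classifier) are defined elsewhere
p = Pipeline([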
    ('union', FeatureUnion(
        transformer_list=[
            ('chargram', Pipeline([
                ('tfidf', TfidfVectorizer(token_pattern=r'\w')),
                ('kbest', SelectPercentile(score_func=chi2)),
            ])),
            ('custom', Pipeline([
                ('features', CustomFeatures()),
                ('tfidf', TfidfVectorizer()),
                ('kbest', SelectPercentile(score_func=chi2)),
            ]))
        ],
        # weight components in FeatureUnion. Can be tuned
        transformer_weights={
            'chargram': 0.8,
            'custom': 0.8
        },
        n_jobs=-1
    )),

    # Classifier stage
    ('clf', clf),
])

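With this pipeline, the search also returns the best parameter combination for the classifier. What I want is to take only the best parameters of the FeatureUnion step (`union`), pass them to another pipeline, and combine them there with a different set of classifier parameters.

Is there a way to do this?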

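1 Answer:

Answer 0 (score: 0)

You can store the parameter values in variables such as the following, so they can be reused when building another pipeline: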

transformer_list = [
    ('chargram', Pipeline([
        ('tfidf', TfidfVectorizer(token_pattern=r'\w')),
        ('kbest', SelectPercentile(score_func=chi2)),
    ])),
    ('custom', Pipeline([
        ('features', CustomFeatures()),
        ('tfidf', TfidfVectorizer()),
        ('kbest', SelectPercentile(score_func=chi2)),
    ]))
]

transformer_weights = {
    'chargram': 0.8,
    'custom': 0.8
}

p = Pipeline([
    ('union', FeatureUnion(
        transformer_list=transformer_list,
        # weight components in FeatureUnion. Can be tuned
        transformer_weights=transformer_weights,
        n_jobs=-1
    )),

    # Classifier stage
    ('clf', clf),
])

P.S.: I'm not sure I understood your question correctly!
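
If the goal is to carry over the tuned values rather than the initial ones, one possible approach is to filter the `best_params_` dictionary of the fitted GridSearchCV by step name. The sketch below is only an illustration under stated assumptions: the helper name `params_for_step`, the example parameter values, and the key `clf__alpha` are hypothetical, and the dictionary stands in for what `grid.best_params_` might contain.

```python
def params_for_step(best_params, step):
    """Return only the entries of best_params that belong to the given step.

    scikit-learn names nested parameters as '<step>__<param>', so every
    parameter of the 'union' step starts with 'union__'.
    """
    prefix = step + "__"
    return {k: v for k, v in best_params.items() if k.startswith(prefix)}


# Hypothetical stand-in for grid.best_params_ after fitting GridSearchCV on p.
best_params = {
    "union__chargram__tfidf__ngram_range": (1, 2),
    "union__chargram__kbest__percentile": 20,
    "union__custom__kbest__percentile": 50,
    "clf__C": 10,
}

# Keep the tuned FeatureUnion parameters and drop the old classifier's ones.
union_params = params_for_step(best_params, "union")

# Hypothetical parameters for the classifier used in the second pipeline.
other_clf_params = {"clf__alpha": 0.1}

# Combination of both, ready to be applied to the second pipeline.
new_params = {**union_params, **other_clf_params}
print(new_params)
```

Assuming the second pipeline (called `p2` here for illustration) has a `union` step with the same structure and a `clf` step that accepts the chosen classifier parameters, the combined dictionary can then be applied with `p2.set_params(**new_params)`.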