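I have a custom pipeline, and I use sklearn's GridSearchCV to tune the parameters of the whole pipeline. The grid search gives me the best parameter combination, but I want to take that best combination and pass it to another pipeline.
Here is the pipeline: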
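from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectPercentile, chi2

# CustomFeatures (a custom transformer) and clf (the classifier) are defined elsewhere.
p = Pipeline([
    ('union', FeatureUnion(
        transformer_list=[
            ('chargram', Pipeline([
                ('tfidf', TfidfVectorizer(token_pattern=r'\w')),
                ('kbest', SelectPercentile(score_func=chi2)),
            ])),
            ('custom', Pipeline([
                ('features', CustomFeatures()),
                ('tfidf', TfidfVectorizer()),
                ('kbest', SelectPercentile(score_func=chi2)),
            ]))
        ],
        # weight components in FeatureUnion. Can be tuned
        transformer_weights={
            'chargram': 0.8,
            'custom': 0.8
        },
        n_jobs=-1
    )),
    # Classifier stage
    ('clf', clf),
])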
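With this pipeline, the grid search also returns the best parameters for the classifier. What I want is to take only the parameters of the FeatureUnion step and pass them to another pipeline, then combine that FeatureUnion with a different set of classifier parameters.
Is there a way to do this?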
Answer 0 (score: 0)
You can store the parameter values in variables and reuse them:
transformer_list = [
    ('chargram', Pipeline([
        ('tfidf', TfidfVectorizer(token_pattern=r'\w')),
        ('kbest', SelectPercentile(score_func=chi2)),
    ])),
    ('custom', Pipeline([
        ('features', CustomFeatures()),
        ('tfidf', TfidfVectorizer()),
        ('kbest', SelectPercentile(score_func=chi2)),
    ]))
]
transformer_weights = {
    'chargram': 0.8,
    'custom': 0.8
}
p = Pipeline([
    ('union', FeatureUnion(
        transformer_list=transformer_list,
        # weight components in FeatureUnion. Can be tuned
        transformer_weights=transformer_weights,
        n_jobs=-1
    )),
    # Classifier stage
    ('clf', clf),
])
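If the goal is to reuse only the FeatureUnion settings that the grid search found, one possible approach is to filter the search's `best_params_` dictionary by the step-name prefix. The sketch below is plain Python and rests on assumptions: `params_for_step` is a hypothetical helper, and the `best_params` dictionary is a made-up stand-in for `grid_search.best_params_`, with invented parameter names and values.

```python
def params_for_step(best_params, step_name):
    """Keep only the entries of best_params that belong to one pipeline step.

    Nested parameters in scikit-learn are named '<step>__<param>', so a
    prefix match on '<step>__' selects everything under that step.
    """
    prefix = step_name + "__"
    return {key: value for key, value in best_params.items()
            if key.startswith(prefix)}


# Hypothetical stand-in for grid_search.best_params_ (values are made up).
best_params = {
    "union__chargram__tfidf__ngram_range": (1, 3),
    "union__chargram__kbest__percentile": 20,
    "union__custom__kbest__percentile": 10,
    "clf__C": 1.0,
}

# Split the tuned parameters into the FeatureUnion part and the classifier part.
union_params = params_for_step(best_params, "union")
clf_params = params_for_step(best_params, "clf")

print(union_params)
print(clf_params)
```

Assuming the second pipeline also names its FeatureUnion step `union`, the filtered dictionary could then be applied with `other_pipeline.set_params(**union_params)`, and the classifier step could be given different values in a separate `set_params` call.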
P.S. I'm not sure I understood your question correctly.