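I want to add a target-variable transformer to my sklearn pipeline. Normally, for operations such as PCA, or for any regressor or classifier, sklearn supports a parameter grid for cross-validation, for example: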
param_grid = [{
"pca__n_components": [5, 10, 25, 50, 125, 250, 625, 1500, 3000],
"rdf__n_estimators": n_estimators,
"rdf__bootstrap": bootstrap,
"rdf__max_depth": max_depth,
"rdf__class_weight": class_weight}]
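Is it also possible to add a target-variable transformer to this grid? For example, I want to first train my regressor without transforming the target variable, and then scale the target variable with PowerTransformer() to see whether that improves my results. Can these two options be integrated into the parameter grid as well?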
Answer 0 (score: 2)
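Yes, you can include different transformers in your param_grid dictionary by listing them as candidate values for a named pipeline step. Note that a pipeline step transforms the features X, not the target y; the example here swaps the feature transformer: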
from sklearn.datasets import make_classification
from sklearn.preprocessing import PowerTransformer
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
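# Toy classification data, split into train and test sets
X, y = make_classification(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# The 'transformer' step is a placeholder that the grid search replaces
pipe = Pipeline([('transformer', PowerTransformer()), ('svc', SVC())])
# Using the step name as a key swaps the whole step for each candidate
param_grid = {"svc__C": [1, 10], "transformer": [PowerTransformer(), StandardScaler()]}
clf = GridSearchCV(pipe, param_grid)
clf.fit(X_train, y_train)
# Shows the selected C and the selected transformer object
print(clf.best_params_)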
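For the target variable itself, one possible approach is to wrap the regressor in sklearn.compose.TransformedTargetRegressor and put its transformer parameter into the grid. The following is only a sketch under assumptions: the step names "scaler" and "reg", the synthetic data from make_regression, the choice of RandomForestRegressor, and the grid values are all illustrative and do not come from the question. Passing None as the transformer leaves y untransformed, so the search compares "no target transform" against PowerTransformer().

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PowerTransformer, StandardScaler

# Synthetic regression data (assumption: stands in for the real dataset)
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Feature steps work on X; TransformedTargetRegressor handles y.
# Step names "scaler" and "reg" are hypothetical.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("reg", TransformedTargetRegressor(
        regressor=RandomForestRegressor(n_estimators=20, random_state=0))),
])

param_grid = {
    # None: y is passed through unchanged; PowerTransformer(): y is transformed
    # before fitting and predictions are mapped back with the inverse transform
    "reg__transformer": [None, PowerTransformer()],
    # Parameters of the wrapped regressor are reached through "regressor__"
    "reg__regressor__max_depth": [3, None],
}

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Because predictions are converted back to the original scale of y, the cross-validation scores of the two variants are directly comparable.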