Question

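I want to add a target-variable transformer to my sklearn pipeline. Normally, for steps such as PCA or any kind of regressor or classifier, sklearn supports a parameter grid for cross-validation, for example: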

        param_grid = [{
            "pca__n_components": [5, 10, 25, 50, 125, 250, 625, 1500, 3000],
            "rdf__n_estimators": n_estimators,
            "rdf__bootstrap": bootstrap,
            "rdf__max_depth": max_depth,
            "rdf__class_weight": class_weight}]

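Is it also possible to add a target-variable transformer to this grid? For example, I want to first train my regressor without transforming the target variable, then scale the target with PowerTransformer() and see whether that improves my results. Can these options be integrated into the parameter grid as well?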

Answer 1

Yes, you can put different transformers into your param_grid dictionary by using the name of the pipeline step as a key:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PowerTransformer, StandardScaler
from sklearn.svm import SVC

X, y = make_classification(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The 'transformer' step is a placeholder; the grid search replaces it.
pipe = Pipeline([('transformer', PowerTransformer()), ('svc', SVC())])

# Using the step name as a key swaps in a whole estimator for that step.
param_grid = {
    "svc__C": [1, 10],
    "transformer": [PowerTransformer(), StandardScaler()],
}

clf = GridSearchCV(pipe, param_grid)
clf.fit(X_train, y_train)

# Best combination of C and transformer found by cross-validation.
print(clf.best_params_)
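
Note that the 'transformer' step in the snippet above is applied to the features X, not to the target y. For the target variable itself, one possible approach is to wrap the pipeline in sklearn's TransformedTargetRegressor and put its transformer parameter in the grid, where None means "leave y untransformed". The following is only a minimal sketch under assumptions: the synthetic data from make_regression, the step names 'scaler' and 'rdf', and the grid values are illustrative and not taken from the question.

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PowerTransformer, StandardScaler

# Synthetic regression data (assumption: stands in for the asker's data).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Feature pipeline; the step names 'scaler' and 'rdf' are illustrative.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("rdf", RandomForestRegressor(n_estimators=20, random_state=0)),
])

# The wrapper transforms y before fitting and inverts it when predicting.
model = TransformedTargetRegressor(regressor=pipe)

param_grid = {
    # None: fit on the raw target; PowerTransformer(): fit on the scaled target.
    "transformer": [None, PowerTransformer()],
    # Pipeline parameters are reached through the 'regressor__' prefix.
    "regressor__rdf__max_depth": [3, None],
}

search = GridSearchCV(model, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Because the inverse transform is applied at prediction time, the cross-validation scores are computed on the original scale of y, so the two target options can be compared directly.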
