Question

I have fit a Pipeline object with RandomizedSearchCV:

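import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe_sgd = Pipeline([('scl', StandardScaler()),
                    ('clf', SGDClassifier(n_jobs=-1))])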

param_dist_sgd = {'clf__loss': ['log'],
                 'clf__penalty': [None, 'l1', 'l2', 'elasticnet'],
                 'clf__alpha': np.linspace(0.15, 0.35),
                 'clf__n_iter': [3, 5, 7]}

sgd_randomized_pipe = RandomizedSearchCV(estimator=pipe_sgd,
                                         param_distributions=param_dist_sgd,
                                         cv=3, n_iter=30, n_jobs=-1)

sgd_randomized_pipe.fit(X_train, y_train)

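I want to access the coef_ attribute of the best_estimator_, but I am unable to do that. I tried accessing coef_ with the code below.

sgd_randomized_pipe.best_estimator_.coef_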

However, I get the following AttributeError...

AttributeError: 'Pipeline' object has no attribute 'coef_'

The scikit-learn docs say that coef_ is an attribute of SGDClassifier, which is the class of my estimator.

What am I doing wrong?

Answer 1

You can always use the names you assigned to the steps when building the pipeline, through the named_steps dict.

scaler = sgd_randomized_pipe.best_estimator_.named_steps['scl']
classifier = sgd_randomized_pipe.best_estimator_.named_steps['clf']

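Then access coef_, intercept_, and any other attributes that are available on the corresponding fitted estimator.

This is a formal attribute of the Pipeline, as specified in the documentation:

named_steps : dict

Read-only attribute to access any step parameter by user given name. Keys are step names and values are steps parameters.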
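As a minimal end-to-end sketch of this approach, the snippet below fits a small search and then reads attributes from the named steps. The synthetic data from make_classification, the reduced parameter grid, and the variable names are assumptions for illustration only; they stand in for the asker's X_train, y_train, and settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary data standing in for X_train / y_train (assumption).
X_train, y_train = make_classification(n_samples=200, n_features=5,
                                       random_state=0)

pipe = Pipeline([('scl', StandardScaler()),
                 ('clf', SGDClassifier(random_state=0))])

# A reduced, hypothetical search space to keep the example fast.
param_dist = {'clf__alpha': np.linspace(0.15, 0.35, 5),
              'clf__penalty': ['l1', 'l2', 'elasticnet']}

search = RandomizedSearchCV(pipe, param_distributions=param_dist,
                            n_iter=5, cv=3, random_state=0)
search.fit(X_train, y_train)

# best_estimator_ is the refitted Pipeline, not the classifier itself.
best_pipe = search.best_estimator_
has_coef_on_pipeline = hasattr(best_pipe, 'coef_')

# Look up each fitted step by the name given when the pipeline was built.
scaler = best_pipe.named_steps['scl']
classifier = best_pipe.named_steps['clf']

coef_shape = classifier.coef_.shape            # one row per binary problem
intercept_shape = classifier.intercept_.shape
mean_shape = scaler.mean_.shape                # per-feature means
```

Note that best_estimator_ is only available when the search is run with refit enabled, which is the default.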

Answer 2

One way I found is chained indexing with the steps attribute...

sgd_randomized_pipe.best_estimator_.steps[1][1].coef_

Is this best practice, or is there another way?
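To compare the two access styles, here is a small sketch on a plain fitted Pipeline (no search, synthetic data, all names chosen for illustration). It suggests that positional indexing into steps and lookup through named_steps return the very same fitted object, so the choice is mostly about readability: the name-based form keeps working if steps are later inserted or reordered.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, an assumption for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

pipe = Pipeline([('scl', StandardScaler()),
                 ('clf', SGDClassifier(random_state=0))])
pipe.fit(X, y)

# steps is a list of (name, estimator) tuples in pipeline order.
step_names = [name for name, _ in pipe.steps]

by_position = pipe.steps[1][1]       # second step, estimator part
last_step = pipe.steps[-1][1]        # final step, whatever its index
by_name = pipe.named_steps['clf']    # lookup by the user-given name

# All three refer to the same fitted SGDClassifier instance.
same_object = by_position is by_name and last_step is by_name
coef_shape = by_name.coef_.shape
```

Recent scikit-learn releases also accept pipe['clf'] and pipe[-1] directly; that shortcut may not exist in the older version used in the question.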

Answer 3

I think this should work:

sgd_randomized_pipe.best_estimator_.named_steps['clf'].coef_

Return coefficients from Pipeline object in sklearn
