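Applying GridSearchCV to a scikit-learn pipeline [[feature selection] + [algorithm]] raises the following error

Time: 2019-05-09 23:24:51

Tags: python jupyter-notebook

I want to apply GridSearchCV to a scikit-learn pipeline [[feature selection] + [algorithm]], but it raises the error below. How can I correct the code?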

 from sklearn import svm
 from sklearn.model_selection import GridSearchCV
 from sklearn.pipeline import Pipeline
 from sklearn.feature_selection import SelectKBest
 from sklearn.feature_selection import SelectFromModel
 pipeline1 = Pipeline([ 
    ('feature_selection', SelectFromModel(svm.SVC(kernel='linear'))),
    ('filter'           , SelectKBest(k=11)),
    ('classification'   , svm.SVC(kernel='linear'))
                ])
 grid_parameters_tune = [{'estimator__C': [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]}]
 model = GridSearchCV(pipeline1, grid_parameters_tune, cv=5, n_jobs=-1, verbose=1)
 model.fit(X, y)


ValueError: Invalid parameter estimator for estimator Pipeline(memory=None,
steps=[('feature_union', FeatureUnion(n_jobs=None,
transformer_list=[('filter', SelectKBest(k=10, score_func=<function f_classif at 0x000001ECCBB3E840>)), ('feature_selection', SelectFromModel(estimator=SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', ...r', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False))]). Check the list of available parameters with `estimator.get_params().keys()`.

1 Answer:

Answer 0 (score: 0)

I think the error comes from the names in your grid_parameters_tune. You are trying to access estimator__C, but your pipeline has no step named estimator. Parameter keys take the form <step name>__<parameter>, so renaming the key to classification__C should solve the problem.

If you want to reach the C parameter of the SVC inside SelectFromModel, you can use feature_selection__estimator__C.

Below is a working example with random data. To save some time I changed a few parameters from your original pipeline, so you do not necessarily need to copy it verbatim.

import numpy as np
import pandas as pd
from sklearn import svm
from sklearn.feature_selection import SelectFromModel, SelectKBest
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

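# Toy data: 40 samples with 25 features each.
# X is a simple integer ramp; y holds random binary labels.
X = pd.DataFrame(data=np.arange(1000).reshape(-1, 25))
y = np.random.binomial(1, 0.5, 1000//25)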


pipeline1 = Pipeline(
    [
        ("feature_selection", SelectFromModel(svm.SVC(kernel="linear"))),
        ("filter", SelectKBest(k=11)),
        ("classification", svm.SVC(kernel="linear")),
    ]
)
# The key names the pipeline step ("classification"), then its parameter ("C").
grid_parameters_tune = [{"classification__C": [0.01, 0.1, 1.0, 10.0,]}]
model = GridSearchCV(pipeline1, grid_parameters_tune, cv=3, n_jobs=-1, verbose=1)
model.fit(X, y)
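
As the error message suggests, you can check which parameter names a pipeline accepts before building the grid. The following is a minimal sketch that rebuilds the same pipeline and lists its keys with get_params(); the variable names param_names and c_params are only illustrative and are not part of the original code.

```python
from sklearn import svm
from sklearn.feature_selection import SelectFromModel, SelectKBest
from sklearn.pipeline import Pipeline

# Same pipeline as in the example above.
pipeline1 = Pipeline(
    [
        ("feature_selection", SelectFromModel(svm.SVC(kernel="linear"))),
        ("filter", SelectKBest(k=11)),
        ("classification", svm.SVC(kernel="linear")),
    ]
)

# Every key returned by get_params() is a valid name for a parameter grid.
param_names = sorted(pipeline1.get_params().keys())

# Keep only the keys that end in "__C", i.e. the C parameters of the two SVCs.
c_params = [name for name in param_names if name.endswith("__C")]
print(c_params)

# The key from the question is not among the valid names.
print("estimator__C" in param_names)
```

The filtered list should contain classification__C (the final classifier) and feature_selection__estimator__C (the SVC wrapped by SelectFromModel), which are the two keys discussed in the answer. Either one, or both, can be used as keys in grid_parameters_tune.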