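I'm trying to do some data modeling for my own little project using Pipeline + StandardScaler + OHE + CLF + GridSearchCV + ColumnTransformer.
I expected my code to run without problems, but it doesn't. Instead, I get the following output: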
Fitting 10 folds for each of 36 candidates, totalling 360 fits
[CV] clf__C=0.0001, clf__kernel=rbf, reduce_dims__n_components=4 .....
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-50-c52ff4770002> in <module>()
1 grid = GridSearchCV(clf, param_grid=param_grid, cv=10, n_jobs=1, verbose=2, scoring= 'accuracy')
----> 2 grid.fit(X, y)
3 print(grid.best_score_)
4 print(grid.cv_results_)
13 frames
/usr/local/lib/python3.6/dist-packages/sklearn/base.py in set_params(self, **params)
234 'Check the list of available parameters '
235 'with `estimator.get_params().keys()`.' %
--> 236 (key, self))
237
238 if delim:
ValueError: Invalid parameter clf for estimator Pipeline(memory=None,
steps=[('preprocessor',
ColumnTransformer(n_jobs=None, remainder='drop',
sparse_threshold=0.3,
transformer_weights=None,
transformers=[('num',
Pipeline(memory=None,
steps=[('scale',
StandardScaler(copy=True,
with_mean=True,
with_std=True)),
('reduce_dims',
PCA(copy=True,
iterated_power='auto',
n_components=4,
random_state=None,
svd_solver='aut...
Pipeline(memory=None,
steps=[('onehot',
OneHotEncoder(categories='auto',
drop=None,
dtype=<class 'numpy.float64'>,
handle_unknown='ignore',
sparse=True))],
verbose=False),
['Method', 'Regionname',
'Type'])],
verbose=False)),
('SVR',
SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1,
gamma='scale', kernel='rbf', max_iter=-1, shrinking=True,
tol=0.001, verbose=False))],
verbose=False). Check the list of available parameters with `estimator.get_params().keys()`.
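The last line of the error suggests checking `estimator.get_params().keys()`. A minimal sketch of that check follows, rebuilding a pipeline with the same step names that appear in the repr above. The numeric column list and the names `model`, `param_names`, and `tunable` are assumptions for illustration only; no data is needed, because `get_params()` works on an unfitted estimator.

```python
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVR

# Same step names as in the traceback: 'scale', 'reduce_dims', 'onehot',
# 'num', 'preprocessor', 'SVR'.
numeric = Pipeline(steps=[("scale", StandardScaler()),
                          ("reduce_dims", PCA())])
categorical = Pipeline(steps=[("onehot", OneHotEncoder(handle_unknown="ignore"))])

preprocessor = ColumnTransformer(transformers=[
    ("num", numeric, ["Bathroom", "Rooms"]),  # assumed numeric columns
    ("cat", categorical, ["Method", "Regionname", "Type"]),
])

model = Pipeline(steps=[("preprocessor", preprocessor), ("SVR", SVR())])

# Every key listed here is a name that GridSearchCV would accept in param_grid.
param_names = sorted(model.get_params().keys())

# Narrow the list down to the three settings the grid is trying to tune.
tunable = [name for name in param_names
           if name.endswith(("__C", "__kernel", "__n_components"))]
print(tunable)
```

In this sketch the keys are prefixed with the step names given to `Pipeline` and `ColumnTransformer`, so a prefix such as `clf__` only resolves if a step is actually named `clf`.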
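I have tried following the user guide on the sklearn website, but no matter how hard I try, the same error shown above keeps popping up. Here is my code: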
X = Melbourne_housing[['Bathroom', 'Method', 'Regionname', 'Rooms', 'Type']]
y = Melbourne_housing[['Price']]
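import numpy as np  # needed for np.logspace in param_grid
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler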
numeric_features = ['Bathroom','Rooms','Price']
numeric_transformer = Pipeline(steps=[
    ('scale', StandardScaler()),
    ('reduce_dims', PCA(n_components=4))
])
categorical_features = ['Method', 'Regionname', 'Type']
categorical_transformer = Pipeline(steps=[
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])
param_grid = dict(reduce_dims__n_components=[4, 6, 8],
                  clf__C=np.logspace(-4, 1, 6),
                  clf__kernel=['rbf', 'linear'])
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])
# Append classifier to preprocessing pipeline.
# Now we have a full prediction pipeline.
clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('SVR', SVR())])
grid = GridSearchCV(clf, param_grid=param_grid, cv=10, n_jobs=1, verbose=2, scoring= 'accuracy')
grid.fit(X, y)
print(grid.best_score_)
print(grid.cv_results_)
I'm new to Python and machine learning (only a few months of experience), so I would really appreciate your help.
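For comparison, here is a hedged sketch of how the grid keys could be spelled so that they match the step names `preprocessor`, `num`, `reduce_dims`, and `SVR`. It runs on a small synthetic DataFrame standing in for `Melbourne_housing`. The data, the `toy` name, the reduced grid, `cv=3`, and the `r2` scorer are all assumptions and not part of the original setup. `Price` is left out of the numeric features because it is the target and not a column of `X`, and `n_components` is capped at 2 because only two numeric columns remain.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for Melbourne_housing (made-up values, prices in millions).
rng = np.random.default_rng(0)
n = 60
toy = pd.DataFrame({
    "Bathroom": rng.integers(1, 4, size=n),
    "Rooms": rng.integers(1, 6, size=n),
    "Method": rng.choice(["S", "SP", "PI"], size=n),
    "Regionname": rng.choice(["Northern Metropolitan", "Southern Metropolitan"], size=n),
    "Type": rng.choice(["h", "u", "t"], size=n),
})
toy["Price"] = 0.2 * toy["Rooms"] + 0.05 * toy["Bathroom"] + rng.normal(0, 0.01, size=n)

X = toy[["Bathroom", "Method", "Regionname", "Rooms", "Type"]]
y = toy["Price"]  # 1-D target

numeric_features = ["Bathroom", "Rooms"]  # target column not included
categorical_features = ["Method", "Regionname", "Type"]

numeric_transformer = Pipeline(steps=[
    ("scale", StandardScaler()),
    ("reduce_dims", PCA()),
])
categorical_transformer = Pipeline(steps=[
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocessor = ColumnTransformer(transformers=[
    ("num", numeric_transformer, numeric_features),
    ("cat", categorical_transformer, categorical_features),
])
model = Pipeline(steps=[("preprocessor", preprocessor), ("SVR", SVR())])

# Keys follow the nesting: <pipeline step>__<transformer name>__<inner step>__<param>.
param_grid = {
    "preprocessor__num__reduce_dims__n_components": [1, 2],
    "SVR__C": [0.1, 1.0, 10.0],
    "SVR__kernel": ["rbf", "linear"],
}

# SVR is a regressor, so a regression scorer is used here instead of 'accuracy'.
grid = GridSearchCV(model, param_grid=param_grid, cv=3, scoring="r2")
grid.fit(X, y)
print(grid.best_params_)
print(grid.best_score_)
```

Renaming the final step from `'SVR'` to `'clf'` would be another way to make the original `clf__C` and `clf__kernel` keys resolve; the sketch keeps the step name from the question and changes the keys instead.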