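I'm quite new to this and still have a lot to learn, so please bear with me while I figure out how to do things well. I'm using a Kaggle dataset and wanted to use pipelines to make my code cleaner and to simplify cleaning the categorical and numerical data. Everything ran without errors until I tried to fit the model at the end, which gave me ValueError: too many values to unpack (expected 3). Any help would be greatly appreciated!
I've looked through other questions similar to mine, but didn't find one that specifically matched my problem. I thought it might have something to do with my variable naming, but so far I haven't been able to find my mistake.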
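import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
df = pd.read_csv('master.csv')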
df = df.drop('country-year', axis=1)
df[' gdp_for_year ($) '] = df[' gdp_for_year ($) '].str.replace(',','')
df[' gdp_for_year ($) '] = df[' gdp_for_year ($) '].astype(str).astype(float)
#print(df.info())
y = df.suicides_no
features = ['country', 'sex', 'age', 'generation', 'year', 'population',
            'HDI for year', ' gdp_for_year ($) ', 'gdp_per_capita ($)']
X = df[features].copy()
X_train, X_valid, y_train, y_valid = train_test_split(X, y, train_size=0.8, test_size=0.2, random_state=0)
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder
numerical_transformer = SimpleImputer(strategy='mean')
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))
])
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, 'suicides_no', 'population', 'HDI for year', ' gdp_for_year ($) ', 'gdp_per_capita ($)'),
        ('cat', categorical_transformer, 'country', 'sex', 'generation', 'year', 'age')
    ])
model_1 = RandomForestRegressor(n_estimators=100, random_state=0)
my_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                              ('model', model_1)
                              ])
my_pipeline.fit(X_train, y_train)
Traceback (most recent call last):
  File "...Suicide Rates/SR.py", line 86, in <module>
    my_pipeline.fit(X_train, y_train)
  File "...pipeline.py", line 265, in fit
    Xt, fit_params = self._fit(X, y, **fit_params)
  File "...pipeline.py", line 230, in _fit
    **fit_params_steps[name])
  File "...memory.py", line 342, in __call__
    return self.func(*args, **kwargs)
  File "...pipeline.py", line 614, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)
  File "...compose\_column_transformer.py", line 445, in fit_transform
    self._validate_transformers()
  File "...ompose\_column_transformer.py", line 256, in _validate_transformers
    names, transformers, _ = zip(*self.transformers)
ValueError: too many values to unpack (expected 3)
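
One plausible reading of the last frame: ColumnTransformer unpacks each entry of transformers as a 3-item tuple (name, transformer, columns), but the tuples above carry one item per column name, so each has 7 items. The sketch below reproduces that unpacking in plain Python and shows one possible corrected shape. The placeholder strings standing in for the transformer objects and the names broken, fixed, and broken_error are made up for illustration and are not part of the original script.

```python
# Placeholder strings stand in for the real transformer objects; only the
# shape of each tuple matters for this illustration (assumption for the demo).
numerical_transformer = "numerical_transformer"
categorical_transformer = "categorical_transformer"

# Shape used in the question: every column name is its own tuple element,
# so each tuple has 7 items instead of 3.
broken = [
    ('num', numerical_transformer, 'suicides_no', 'population', 'HDI for year',
     ' gdp_for_year ($) ', 'gdp_per_capita ($)'),
    ('cat', categorical_transformer, 'country', 'sex', 'generation', 'year', 'age'),
]

try:
    # Same unpacking as the last frame of the traceback.
    names, transformers, _ = zip(*broken)
except ValueError as exc:
    broken_error = str(exc)
    print(broken_error)

# Possible fix: group the column names into one list per transformer.
# 'suicides_no' is left out because it is the target (y), not a column of X.
fixed = [
    ('num', numerical_transformer,
     ['population', 'HDI for year', ' gdp_for_year ($) ', 'gdp_per_capita ($)']),
    ('cat', categorical_transformer,
     ['country', 'sex', 'generation', 'year', 'age']),
]

# With exactly three items per tuple, the unpacking succeeds.
names, transformers, columns = zip(*fixed)
print(names)
```

With the real numerical_transformer and categorical_transformer objects in place of the placeholder strings, a list shaped like fixed is what ColumnTransformer(transformers=...) expects. Note that 'suicides_no' is not in features, so keeping it in the column list would likely trigger a different error about a missing column once the unpacking problem is resolved.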