Question

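I am building a pipeline that converts some categorical features to integers before feeding the data into a LightGBM model. The dataset contains both categorical and non-categorical features.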

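from lightgbm import LGBMClassifier
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, OrdinalEncoder

categorical_features = ['categorical', 'features']

# Fill missing categorical values with 'NA', then encode each category as an ordinal code
category_pipeline = make_pipeline(SimpleImputer(strategy='constant', fill_value='NA', missing_values=None),
                                  OrdinalEncoder())

# Encode the listed columns; every other column is passed through unchanged
column_transformer = ColumnTransformer(transformers=[('cat', category_pipeline, categorical_features)],
                                       remainder='passthrough')

# prepare_dataset and encode_zip are the poster's own functions (not shown)
pipeline = make_pipeline(FunctionTransformer(prepare_dataset),
                         FunctionTransformer(encode_zip),
                         column_transformer,
                         LGBMClassifier())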

Although the OrdinalEncoder step encodes all of the categorical features as integers, calling pipeline.fit() raises the following error:

could not convert string to float

It looks to me as though column_transformer is not executed at all before LGBMClassifier's fit method is called.
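
One way to test that suspicion is to run the transformer on its own and look at what it returns. The sketch below is only an illustration under stated assumptions: the toy DataFrame and its column names ("color", "city", "amount") are hypothetical, the imputer is left out for brevity, and the helper converts_to_float is invented for this check. It shows one possible cause, not a confirmed diagnosis: a string column that is missing from categorical_features is handed on unchanged by remainder='passthrough'.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data (not from the question). "city" is a string column
# that is deliberately left out of categorical_features.
X = pd.DataFrame({
    "color": pd.Series(["red", "blue", "red"], dtype=object),
    "city": pd.Series(["paris", "rome", "oslo"], dtype=object),
    "amount": [1.0, 2.5, 3.0],
})
categorical_features = ["color"]

column_transformer = ColumnTransformer(
    transformers=[("cat", OrdinalEncoder(), categorical_features)],
    remainder="passthrough",
)

# Run the transformer alone to see what the final estimator would receive.
# The encoded columns come first, followed by the passthrough columns.
Xt = column_transformer.fit_transform(X)


def converts_to_float(values):
    """Hypothetical helper: True if every value can be cast to float."""
    try:
        np.asarray(values, dtype=float)
        return True
    except ValueError:
        return False


for index in range(Xt.shape[1]):
    print(index, converts_to_float(Xt[:, index]))
```

If any column in the real data fails this check, the transformer did run, and the strings come from a column that was passed through rather than encoded. Such a column would need to be added to categorical_features or removed before it reaches the model.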

Has anyone run into this problem?

ColumnTransformer is skipped in sklearn.pipeline

0 answers: