
I have a dataset with the following columns, each containing a fixed set of categories. Something like this:

Column1 = [red,blue,orange,green,red,red,blue,orange]

Column2 = [male, female, male, male, female, male]

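I am trying to predict some data with the XGBoost algorithm, so I have to encode the categorical values first.

How can I do this with sklearn's OneHotEncoder? According to the documentation, its signature is: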

OneHotEncoder(n_values=None, categorical_features=None, categories=None, drop=None, sparse=True, dtype=<class 'numpy.float64'>, handle_unknown='error')

As far as I understand, if I want to encode column1, I have to use

categorical_features = column1

categories = [red,blue,orange,green]

But what should I do if I want to one-hot encode more than one column?
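For illustration, here is a minimal sketch of one way this might look, assuming the `categories` parameter accepts one list of labels per column (in column order) and that the encoder is fitted on a 2-D array holding both columns. The variable names and the toy data are hypothetical: the two lists from the question have different lengths, so Column1 is cut to its first six values here. The `n_values` and `categorical_features` parameters are left out because they were deprecated in scikit-learn 0.20.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: Column1 trimmed to six rows so both columns line up.
column1 = ["red", "blue", "orange", "green", "red", "red"]
column2 = ["male", "female", "male", "male", "female", "male"]

# One row per sample, one column per feature: shape (6, 2).
X = np.array(list(zip(column1, column2)), dtype=object)

# categories[i] lists the labels expected in the i-th column of X.
encoder = OneHotEncoder(
    categories=[["red", "blue", "orange", "green"], ["male", "female"]],
    handle_unknown="error",
)

# fit_transform returns a sparse matrix by default; densify it for inspection.
encoded = encoder.fit_transform(X).toarray()

# Four indicator columns for Column1 followed by two for Column2.
print(encoded.shape)
print(encoded[0])
```

Each row of `encoded` should contain exactly two ones, one per original column. If the explicit label order does not matter, leaving `categories` at its default lets the encoder infer the labels from the data.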

Using sklearn's one-hot encoder with multiple columns that have category labels

0 answers: