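The following code one-hot encodes the specified columns (features). I have 54 features and want to encode all of them, but for some reason the most I can encode is 25. If I increase the number of features to encode, .fit_transform() appears to return nothing.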
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
# ======================== 1 - Importing the data ========================
# - Dataset has 54 features and 1 label (55 columns)
# - 10k examples
datasetPath = "10k-States(0).csv"
dataset = pd.read_csv(datasetPath)
x_train = dataset.iloc[:, 0:54]
y_train = dataset.iloc[:, 54]
# ===================== 2 - Encode x (input) values ======================
# Columns to be encoded (should be 54, but 25 is max that works...)
cols_to_encode = list(range(25))
# 'categories' parameter is multiplied by same number as above,
# every feature has the same classes (labels)
transformer = ColumnTransformer(
    [('one_hot_encoder', OneHotEncoder(categories=[[0,1,2,3,4,5]]*25), cols_to_encode)],
    remainder='passthrough'
)
x = transformer.fit_transform(x_train)
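This all works fine, but as soon as I go up to 26 or more columns, the value of x is shown as (), with nothing in it. I don't know what is going on...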
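One possible explanation, which the post does not confirm, is the `sparse_threshold` parameter of `ColumnTransformer` (default 0.3). When any transformer produces sparse output, the stacked result is returned as a SciPy sparse matrix if its overall density falls below that threshold. Here every input column contributes one stored value per row, so the density is 54/179 (about 0.302) with 25 encoded columns and 54/184 (about 0.293) with 26. The result would therefore switch from a NumPy array to a sparse matrix exactly at 26 columns, and some variable viewers may display a sparse matrix as empty (an assumption about what `()` refers to). The sketch below uses synthetic data in place of the CSV, and the helper names `encode` and `density` are hypothetical.

```python
import numpy as np
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the CSV (assumption): 1000 rows, 54 features in 0..5.
rng = np.random.default_rng(0)
x_train = rng.integers(0, 6, size=(1000, 54))


def encode(n_cols, **kwargs):
    """One-hot encode the first n_cols columns and pass the rest through."""
    transformer = ColumnTransformer(
        [("one_hot_encoder",
          OneHotEncoder(categories=[[0, 1, 2, 3, 4, 5]] * n_cols),
          list(range(n_cols)))],
        remainder="passthrough",
        **kwargs,
    )
    return transformer.fit_transform(x_train)


def density(n_cols):
    """Stored values per row divided by output columns per row."""
    return 54 / (6 * n_cols + (54 - n_cols))


x25 = encode(25)                            # density >= 0.3: dense ndarray
x26 = encode(26)                            # density < 0.3: sparse matrix
x54_dense = encode(54, sparse_threshold=0)  # threshold 0 forces dense output

for name, value in [("25", x25), ("26", x26), ("54, threshold 0", x54_dense)]:
    print(name, type(value).__name__, value.shape, sparse.issparse(value))
```

If this is the cause, the data is not lost: passing `sparse_threshold=0` to `ColumnTransformer` keeps the output dense regardless of how many columns are encoded.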
Answer 0 (score: 0)
Try this:
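from sklearn.preprocessing import OneHotEncoder
columnnumberlist = []  # insert here all the column numbers to encode
# Note: categorical_features was deprecated in scikit-learn 0.20 and removed in 0.22,
# so this call only works on older versions.
one = OneHotEncoder(categorical_features=columnnumberlist)
X = one.fit_transform(X)
X = X.toarray()  # convert the sparse result to a dense NumPy array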
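On current scikit-learn versions, where `categorical_features` no longer exists, the same idea can be sketched with `ColumnTransformer` followed by a guarded `.toarray()`. This is a minimal sketch rather than the answerer's code: the data is synthetic, and the names `column_numbers` and `X_encoded` are made up for illustration.

```python
import numpy as np
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in data: 200 rows, 54 integer features in 0..5.
rng = np.random.default_rng(1)
X = rng.integers(0, 6, size=(200, 54))

column_numbers = list(range(54))  # columns to encode (all 54 here)

encoder = ColumnTransformer(
    [("one_hot_encoder",
      OneHotEncoder(categories=[[0, 1, 2, 3, 4, 5]] * len(column_numbers)),
      column_numbers)],
    remainder="passthrough",
)
X_encoded = encoder.fit_transform(X)

# The result may be a SciPy sparse matrix; convert it only when that is the case.
if sparse.issparse(X_encoded):
    X_encoded = X_encoded.toarray()

print(type(X_encoded).__name__, X_encoded.shape)
```

Deriving the length of `categories` from `len(column_numbers)` avoids having to update the column count in two places when the selection changes.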