I am trying to encode the non-numeric columns of a pandas DataFrame as numeric values. I am using:
import numpy as np
from sklearn.preprocessing import LabelEncoder

df = df.fillna('0')
# Random split, roughly 80% train / 20% test
msk = np.random.rand(len(df)) < 0.8
df_train = df[msk]
df_test = df[~msk]
# One LabelEncoder per non-numeric column
columns_to_encode = df.select_dtypes(exclude=[np.number]).columns
encoder_dict = {col: LabelEncoder() for col in columns_to_encode}
df_train_enc = df_train
df_test_enc = df_test
for col in columns_to_encode:
    encoder_dict[col].fit_transform(df_train_enc[col])
However, this raises the error TypeError: '<' not supported between instances of 'str' and 'float'. What am I missing here? I thought LabelEncoder should be able to convert strings to numbers...
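For reference, here is a minimal pure-Python sketch of how an error like this can arise. It assumes that the encoder has to sort the unique values of a column, and that an object column still holds a mix of strings and floats after fillna('0'), which only replaces missing values. The helper encode_labels is a hypothetical stand-in written for illustration, not scikit-learn's implementation.

```python
def encode_labels(values):
    """Hypothetical stand-in for a label encoder: sort the unique values
    and map each one to its position in the sorted order."""
    classes = sorted(set(values))  # sorting is where mixed types fail
    mapping = {value: index for index, value in enumerate(classes)}
    return [mapping[value] for value in values]


# An object-like column that mixes strings with a float (assumed scenario).
mixed = ["a", "b", 1.5, "0"]

error_message = None
try:
    encode_labels(mixed)
except TypeError as exc:
    # Python cannot order a str against a float, so sorting raises.
    error_message = str(exc)

# Casting every value to str first makes the column uniformly comparable.
as_strings = [str(value) for value in mixed]
codes = encode_labels(as_strings)

print(error_message)
print(codes)
```

If that assumption holds for the DataFrame in question, casting each column to strings before encoding, for example with df[col].astype(str), might be worth trying.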