Question

I am working on a "Classifying income data using Support Vector Machines" model, and the code below converts the string data to numeric data.

Code:

label_encoder =[]
X_encoded = np.empty(X.shape)
for i, item in enumerate(X[0]):
    if item.isdigit():
        X_encoded[:,i] = X[:,i]
    else:
        label_encoder.append(preprocessing.LabelEncoder())
        X_encoded[:,i] = label_encoder[-1].fit_transform(X[:,i])
X = X_encoded[:, :-1].astype(int) 
y = X_encoded[:, -1].astype(int)

Error:

<ipython-input-27-6393acaab006> in <module>()
      2 label_encoder =[]
      3 X_encoded = np.empty(X.shape)
----> 4 for i, item in enumerate(X[0]):
      5     if item.isdigit():
      6         X_encoded[:,i] = X[:,i]

IndexError: index 0 is out of bounds for axis 0 with size 0

Answer 1

If your dataset contains categorical data, use the LabelEncoder class to encode it.

from sklearn.preprocessing import LabelEncoder

# `column` is the index of the categorical column to encode
labelEncoder_X = LabelEncoder()
X[:, column] = labelEncoder_X.fit_transform(X[:, column])

This converts the string data in the given column to numeric labels starting from 0.
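As a rough illustration of the per-column encoding described here, the sketch below runs the loop from the question on a tiny array. The sample rows and the helper name `encode_columns` are made up for this example and are not from the original post. It also checks for an empty `X` first, since "index 0 is out of bounds for axis 0 with size 0" means the array has no rows by the time `X[0]` is read, so it is worth confirming the data was actually loaded.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder


def encode_columns(X):
    """Label-encode every non-numeric column of a 2-D array of strings."""
    X = np.asarray(X)
    if X.ndim != 2 or X.shape[0] == 0:
        # An empty X is what makes X[0] raise the IndexError in the question.
        raise ValueError("X must be a non-empty 2-D array; check that the data was loaded")

    X_encoded = np.empty(X.shape, dtype=int)
    encoders = {}  # column index -> fitted LabelEncoder
    for i, item in enumerate(X[0]):
        if item.isdigit():
            # Numeric column: copy the values over as integers.
            X_encoded[:, i] = X[:, i].astype(int)
        else:
            # String column: map each distinct string to an integer label.
            encoders[i] = LabelEncoder()
            X_encoded[:, i] = encoders[i].fit_transform(X[:, i])
    return X_encoded, encoders


# Hypothetical sample rows (age, workclass, income label), for illustration only.
sample = np.array([
    ["39", "State-gov", "<=50K"],
    ["50", "Private", ">50K"],
    ["38", "Private", "<=50K"],
])
X_encoded, encoders = encode_columns(sample)
X, y = X_encoded[:, :-1], X_encoded[:, -1]
```

Keeping each fitted encoder makes it possible to map the integers back to the original strings later with `encoders[i].inverse_transform(...)`.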
