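I have a CSV file containing categorical data (for example, gender) that I am trying to convert to a Boolean format (Male = 1, Female = 2, etc.). I have the following code: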
from sklearn.feature_extraction import DictVectorizer

# One-hot encode string-valued keys; numeric values such as 'k' pass through unchanged.
v = DictVectorizer(sparse=True)
D = [{'k': 1, 'gender': 'Female'},
     {'k': 2, 'gender': 'Male'},
     {'k': 3, 'gender': 'NULL'}]
X = v.fit_transform(D)
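While this works fine, I can't work out how to iterate over every required cell of a particular column (gender, in this case). The data looks like this: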
patient_id DAY_NUMBER_IN_MONTH race gender marital_status
11511 20 Other Male Unknown
9882613 25 Unknown Female Unknown
32190339 13 Caucasian Female Married
32190339 13 Caucasian Female Married
...
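
For illustration, one possible way to walk over the cells of a single column is with the standard-library csv.DictReader. This is only a minimal sketch: it assumes the file is comma-delimited with a header row, it inlines the sample rows shown above instead of opening a real file, and the helper name rows_for_columns is hypothetical.

```python
import csv
import io

# Assumption: the real file is comma-delimited with a header row. The sample
# rows from the question are inlined so the sketch runs without a file on disk.
SAMPLE_CSV = """patient_id,DAY_NUMBER_IN_MONTH,race,gender,marital_status
11511,20,Other,Male,Unknown
9882613,25,Unknown,Female,Unknown
32190339,13,Caucasian,Female,Married
32190339,13,Caucasian,Female,Married
"""


def rows_for_columns(handle, columns):
    """Return one dict per CSV row, keeping only the requested columns."""
    reader = csv.DictReader(handle)
    return [{col: row[col] for col in columns} for row in reader]


# Hypothetical helper applied to the 'gender' column only.
records = rows_for_columns(io.StringIO(SAMPLE_CSV), ["gender"])

# Iterate over every cell of the chosen column.
for record in records:
    print(record["gender"])
```

The resulting list of dicts has the same shape as D above, so it could presumably be passed straight to v.fit_transform(records). With a file on disk, the io.StringIO object would be replaced by an open file handle, for example one opened with newline="".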