Converting an object-dtype pandas dataframe column to a numpy array

Date: 2018-05-29 14:50:06

Tags: python arrays pandas numpy dataframe

I have a pandas dataframe that holds image IDs, image classes, and image data:

img_train.head(5)

      ID  index  class                                               data
0  10472  10472      0  [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
1   7655   7655      0  [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
2   6197   6197      0  [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
3   9741   9741      0  [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
4   9169   9169      0  [[[255, 255, 255, 0], [255, 255, 255, 0], [255...

I am trying to convert each of these columns into a numpy array:

import numpy as np

train_img_array = np.array([])
train_id_array = np.array([])
train_lab_array = np.array([])
for index, row in img_train.iterrows():
    imgid = row['ID']
    imgclass = row['class']
    imgdata = row['data']
    #print(imgdata)
    train_img_array = np.append(train_img_array, imgdata )
    train_lab_array = np.append(train_lab_array, imgclass )
    train_id_array = np.append(train_id_array, imgid )

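However, the column that holds the image data, which has dtype "object", is not converted into corresponding rows of the numpy array. For example, these are the shapes of the three numpy arrays after processing 58 rows of the original dataframe: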

train_img_array.shape
train_lab_array.shape
train_id_array.shape
(93615200,)
(58,)
(58,)
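
A note on these shapes: when `np.append` is called without an `axis` argument, it flattens both inputs before concatenating, which would explain why the image array ends up one-dimensional while the ID and label arrays have one entry per row. A minimal sketch with two made-up 2x2 RGBA images (`img_a` and `img_b` are hypothetical, not taken from the original data):

```python
import numpy as np

# Two hypothetical 2x2 images with 4 channels each, stored as nested lists
img_a = [[[255, 255, 255, 0], [255, 255, 255, 0]],
         [[255, 255, 255, 0], [255, 255, 255, 0]]]
img_b = [[[0, 0, 0, 255], [0, 0, 0, 255]],
         [[0, 0, 0, 255], [0, 0, 0, 255]]]

acc = np.array([])
for img in (img_a, img_b):
    # With axis=None (the default), np.append flattens `acc` and `img`
    acc = np.append(acc, img)

flat_shape = acc.shape  # one value per pixel channel, image boundaries are lost
print(flat_shape)
```

Each image contributes 2 * 2 * 4 = 16 values, so the result has 32 elements and no per-image axis.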

How can I fix this?

1 Answer:

Answer 0: (score: -1)

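I found the answer to this question. It is very straightforward; I just didn't see it at first. This is how I got the object data into a numpy array as well (using `.values` :))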

# .values returns the underlying numpy array of each column,
# so no empty arrays need to be created first
train_id_array = img_train['ID'].values
train_lab_array = img_train['class'].values
# 'data' has dtype object, so this is a 1-D object array with one image per row
train_img_array = img_train['data'].values
#train_img_array = np.row_stack(img_train['data'])
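
To illustrate what `.values` returns for an object column, and how the images could be combined into one numeric array, here is a minimal sketch. The dataframe `demo` and its tiny 2x3 RGBA images are made up for illustration, and the stacking step assumes every image has the same shape (`np.stack` raises a `ValueError` otherwise), which may not hold for the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for img_train: 5 rows, each image a nested list
# of shape (2, 3, 4), i.e. height 2, width 3, 4 channels
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(2, 3, 4)).tolist() for _ in range(5)]
demo = pd.DataFrame({
    "ID": [10472, 7655, 6197, 9741, 9169],
    "class": [0, 0, 0, 0, 0],
    "data": images,
})

# .values on the object column: a 1-D object array, one nested list per row
obj_array = demo["data"].values
print(obj_array.shape, obj_array.dtype)

# Assumption: all images share one shape, so they can be stacked
# into a single array of shape (n_images, height, width, channels)
stacked = np.stack([np.asarray(img, dtype=np.uint8) for img in demo["data"]])
print(stacked.shape)
```

With `.values` alone, each element is still a Python nested list; stacking is what produces a regular 4-D numeric array. In current pandas, `Series.to_numpy()` is the recommended equivalent of `.values`.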