Question

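I am extracting all the integer values from a specific column (holding left, top, width, and height) of a CSV file that has many rows and columns. I used pandas to isolate the columns I am interested in, but I am stuck on how to use a specific part of an array element.

Let me explain: I need the column of the CSV file that holds the "left, top, width, and height" attributes so that I can get xmin, ymin, xmax, and ymax (the coordinates of the boxes in an image). An example row from this column looks like this: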

[{"left":171,"top":0,"width":163,"height":137,"label":"styrofoam container"},{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]

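I need to extract 171, 0, 163, and 137 to perform the operations required to find xmax, xmin, ymax, and ymin.

The line above is one row of my pandas array. How do I extract the numbers I need to run those operations?

Here is the code I wrote to extract the column; this is what I have so far: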

import os
import csv
import pandas
import numpy as np

csvPath = "/path/of/my/csvfile/csvfile.csv"

data = pandas.read_csv(csvPath)
csv_coords = data['Answer.annotation_data'].values #column with the coordinates
image_name = data['Input.image_url'].values
print(csv_coords[2])

Answer 1

Use:

import ast
import numpy as np
import pandas as pd

d = {'Answer.annotation_data': ['[{"left":171,"top":0,"width":163,"height":137,"label":"styrofoam container"},{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]',
                                '[{"left":170,"top":10,"width":173,"height":157,"label":"styrofoam container"},{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]']}
df = pd.DataFrame(d)

print (df)
                              Answer.annotation_data
0  [{"left":171,"top":0,"width":163,"height":137,...
1  [{"left":170,"top":10,"width":173,"height":157...

#convert string data to list of dicts if necessary
df['Answer.annotation_data'] = df['Answer.annotation_data'].apply(ast.literal_eval)
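Since the cells shown in the question are valid JSON, `json.loads` is another way to parse them. `ast.literal_eval` only accepts Python literals, so it would fail on JSON values such as `true`, `false`, or `null`. The sketch below assumes every cell is a JSON list of box dicts like the sample row; `cell` and `first_box` are hypothetical names used only for illustration.

```python
import json
import pandas as pd

# One sample cell, shortened from the row shown in the question.
cell = '[{"left":171,"top":0,"width":163,"height":137,"label":"styrofoam container"}]'
df = pd.DataFrame({'Answer.annotation_data': [cell]})

# Parse each JSON string into a list of dicts.
df['Answer.annotation_data'] = df['Answer.annotation_data'].apply(json.loads)

# First box of the first row; its fields are now plain Python ints.
first_box = df.loc[0, 'Answer.annotation_data'][0]
print(first_box['left'], first_box['top'], first_box['width'], first_box['height'])
```

After parsing, each cell is an ordinary list, so the usual indexing and dict lookups apply.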

For each value in cols, extract the values from the dicts and return a DataFrame, then join the results together with concat:

def get_val(val):
    comb = [[y.get(val, np.nan) for y in x] for x in df['Answer.annotation_data']]
    return pd.DataFrame(comb).add_prefix('{}_'.format(val))

cols = ['left','top','width','height']
df1 = pd.concat([get_val(x) for x in cols], axis=1)
print (df1)
   left_0  left_1  top_0  top_1  width_0  width_1  height_0  height_1
0     171     222      0     42      163       45       137        70
1     170     222     10     42      173       45       157        70
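To get from these columns to the box corners the question asks for, a minimal sketch follows. It assumes the usual image convention in which (left, top) is the top-left corner, so xmin = left, ymin = top, xmax = left + width, and ymax = top + height; the question does not state this explicitly. The `corners` and `n_boxes` names are hypothetical.

```python
import pandas as pd

# df1 as printed above, rebuilt by hand so the snippet runs on its own.
df1 = pd.DataFrame({
    'left_0': [171, 170], 'left_1': [222, 222],
    'top_0': [0, 10], 'top_1': [42, 42],
    'width_0': [163, 173], 'width_1': [45, 45],
    'height_0': [137, 157], 'height_1': [70, 70],
})

# Count the boxes per row from the left_* columns.
n_boxes = sum(col.startswith('left_') for col in df1.columns)

# Assumed convention: (left, top) is the top-left corner of each box.
corners = pd.DataFrame(index=df1.index)
for i in range(n_boxes):
    corners['xmin_{}'.format(i)] = df1['left_{}'.format(i)]
    corners['ymin_{}'.format(i)] = df1['top_{}'.format(i)]
    corners['xmax_{}'.format(i)] = df1['left_{}'.format(i)] + df1['width_{}'.format(i)]
    corners['ymax_{}'.format(i)] = df1['top_{}'.format(i)] + df1['height_{}'.format(i)]

print(corners)
```

If some rows contain fewer boxes than others, the missing entries are NaN in df1 and stay NaN in the computed corners.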

Answer 2

To access a single field in a DataFrame:

`data.loc[row][column]` or `data.loc[row, column]`

For example:

`data.loc[0]['left']`

To find, for example, the minimum value across the whole top column:

`min(data['top'])`
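Note that the CSV as read in the question has no `left` or `top` column; those fields are nested inside the JSON strings of `Answer.annotation_data`. One way to make the lookups above work is to flatten the column first so that every box gets its own row. The sketch below is one possible approach, assuming each cell is a JSON list of box dicts; `boxes` is a hypothetical name.

```python
import json
import pandas as pd

# Two sample cells in the same shape as the row shown in the question.
data = pd.DataFrame({'Answer.annotation_data': [
    '[{"left":171,"top":0,"width":163,"height":137,"label":"styrofoam container"},'
    '{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]',
    '[{"left":170,"top":10,"width":173,"height":157,"label":"styrofoam container"},'
    '{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]',
]})

# Parse each cell, then give every box its own row (empty cells are dropped).
parsed = data['Answer.annotation_data'].apply(json.loads)
exploded = parsed.explode().dropna()

# One column per field; the index still points at the original CSV row.
boxes = pd.DataFrame(exploded.tolist(), index=exploded.index)

print(boxes.loc[0, 'left'])   # all 'left' values for CSV row 0
print(boxes['top'].min())     # smallest 'top' over every box
```

Because the index repeats once per box, `boxes.loc[0, 'left']` returns every `left` value belonging to the first CSV row rather than a single number.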

Extract part of an array element using Python
