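I'm extracting all the integer values from one particular column (left, top, width and height) of a CSV file that has many rows and columns. I used pandas to isolate the columns I'm interested in, but I'm stuck on how to work with a specific part of the array.
Let me explain: I need the column of the CSV file that holds the "left", "top", "width" and "height" attributes so that I can compute xmin, ymin, xmax and ymax (the coordinates of a box in the image). A sample row from this column looks like this: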
[{"left":171,"top":0,"width":163,"height":137,"label":"styrofoam container"},{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]
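I need to extract 171, 0, 163 and 137 so I can do the calculations needed to find xmax, xmin, ymax and ymin.
The line above is a single row of my pandas array. How do I extract the numbers I need for those calculations?
Here is the code I wrote to extract the columns, which is what I have so far: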
import os
import csv
import pandas
import numpy as np
csvPath = "/path/of/my/csvfile/csvfile.csv"
data = pandas.read_csv(csvPath)
csv_coords = data['Answer.annotation_data'].values  # column with the coordinates
image_name = data['Input.image_url'].values
print(csv_coords[2])
Answer 0 (score: 1)
Use:
import ast
import numpy as np
import pandas as pd
d = {'Answer.annotation_data': ['[{"left":171,"top":0,"width":163,"height":137,"label":"styrofoam container"},{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]',
'[{"left":170,"top":10,"width":173,"height":157,"label":"styrofoam container"},{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]']}
df = pd.DataFrame(d)
print (df)
                              Answer.annotation_data
0  [{"left":171,"top":0,"width":163,"height":137,...
1  [{"left":170,"top":10,"width":173,"height":157...
#convert string data to list of dicts if necessary
df['Answer.annotation_data'] = df['Answer.annotation_data'].apply(ast.literal_eval)
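Since the sample cell is valid JSON, `json.loads` is another way to do the same conversion. Unlike `ast.literal_eval`, it would also accept JSON literals such as `true`, `false` and `null` if they ever appeared in the data. A minimal sketch, using a made-up one-row frame (`sample`, `parsed` and `first_box` are hypothetical names, not from the answer):

```python
import json
import pandas as pd

# Hypothetical one-row frame mirroring the sample cell from the question.
sample = pd.DataFrame({
    'Answer.annotation_data': [
        '[{"left":171,"top":0,"width":163,"height":137,"label":"styrofoam container"},'
        '{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]'
    ]
})

# Parse each JSON string into a list of dicts.
parsed = sample['Answer.annotation_data'].apply(json.loads)

# First box of the first row.
first_box = parsed[0][0]
print(first_box['left'], first_box['top'], first_box['width'], first_box['height'])
# prints: 171 0 163 137
```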
For each value in cols, extract the matching values from the dicts and return a DataFrame, then join the results together with concat:
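def get_val(val):
    # one list per row, holding this key's value for every box (NaN if the key is missing)
    comb = [[y.get(val, np.nan) for y in x] for x in df['Answer.annotation_data']]
    # one column per box position, named e.g. left_0, left_1
    return pd.DataFrame(comb).add_prefix('{}_'.format(val))

cols = ['left','top','width','height']
df1 = pd.concat([get_val(x) for x in cols], axis=1)
print (df1)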
   left_0  left_1  top_0  top_1  width_0  width_1  height_0  height_1
0     171     222      0     42      163       45       137        70
1     170     222     10     42      173       45       157        70
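To get from these values to the box corners the question asks for, one option is to build one row per box and apply xmin = left, ymin = top, xmax = left + width, ymax = top + height. That convention (image origin at the top-left, with `left`/`top` marking the box's top-left corner) is an assumption, and `to_boxes`, `raw` and `boxes` are hypothetical names. A sketch:

```python
import json
import pandas as pd

def to_boxes(series):
    """Return one row per annotated box with xmin/ymin/xmax/ymax columns.

    `series` holds JSON strings, each a list of dicts with
    left/top/width/height keys (hypothetical helper, not from the answer).
    """
    rows = []
    for row_id, cell in series.items():
        for box_id, box in enumerate(json.loads(cell)):
            rows.append({
                'row': row_id,
                'box': box_id,
                'label': box.get('label'),
                'xmin': box['left'],
                'ymin': box['top'],
                # Assumed convention: left/top is the box's top-left corner.
                'xmax': box['left'] + box['width'],
                'ymax': box['top'] + box['height'],
            })
    return pd.DataFrame(rows)

# Sample cell copied from the question.
raw = pd.Series([
    '[{"left":171,"top":0,"width":163,"height":137,"label":"styrofoam container"},'
    '{"left":222,"top":42,"width":45,"height":70,"label":"chopstick"}]'
])
boxes = to_boxes(raw)
print(boxes[['label', 'xmin', 'ymin', 'xmax', 'ymax']])
```

Under that assumption, the first sample box comes out as xmin=171, ymin=0, xmax=334, ymax=137. Keeping one row per box also means rows with different numbers of boxes need no NaN padding.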
Answer 1 (score: 0)
To access an element of a DataFrame:
`data.loc[row][column]` or `data.loc[row, column]`
For example:
`data.loc[0]['left']`
To find, for example, the global minimum of top:
`min(data['top'])`
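These lookups assume the coordinates already sit in their own columns, which is not the case for the raw `Answer.annotation_data` column. A small illustration on a hypothetical frame with `left` and `top` columns (the name `coords` and its layout are made up; the values come from the sample row):

```python
import pandas as pd

# Hypothetical frame: one row per box, values taken from the sample row.
coords = pd.DataFrame({'left': [171, 222], 'top': [0, 42]})

# Label-based access to a single cell; the single-call form
# avoids chained indexing.
first_left = coords.loc[0, 'left']

# Column minimum; Series.min() is the pandas counterpart of min(...).
min_top = coords['top'].min()

print(first_left, min_top)
# prints: 171 0
```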