Updating a DataFrame in a for loop

Time: 2018-09-20 11:16:43

Tags: python pandas dataframe

I am trying to build a DataFrame from values computed inside a for loop:

d1 = {}
d = {}

for i in range(4000):
   try:
    shape_json = json.loads(region_shape[i])
    file_name = file_name_nuclei[i]

    x_val = shape_json["x"]
    y_val = shape_json["y"]
    width_val = shape_json["width"]
    height_val = shape_json["height"]

    path = '/home/values/' + str(file_name)

    x1 = x_val
    y1 = y_val

    x2 = x_val + width_val
    y2 = y_val + height_val

    df = pd.DataFrame(data=d1)

    d = {'col1': [path], 'col2': [x1], 'col3': [y1], 'col4': [x2], 'col5': [y2], 'col5': ['nucleus']}
    df2 = pd.DataFrame(data=d1)

    df.update(df2)

   except:
       pass

However, I cannot get a DataFrame that is updated on every iteration. Can anyone help?

This is the output I am trying to get:

0 col1 col2 col3 col4 col5
  '/home/values/image.png' 23 55 30 62 'nucleus'
  '/home/values/image2.png' 40 72 37 92 'nucleus'
...

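2 Answers:

Answer 0: (score: 0)

You need to create a "master" DataFrame once, outside the loop, and add each new row to it inside the loop.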

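import json
import pandas as pd

# Assumes region_shape and file_name_nuclei are defined as in the question.
# The question's dict repeated 'col5', so y2 was silently overwritten by 'nucleus';
# 'col6' is an assumed name for the label column.
columns = ['col1', 'col2', 'col3', 'col4', 'col5', 'col6']

# Create the "master" DataFrame once, before the loop.
df = pd.DataFrame(columns=columns)

for i in range(4000):
    try:
        shape_json = json.loads(region_shape[i])
        file_name = file_name_nuclei[i]

        x_val = shape_json["x"]
        y_val = shape_json["y"]
        width_val = shape_json["width"]
        height_val = shape_json["height"]

        path = '/home/values/' + str(file_name)

        x1 = x_val
        y1 = y_val

        x2 = x_val + width_val
        y2 = y_val + height_val

        # DataFrame.update only overwrites existing cells and never adds rows,
        # so append the new row by label instead.
        df.loc[len(df)] = [path, x1, y1, x2, y2, 'nucleus']
    except Exception:
        # Skip entries that cannot be parsed.
        pass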

Also note that d1 stays empty throughout the question's code, so df2 is empty as well when you try to update df with it.
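
To see why nothing accumulates, here is a minimal sketch with made-up values (the row contents are placeholders, not data from the question). DataFrame.update only overwrites cells whose index and column labels already exist in the calling frame, so calling it on an empty frame leaves that frame empty:

```python
import pandas as pd

# Empty "master" frame, as produced by pd.DataFrame(data={}) in the question.
master = pd.DataFrame(data={})

# One made-up row; the values are placeholders for illustration only.
row = pd.DataFrame({'col1': ['/home/values/image.png'], 'col2': [23]})

# update() aligns on existing labels and modifies in place;
# it adds neither rows nor columns.
master.update(row)
print(master.empty)

# When the labels already exist, update() does overwrite the matching cells.
existing = pd.DataFrame({'col2': [0]})
existing.update(row)
print(existing['col2'].iloc[0])
```

So even with a non-empty df2, update alone would not grow the master frame; the rows have to be appended or concatenated.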

Answer 1: (score: 0)

I would create a list, append each iteration's DataFrame to it inside the loop, and then call pd.concat once at the end:

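import json
import pandas as pd

# Assumes region_shape and file_name_nuclei are defined as in the question.
results = []

for i in range(4000):
    try:
        shape_json = json.loads(region_shape[i])
        file_name = file_name_nuclei[i]

        x_val = shape_json["x"]
        y_val = shape_json["y"]
        width_val = shape_json["width"]
        height_val = shape_json["height"]

        path = '/home/values/' + str(file_name)

        x1 = x_val
        y1 = y_val

        x2 = x_val + width_val
        y2 = y_val + height_val

        # The question's dict repeated 'col5', so y2 was silently overwritten;
        # 'col6' is an assumed name for the label column.
        d = {'col1': [path], 'col2': [x1], 'col3': [y1], 'col4': [x2], 'col5': [y2], 'col6': ['nucleus']}
        df = pd.DataFrame(data=d)

        results.append(df)  # append this iteration's one-row df to the list
    except Exception:
        # Skip entries that cannot be parsed.
        pass

# Concatenate once at the end; ignore_index=True renumbers the rows 0..n-1.
final_df = pd.concat(results, ignore_index=True)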
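
A variation on the same idea, offered as a sketch rather than a drop-in replacement: collect one plain dict per row and build the DataFrame once at the end, which avoids creating a one-row DataFrame on every iteration. The sample region_shape and file_name_nuclei lists and the column names (path, x1, y1, x2, y2, label) are made up for illustration, since the question does not show the real data.

```python
import json
import pandas as pd

# Hypothetical sample inputs standing in for the question's data.
region_shape = [
    '{"x": 23, "y": 55, "width": 7, "height": 7}',
    'not valid json',  # deliberately malformed; this entry is skipped
    '{"x": 40, "y": 72, "width": 5, "height": 20}',
]
file_name_nuclei = ['image.png', 'broken.png', 'image2.png']

rows = []
for shape_str, file_name in zip(region_shape, file_name_nuclei):
    try:
        shape = json.loads(shape_str)
        rows.append({
            'path': '/home/values/' + str(file_name),
            'x1': shape['x'],
            'y1': shape['y'],
            'x2': shape['x'] + shape['width'],
            'y2': shape['y'] + shape['height'],
            'label': 'nucleus',
        })
    except (ValueError, KeyError):
        # Skip entries whose JSON is malformed or lacks a required key.
        continue

# Build the frame in a single step from the collected rows.
final_df = pd.DataFrame(rows)
print(final_df)
```

Catching only ValueError and KeyError here is a deliberate choice: a bare except would also hide unrelated mistakes such as a misspelled variable name.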