How to convert annotations from "odgt" (JSON) format to "csv" format

Time: 2019-04-11 11:08:08

Tags: python json tensorflow object-detection tensorflow-datasets

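I am building a person-detection algorithm with TensorFlow and want to train my own model on the CrowdHuman dataset. The dataset ships with ready-made annotations, but they are in odgt format (they say it is JSON, but it does not work when I change the file extension).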

My question is: how can I use these annotations to train a TensorFlow model, or how can I convert them to csv format?

Each line of the file looks like this:

   {
      "ID": "284193,faa9000f2678b5e",
      "gtboxes": [
         {
            "tag": "person",
            "hbox": [
               123,
               129,
               63,
               64
            ],
            "head_attr": {
               "ignore": 0,
               "occ": 1,
               "unsure": 0
            },
            "fbox": [
               61,
               123,
               191,
               453
            ],
            "vbox": [
               62,
               126,
               154,
               446
            ],
            "extra": {
               "box_id": 0,
               "occ": 1
            }
         },
         {
            "tag": "person",
            "hbox": [
               214,
               97,
               58,
               74
            ],
            "head_attr": {
               "ignore": 0,
               "occ": 1,
               "unsure": 0
            },
            "fbox": [
               165,
               95,
               187,
               494
            ],
            "vbox": [
               175,
               95,
               140,
               487
            ],
            "extra": {
               "box_id": 1,
               "occ": 1
            }
         },
         {
            "tag": "person",
            "hbox": [
               318,
               109,
               58,
               68
            ],
            "head_attr": {
               "ignore": 0,
               "occ": 1,
               "unsure": 0
            },
            "fbox": [
               236,
               104,
               195,
               493
            ],
            "vbox": [
               260,
               106,
               170,
               487
            ],
            "extra": {
               "box_id": 2,
               "occ": 1
            }
         },
         {
            "tag": "person",
            "hbox": [
               486,
               119,
               61,
               74
            ],
            "head_attr": {
               "ignore": 0,
               "occ": 0,
               "unsure": 0
            },
            "fbox": [
               452,
               110,
               169,
               508
            ],
            "vbox": [
               455,
               113,
               141,
               501
            ],
            "extra": {
               "box_id": 3,
               "occ": 1
            }
         },
         {
            "tag": "person",
            "hbox": [
               559,
               105,
               53,
               57
            ],
            "head_attr": {
               "ignore": 0,
               "occ": 0,
               "unsure": 0
            },
            "fbox": [
               520,
               95,
               163,
               381
            ],
            "vbox": [
               553,
               98,
               70,
               118
            ],
            "extra": {
               "box_id": 4,
               "occ": 1
            }
         },
         {
            "tag": "person",
            "hbox": [
               596,
               40,
               72,
               83
            ],
            "head_attr": {
               "ignore": 0,
               "occ": 0,
               "unsure": 0
            },
            "fbox": [
               546,
               39,
               202,
               594
            ],
            "vbox": [
               556,
               39,
               171,
               588
            ],
            "extra": {
               "box_id": 5,
               "occ": 1
            }
         },
         {
            "tag": "person",
            "hbox": [
               731,
               139,
               69,
               83
            ],
            "head_attr": {
               "ignore": 0,
               "occ": 0,
               "unsure": 0
            },
            "fbox": [
               661,
               132,
               183,
               510
            ],
            "vbox": [
               661,
               132,
               183,
               510
            ],
            "extra": {
               "box_id": 6,
               "occ": 0
            }
         }
      ]
   }

Thanks for your help.

1 Answer:

Answer 0 (score: 0)

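It is not entirely clear what you want the output to look like, but from what you have provided, I think you can iterate over the items in the JSON, flatten each one with json_normalize, and combine the results into a final dataframe that is then written to disk. Something like this: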

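import json
import pandas as pd

# An .odgt file holds one JSON object per line, so parse it line by line.
# 'path/annotations.odgt' is a placeholder path.
with open('path/annotations.odgt') as f:
    data = [json.loads(line) for line in f if line.strip()]

frames = []
for each in data:
    # Flatten the nested box dictionaries into one row per box.
    temp_df = pd.json_normalize(each['gtboxes'])
    temp_df['ID'] = each['ID']
    frames.append(temp_df)

# DataFrame.append was removed in pandas 2.0; concatenate once instead.
df = pd.concat(frames, ignore_index=True)
df.to_csv('path/filename.csv', index=False)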

Output (column order can vary between pandas versions):

print(df.to_string())
   extra.box_id  extra.occ                  fbox                hbox  head_attr.ignore  head_attr.occ  head_attr.unsure     tag                  vbox                      ID
0             0          1   [61, 123, 191, 453]  [123, 129, 63, 64]                 0              1                 0  person   [62, 126, 154, 446]  284193,faa9000f2678b5e
1             1          1   [165, 95, 187, 494]   [214, 97, 58, 74]                 0              1                 0  person   [175, 95, 140, 487]  284193,faa9000f2678b5e
2             2          1  [236, 104, 195, 493]  [318, 109, 58, 68]                 0              1                 0  person  [260, 106, 170, 487]  284193,faa9000f2678b5e
3             3          1  [452, 110, 169, 508]  [486, 119, 61, 74]                 0              0                 0  person  [455, 113, 141, 501]  284193,faa9000f2678b5e
4             4          1   [520, 95, 163, 381]  [559, 105, 53, 57]                 0              0                 0  person    [553, 98, 70, 118]  284193,faa9000f2678b5e
5             5          1   [546, 39, 202, 594]   [596, 40, 72, 83]                 0              0                 0  person   [556, 39, 171, 588]  284193,faa9000f2678b5e
6             6          0  [661, 132, 183, 510]  [731, 139, 69, 83]                 0              0                 0  person  [661, 132, 183, 510]  284193,faa9000f2678b5e
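
Note that the box columns above hold lists, so in the csv they end up as strings such as "[61, 123, 191, 453]". If one flat row per box with scalar coordinates is more convenient for training, something along these lines might work. This is only a minimal sketch using the standard library, and it rests on two assumptions that are not confirmed by the question: that each box is stored as [x, y, width, height], and that the image file is named after the ID with a ".jpg" suffix. The helper names odgt_to_rows and write_csv and the column names are made up for illustration.

```python
import csv
import io
import json

# Hypothetical column layout; adjust to whatever your training pipeline expects.
FIELDS = ["filename", "class", "xmin", "ymin", "xmax", "ymax"]


def odgt_to_rows(lines, box_key="fbox"):
    """Yield one flat dict per annotated box from an iterable of .odgt lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # each line is a separate JSON object
        for box in record["gtboxes"]:
            # Assumption: boxes are stored as [x, y, width, height].
            x, y, w, h = box[box_key]
            yield {
                "filename": record["ID"] + ".jpg",  # assumed naming scheme
                "class": box["tag"],
                "xmin": x,
                "ymin": y,
                "xmax": x + w,
                "ymax": y + h,
            }


def write_csv(rows, out):
    """Write the flat rows to an open text file object as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Small in-memory demo using the first full-body box from the question.
sample = json.dumps({
    "ID": "284193,faa9000f2678b5e",
    "gtboxes": [{"tag": "person", "fbox": [61, 123, 191, 453]}],
})
buffer = io.StringIO()
write_csv(odgt_to_rows([sample]), buffer)
print(buffer.getvalue())
```

For a real file, you could pass the open .odgt file object straight to odgt_to_rows and open the destination with newline="" as the csv module recommends. Pass box_key="hbox" or box_key="vbox" to export the head or visible-region boxes instead of the full-body ones. Because the ID contains a comma, the csv writer quotes the filename field.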