How do I parse a nested dictionary into a DataFrame?

Time: 2020-01-08 16:19:33

Tags: python json dataframe

I have a JSON file in which each line looks like this:

{
   "id": {
      "val": "dkjbskjb",
      "type": "cookie"
   },
   "country": "US",
   "region": "Blank",
   "events": [
      {
         "tap": "Device",
         "c": 98678,
         "ts": 12988685,
         "remove": [
            12,
            13
         ]
      }
   ]
}

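How should I parse this in Python and save it to a DataFrame:

  1. with the columns id, val, type, country, events?
  2. with separate columns for the nested list inside events?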

1 answer:

Answer 0: (score: 0)

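def flatten_json(y):
    """Flatten nested dicts and lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            # Leaf value: store it under the accumulated key, minus the trailing '_'.
            out[name[:-1]] = x

    flatten(y)
    return out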
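As a quick check of what the flattener produces, here is a minimal sketch that applies it to the sample record from the question. The function is restated (using isinstance and enumerate, otherwise the same logic) so that the snippet runs on its own; the variable name `record` is an assumption made for illustration.

```python
def flatten_json(y):
    """Flatten nested dicts/lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key, value in x.items():
                flatten(value, name + key + '_')
        elif isinstance(x, list):
            for i, value in enumerate(x):
                flatten(value, name + str(i) + '_')
        else:
            out[name[:-1]] = x  # drop the trailing '_'

    flatten(y)
    return out


# Sample record from the question (hypothetical variable name).
record = {
    "id": {"val": "dkjbskjb", "type": "cookie"},
    "country": "US",
    "region": "Blank",
    "events": [
        {"tap": "Device", "c": 98678, "ts": 12988685, "remove": [12, 13]}
    ],
}

flat = flatten_json(record)
for key, value in flat.items():
    print(key, '=', value)
```

Keys that came from a list carry the list index (for example `events_0_tap` and `events_0_remove_1`), while top-level values such as `country` keep their plain names.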

Then:

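import json
import re

import pandas as pd

# behavior_s3 is my own data; behavior_s3['mess'][0] is one JSON record as a string.
jsonObj = json.loads(behavior_s3['mess'][0])
flat = flatten_json(jsonObj)
results = pd.DataFrame()
special_cols = []

columns_list = list(flat.keys())
for item in columns_list:
    try:
        # Keys that came from a list contain an index, e.g. 'events_0_tap'.
        row_idx = re.findall(r'_(\d+)_', item)[0]
    except IndexError:
        # No list index in the key: a top-level value such as 'country'.
        special_cols.append(item)
        continue
    column = re.findall(r'_\d+_(.*)', item)[0]
    column = column.replace('_', '')

    row_idx = int(row_idx)
    value = flat[item]

    results.loc[row_idx, column] = value

# Top-level (scalar) values are broadcast to every row.
for item in special_cols:
    results[item] = flat[item]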
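For comparison, here is a minimal sketch of a different technique, not the approach used above: `pandas.json_normalize` (available as a top-level function in pandas 1.0 and later) can expand the `events` list into rows and carry the outer fields along as columns. The sample record is the one from the question; the variable names `line`, `record`, `df`, and `remove_cols` are assumptions made for illustration.

```python
import json

import pandas as pd

# One line of the file, as in the question (hypothetical variable name).
line = (
    '{"id": {"val": "dkjbskjb", "type": "cookie"}, "country": "US", '
    '"region": "Blank", "events": [{"tap": "Device", "c": 98678, '
    '"ts": 12988685, "remove": [12, 13]}]}'
)
record = json.loads(line)

# One row per element of "events"; the outer fields are repeated on each row.
df = pd.json_normalize(
    record,
    record_path="events",
    meta=[["id", "val"], ["id", "type"], "country", "region"],
    sep="_",
)

# "remove" is still a list per row; spread it into remove_0, remove_1, ...
remove_cols = pd.DataFrame(df["remove"].tolist(), index=df.index).add_prefix("remove_")
df = pd.concat([df.drop(columns="remove"), remove_cols], axis=1)

print(df)
```

For a whole file with one record per line, the same call should work on a list of parsed records, for example one built by calling `json.loads` on each line. Rows whose `remove` lists differ in length would get missing values in the shorter ones.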