Question

I have a JSON file in which each line looks like this:

{
   "id": {
      "val": "dkjbskjb",
      "type": "cookie"
   },
   "country": "US",
   "region": "Blank",
   "events": [
      {
         "tap": "Device",
         "c": 98678,
         "ts": 12988685,
         "remove": [
            12,
            13
         ]
      }
   ]
}

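How should I parse it in Python and save it to a data frame with these columns:

id, val, type, country, events?
How do I create columns from the nested list inside events?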

Answer 1

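def flatten_json(y):
    # Collect flattened key/value pairs; nested keys are joined with '_'.
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            # List items are keyed by position, e.g. events_0_tap.
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            # Leaf value: drop the trailing '_' from the accumulated name.
            out[name[:-1]] = x

    flatten(y)
    return out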
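
To see what flatten_json produces, here is a small self-contained run on the record from the question. The function is repeated (with isinstance in place of the type checks) so that the snippet runs on its own; the variable name record is only for illustration.

```python
import json


def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            for i, item in enumerate(x):
                flatten(item, name + str(i) + '_')
        else:
            out[name[:-1]] = x

    flatten(y)
    return out


# The sample record from the question, written as a JSON string.
record = json.loads('''
{"id": {"val": "dkjbskjb", "type": "cookie"},
 "country": "US",
 "region": "Blank",
 "events": [{"tap": "Device", "c": 98678, "ts": 12988685, "remove": [12, 13]}]}
''')

flat = flatten_json(record)
for key, value in flat.items():
    print(key, '=', value)
```

The keys come out as id_val, id_type, country, region, events_0_tap, events_0_c, events_0_ts, events_0_remove_0 and events_0_remove_1. The number after events_ is the position of the event in the list, which is what the next step uses as the row index.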

Then:

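import json
import re

import pandas as pd

# json.loads expects a JSON string here.
jsonObj = json.loads(behavior_s3['mess'][0])
flat = flatten_json(jsonObj)
results = pd.DataFrame()
special_cols = []

columns_list = list(flat.keys())
for item in columns_list:
    try:
        # Keys such as 'events_0_tap' carry a list index, used as the row number.
        row_idx = re.findall(r'_(\d+)_', item)[0]
    except IndexError:
        # No list index in the key (e.g. 'id_val', 'country'): handle after the loop.
        special_cols.append(item)
        continue
    column = re.findall(r'_\d+_(.*)', item)[0]
    column = column.replace('_', '')

    row_idx = int(row_idx)
    value = flat[item]

    results.loc[row_idx, column] = value

# Fields without a list index are copied to every row.
for item in special_cols:
    results[item] = flat[item]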
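
As a variation on the loop above, the same key splitting can collect plain dictionaries first, so that the data frame is built in one step instead of being grown cell by cell with .loc. This is only a sketch and not the method given in the answer: the flat dictionary is hard-coded to what flatten_json returns for the sample record, and the names rows, shared and records are made up for illustration.

```python
import re

# Hard-coded result of flatten_json for the sample record (assumed input).
flat = {
    'id_val': 'dkjbskjb', 'id_type': 'cookie',
    'country': 'US', 'region': 'Blank',
    'events_0_tap': 'Device', 'events_0_c': 98678, 'events_0_ts': 12988685,
    'events_0_remove_0': 12, 'events_0_remove_1': 13,
}

rows = {}    # row index -> {column: value} for keys that contain a list index
shared = {}  # keys without a list index, copied into every row

for key, value in flat.items():
    match = re.search(r'_(\d+)_(.*)', key)
    if match is None:
        shared[key] = value
        continue
    row_idx = int(match.group(1))
    column = match.group(2).replace('_', '')
    rows.setdefault(row_idx, {})[column] = value

# One dict per event, in index order, with the shared fields merged in.
records = [{**rows[i], **shared} for i in sorted(rows)]
print(records)
```

Passing records to pd.DataFrame(records) then gives one row per event, with the columns tap, c, ts, remove0, remove1, id_val, id_type, country and region.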
