Question

I have a JSON file that contains multiple entries, for example:

{
    "id": "01",
    "alpha_values": [
        {
            "val1": "1234",
            "val2": "5678",
            "bravo_values": [
                {
                    "val3": "ab_987",
                    "val4": "wd_123"
                }
            ]
        }
    ]
}

I am reading the file into a pandas DataFrame successfully:

import json

import pandas as pd

file = "my.json"
with open(file) as data_file:
    data = json.load(data_file)
# pandas >= 1.0 exposes this as pd.json_normalize;
# older releases import it from pandas.io.json instead.
df = pd.json_normalize(data)
print(df)

However, the result has only two columns, when what I really need is each value in its own column.

Current result

id                            alpha_values
1  [{'val1': '1234', 'val2': '5678', 'bravo_values': [{'val3': 'ab_987', 'val4': 'wd_123'}]}]

Desired result

id     val1     val2      val3     val4 
1     '1234'   '5678'   'ab_987'  'wd_123'

Any suggestions?

Answer 1

This should work:

Here data is your nested JSON data.

import pandas as pd

# pandas >= 1.0; older releases use pandas.io.json.json_normalize
pd.json_normalize(data)
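
If the plain call still leaves alpha_values as a single column of lists, as the question reports, one option is to tell json_normalize which nested list holds the rows (record_path) and which outer fields to repeat on each row (meta). The following is a minimal sketch, assuming pandas 1.0 or later and the exact structure shown in the question; the rename and column reordering at the end are only there to match the desired layout.

```python
import pandas as pd

# Sample record copied from the question (trailing comma removed).
data = {
    "id": "01",
    "alpha_values": [
        {
            "val1": "1234",
            "val2": "5678",
            "bravo_values": [{"val3": "ab_987", "val4": "wd_123"}],
        }
    ],
}

df = pd.json_normalize(
    data,
    # Walk into alpha_values, then use each bravo_values item as a row.
    record_path=["alpha_values", "bravo_values"],
    # Fields from the outer levels to repeat on every row.
    meta=["id", ["alpha_values", "val1"], ["alpha_values", "val2"]],
)

# Nested meta fields come back as "alpha_values.val1" and so on.
df = df.rename(
    columns={"alpha_values.val1": "val1", "alpha_values.val2": "val2"}
)
df = df[["id", "val1", "val2", "val3", "val4"]]
print(df)
```

With more than one item in bravo_values, this yields one row per item, with id, val1 and val2 repeated.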

Answer 2

First, flatten the JSON:

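import pandas as pd


def reshape(blob, final=None):
    # Use a fresh dict per top-level call; a mutable default ({}) would be
    # shared between calls and leak values from one record into the next.
    if final is None:
        final = {}
    for k, v in blob.items():
        if not isinstance(v, list):
            final[k] = v  # scalar value: copy it into the flat dict
        else:
            for item in v:  # list of nested dicts: merge each one in
                reshape(item, final)
    return final


c = reshape(d)  # d is the parsed JSON dict (data in the question)

df = pd.DataFrame([c])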

   id  val1  val2    val3    val4
0  01  1234  5678  ab_987  wd_123
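
Note that reshape collapses everything into a single row, so when a list holds more than one item, later items overwrite the keys of earlier ones. Below is a minimal sketch of a variant that emits one row per innermost item instead. It assumes every list contains only dicts; flatten_rows is a hypothetical helper name, and the second bravo_values entry is made up for illustration.

```python
import pandas as pd


def flatten_rows(blob, parent=None):
    """Return a list of flat dicts, one per innermost nested item."""
    row = dict(parent or {})  # copy so sibling rows do not share state
    children = []
    for key, value in blob.items():
        if isinstance(value, list):
            children.extend(value)  # nested records, expanded below
        else:
            row[key] = value  # scalar: belongs to the current row
    if not children:
        return [row]
    rows = []
    for child in children:
        # Each child inherits the scalars collected at this level.
        rows.extend(flatten_rows(child, row))
    return rows


d = {
    "id": "01",
    "alpha_values": [
        {
            "val1": "1234",
            "val2": "5678",
            "bravo_values": [
                {"val3": "ab_987", "val4": "wd_123"},
                {"val3": "ab_111", "val4": "wd_222"},  # made-up second item
            ],
        }
    ],
}

df = pd.DataFrame(flatten_rows(d))
print(df)
```

If a dict holds several different lists, this sketch treats their items as separate rows rather than building a cross product.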

Convert nested JSON to a pandas DataFrame
