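Flattening deeply nested JSON vertically into pandas

Time: 2020-01-02 18:25:23

Tags: python pandas

Hi, I am trying to flatten a JSON file but have not been able to. My JSON has three levels of nesting that repeat, as in the sample below: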

"floors": [
        {
            "uuid": "8474",
            "name": "some value",
            "areas": [
                {
                    "uuid": "xyz",
                    "**name**": "qwe",
                    "roomType": "Name1",
                    "templateUuid": "sdklfj",
                    "templateName": "asdf",
                    "templateVersion": "2.7.1",
                    "Required1": [
                        {
                            "**uuid**": "asdf",
                            "description": "asdf3",
                            "categoryName": "asdf",
                            "familyName": "asdf",
                            "productName": "asdf3",
                            "Required2": [
                                {
                                    "**deviceId**": "asdf",
                                    "**deviceUuid**": "asdf-asdf"
                                }
                            ]
                        }

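I want the name of each area, the corresponding values from the nested "Required1", and the corresponding "Required2" values for each "Required1" (highlighted with **). I have tried normalizing the JSON as shown below, and I also tried other free libraries, but failed:

Attempts: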

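import json
from pprint import pprint
import pandas as pd
from pandas.io.json import json_normalize  # on pandas >= 1.0: from pandas import json_normalize
with open('Filename.json') as data_file:
    data_item = json.load(data_file)
Raw_Areas = json_normalize(data_item['floors'], 'areas', errors='ignore', record_prefix='Area_')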

This does not show the area values the way I need them, and Required1 and Required2 are still nested.

K=json_normalize(data_item['floors'][0],record_path=['Required1','Required2'],errors='ignore',record_prefix='Try_')

from flatten_json import flatten_json
Flat_J1= pd.DataFrame([flatten_json(data_item)]) 

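I would like to get values like the following. Expected columns: floor.areas.Required1.Required2.deviceUuid floor.areas.name (side by side)

Please help me see what I am missing in my attempts. I am fairly new to loading JSON.

1 Answer:

Answer 0: (score: 1)

Assuming the following JSON (as many have pointed out, the one you posted is incomplete, so I completed it based on your opening brackets).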

dct = {"floors": [
        {
            "uuid": "8474",
            "name": "some value",
            "areas": [
                {
                    "uuid": "xyz",
                    "name": "qwe",
                    "roomType": "Name1",
                    "templateUuid": "sdklfj",
                    "templateName": "asdf",
                    "templateVersion": "2.7.1",
                    "Required1": [
                        {
                            "uuid": "asdf",
                            "description": "asdf3",
                            "categoryName": "asdf",
                            "familyName": "asdf",
                            "productName": "asdf3",
                            "Required2": [
                                {
                                    "deviceId": "asdf",
                                    "deviceUuid": "asdf-asdf"
                                }
                            ]
                        }
                    ]
                }
            ]
        }
]}

You can do the following (requires pandas 0.25.0 or later, since it relies on DataFrame.explode)

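import pandas as pd

# On pandas >= 1.0, call pd.json_normalize instead of pd.io.json.json_normalize.
df = pd.io.json.json_normalize(
    dct, record_path=['floors', 'areas', 'Required1'], meta=[['floors', 'areas', 'name']])
# One row per Required2 entry, then spread each entry's keys into columns.
df = df.explode('Required2')
df = pd.concat([df, df["Required2"].apply(pd.Series)], axis=1)
df = df[['floors.areas.name', 'uuid', 'deviceId', 'deviceUuid']]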

which gives

  floors.areas.name  uuid deviceId deviceUuid
0               qwe  asdf     asdf  asdf-asdf
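
As a possible alternative, here is a minimal sketch that points record_path all the way down to "Required2" and pulls the parent values in through meta, which skips the explode and concat steps. It assumes pandas 1.0 or later (where pd.json_normalize is a top-level function) and that every "Required1" entry actually carries a "Required2" list; the variable name flat and the trimmed sample dictionary are only for illustration.

```python
import pandas as pd

# Same sample structure as above, trimmed to the fields that matter here.
dct = {"floors": [{
    "uuid": "8474",
    "name": "some value",
    "areas": [{
        "uuid": "xyz",
        "name": "qwe",
        "Required1": [{
            "uuid": "asdf",
            "description": "asdf3",
            "Required2": [
                {"deviceId": "asdf", "deviceUuid": "asdf-asdf"},
            ],
        }],
    }],
}]}

# Assumption: pandas >= 1.0, so json_normalize is available as pd.json_normalize.
# record_path walks down to the innermost list; each meta path names a
# value on one of the parent levels that should be repeated on every row.
flat = pd.json_normalize(
    dct,
    record_path=["floors", "areas", "Required1", "Required2"],
    meta=[
        ["floors", "areas", "name"],
        ["floors", "areas", "Required1", "uuid"],
    ],
)
print(flat)
```

The meta columns are named after their full path, so the area name ends up in floors.areas.name and the Required1 uuid in floors.areas.Required1.uuid, next to deviceId and deviceUuid. If some entries lack one of the nested lists, the explode-based version above may be easier to adapt.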