Convert a DataFrame to nested JSON in Python

Time: 2020-03-03 13:02:11

Tags: python json pandas

I am trying to convert a df to nested JSON with the following code:

nested_json = (df.groupby(['prediction_probability','id','ts','prediction_value'], as_index=False)
             .apply(lambda x:x[[
                "first_create_date",  
                "create_date",
                "update_timestamp",
                 "revenue",
                 "col",
                 "x"]].to_dict('records'))
             .reset_index()
             .rename(columns={0:'features'})
             .to_json(orient='records'))

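My problem is that the nested dictionary (key = 'features') is wrapped in square brackets. How can I avoid the square brackets? I know I could treat the output as a string and strip the brackets, but that is of course bad practice.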
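
The brackets come from the data structure rather than from the serializer: to_dict with the records orientation returns a list of row dicts, and a Python list always serializes as a JSON array, even when it holds a single element. A minimal sketch with the standard json module (the sample row is illustrative, not taken from the real data):

```python
import json

# Illustrative single row of feature values.
row = {"revenue": None, "col": "aaaa"}

# A list holding one dict serializes as a JSON array, so brackets appear.
as_list = json.dumps({"features": [row]})

# The dict on its own serializes as a JSON object, with no brackets.
as_dict = json.dumps({"features": row})

print(as_list)
print(as_dict)
```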

Output:

[
    {
        "pred": 0.50726,
        "id": "0030X00002qMwFrQAKxxxx",
        "ts": "2020-02-19T20:32:15.016586",
        "value": "A",
        "features": [
            {
                "first_create_date": 1582089665000,
                "create_date": 1582089665000,
                "update_timestamp": 1582142462000,
                "revenue": null,
                "col":"aaaa",
                "x": null
            }
        ]
    },
    {
        "pred": 0.50895,
        "id": "0030X00002qMvfHQASxxxxx",
        "ts": "2020-02-19T20:32:15.016586",
        "value": "A",
        "features": [
            {
                "first_create_date": 1582077985000,
                "create_date": 1582077985000,
                "update_timestamp": 1582142462000,
                "revenue": null,
                "col":"aaaa",
                "x": null
            }
        ]
    }
]

Desired output:

[
    {
        "pred": 0.50726,
        "id": "0030X00002qMwFrQAKxxxx",
        "ts": "2020-02-19T20:32:15.016586",
        "value": "A",
        "features": 
            {
                "first_create_date": 1582089665000,
                "create_date": 1582089665000,
                "update_timestamp": 1582142462000,
                "revenue": null,
                "col":"aaaa",
                "x": null
            }

    },
    {
        "pred": 0.50895,
        "id": "0030X00002qMvfHQASxxxxx",
        "ts": "2020-02-19T20:32:15.016586",
        "value": "A",
        "features": 
            {
                "first_create_date": 1582077985000,
                "create_date": 1582077985000,
                "update_timestamp": 1582142462000,
                "revenue": null,
                "col":"aaaa",
                "x": null
            }

    }
]

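1 Answer:

Answer 0: (score: 0)

A simple dict comprehension will do the job. Suppose you can get to a nested JSON shaped like your output, parsed into a list of dicts and named output (if it is still the string returned by to_json, parse it with json.loads first). Then the only thing left to do is take the first element of each features list: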

desired_output = [{k: (v[0] if k == 'features' else v) for k, v in x.items()} for x in output]
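
A minimal, self-contained sketch of this approach, assuming output starts as the JSON string returned by to_json and that every features list holds exactly one row. The sample records and the helper name unwrap_features are illustrative and do not come from the original post.

```python
import json

# Illustrative stand-in for the string produced by to_json(orient='records').
output = json.dumps([
    {"pred": 0.50726, "id": "a", "value": "A",
     "features": [{"revenue": None, "col": "aaaa"}]},
    {"pred": 0.50895, "id": "b", "value": "A",
     "features": [{"revenue": None, "col": "aaaa"}]},
])


def unwrap_features(records):
    """Replace each one-element 'features' list with its only element."""
    return [
        {k: (v[0] if k == "features" else v) for k, v in rec.items()}
        for rec in records
    ]


records = json.loads(output)               # JSON string -> list of dicts
desired_output = unwrap_features(records)  # 'features' is now a plain dict
desired_json = json.dumps(desired_output)  # back to a JSON string
```

If a group can contain more than one row, v[0] silently drops the remaining rows, so this only fits the one-row-per-group case shown in the question.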