Exploding a non-nested list column that contains numbers

Time: 2021-03-24 11:55:58

Tags: python json pandas flatten

I have a Python script that uses a function to flatten JSON objects:

from pandas import json_normalize

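def flatten(nested_json, exclude=('',)):
    """Flatten a nested dict/list into a single-level dict with '_'-joined keys."""
    out = {}

    def _flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                if key not in exclude:
                    _flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            # List items use their index as a key segment, e.g. labelId_0.
            for i, item in enumerate(x):
                _flatten(item, name + str(i) + '_')
        else:
            # Leaf value: drop the trailing '_' from the accumulated name.
            out[name[:-1]] = x

    _flatten(nested_json)
    return out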

x = [{ 
    "_id" : 1, 
    "labelId" : [
        6422,3421
    ], 
    "levels" : [
        {
            "active" : "true", 
            "level" : 3, 
            "actions" : [
                {
                    "isActive" : "true"
                }]
        }]
}
,
{ 
    "_id" : 2, 
    "labelId" : [
        57,78
    ], 
    "levels" : [
        {
            "active" : "true", 
            "level" : 4, 
            "actions" : [
                {
                    "isActive" : "true"
                }]
        }]
}

]

flatJSON = [flatten(i) for i in x]
flatJSON = json_normalize(flatJSON)
print(flatJSON)

Output:

  _id  labelId_0  labelId_1 levels_0_active  levels_0_level levels_0_actions_0_isActive
0    1       6422       3421            true               3                        true
1    2         57         78            true               4                        true

The problem is that this function flattens everything down to level 0, including lists, so it creates a lot of unnecessary columns.

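Is there a way to modify this function so that a list containing only integers/numbers is simply exploded into rows, rather than flattened into separate columns?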

Expected output:

   _id  labelId_0   levels_0_active  levels_0_level levels_0_actions_0_isActive
0    1       6422              true               3                        true
1    1       3421              true               3                        true
2    2         57              true               4                        true
3    2         78              true               4                        true
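
One possible way to get close to this, sketched here as an assumption and not taken from the original post: leave any list that holds only numbers untouched while flattening, then let pandas `DataFrame.explode` turn each such list into rows. The names `flatten_keep_numeric_lists` and `flatten_and_explode` are hypothetical helpers made up for this sketch. Note that the exploded column ends up named `labelId` rather than `labelId_0`, and that exploding several list columns one after another produces every combination of their values.

```python
from numbers import Number

import pandas as pd


def flatten_keep_numeric_lists(nested_json, exclude=('',)):
    """Flatten nested dicts/lists, but keep all-numeric lists as single values."""
    out = {}

    def _flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                if key not in exclude:
                    _flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            # Assumption: "numeric" means non-empty and every item is a number
            # (bool is excluded even though it subclasses int).
            is_numeric = bool(x) and all(
                isinstance(v, Number) and not isinstance(v, bool) for v in x
            )
            if is_numeric:
                # Keep the list whole so it can be exploded into rows later.
                out[name[:-1]] = x
            else:
                for i, item in enumerate(x):
                    _flatten(item, name + str(i) + '_')
        else:
            out[name[:-1]] = x

    _flatten(nested_json)
    return out


def flatten_and_explode(records):
    """Flatten each record, then explode every column that still holds lists."""
    df = pd.DataFrame([flatten_keep_numeric_lists(r) for r in records])
    list_cols = [
        c for c in df.columns
        if df[c].map(lambda v: isinstance(v, list)).any()
    ]
    for col in list_cols:
        df = df.explode(col)
    return df.reset_index(drop=True)


# Compact copy of the sample data from the question.
records = [
    {"_id": 1, "labelId": [6422, 3421],
     "levels": [{"active": "true", "level": 3,
                 "actions": [{"isActive": "true"}]}]},
    {"_id": 2, "labelId": [57, 78],
     "levels": [{"active": "true", "level": 4,
                 "actions": [{"isActive": "true"}]}]},
]

df = flatten_and_explode(records)
print(df)
```

With the sample data this should give four rows, one per `labelId` value, with the remaining columns repeated for each row. The exploded column has `object` dtype, so a follow-up `astype(int)` may be wanted if numeric operations are needed.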

0 answers:

No answers yet