pandas MultiIndex to JSON

Asked: 2018-11-10 06:32:50

Tags: python python-3.x pandas pandas-groupby multi-index

My DataFrame looks like this:

    name address.office.street address.office.city address.office.country  address.office.postcode location.tracker.id
0  Name1               Street1               City1               Country1                      100         9,99,02,129
1  Name2               Street2               City2               Country2                      200         1,91,95,129
2  Name3               Street3               City3               Country3                      300           99,90,259

I split the column names on '.' and create a MultiIndex as follows:

idx = df.columns.str.split('.', expand=True)
df.columns = idx
df[('location', 'tracker', 'id')] = df[('location', 'tracker', 'id')].str.split(',')
print(df)


    name  address                                    location
     NaN   office                                     tracker
     NaN   street   city   country postcode                id
0  Name1  Street1  City1  Country1      100  [9, 99, 02, 129]
1  Name2  Street2  City2  Country2      200  [1, 91, 95, 129]
2  Name3  Street3  City3  Country3      300     [99, 90, 259]
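
To make the steps above reproducible, here is a minimal sketch that rebuilds the sample frame and applies the same split. The values are copied from the table above; the dtype of each column (for example, postcode as an integer) is an assumption, since the original construction code is not shown.

```python
import pandas as pd

# Sample data copied from the table above; the flat column names use dots.
df = pd.DataFrame({
    "name": ["Name1", "Name2", "Name3"],
    "address.office.street": ["Street1", "Street2", "Street3"],
    "address.office.city": ["City1", "City2", "City3"],
    "address.office.country": ["Country1", "Country2", "Country3"],
    "address.office.postcode": [100, 200, 300],  # integer dtype is assumed
    "location.tracker.id": ["9,99,02,129", "1,91,95,129", "99,90,259"],
})

# Split each dotted name into levels; names with fewer parts are padded
# with missing values, which is why 'name' has empty lower levels.
df.columns = df.columns.str.split(".", expand=True)

# Turn the comma-separated tracker ids into lists of strings.
key = ("location", "tracker", "id")
df[key] = df[key].str.split(",")

print(df.columns.nlevels)
print(df[key].tolist())
```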

I want to convert this to nested JSON. Is there a pandas way to produce the JSON below?

  

[
    {
        "name": "Name1",
        "address": {
            "office": {
                "street": "Street1",
                "city": "City1",
                "country": "Country1",
                "postcode": 100
            }
        },
        "location": {
            "tracker": {
                "id": [
                    "9",
                    "99",
                    "02",
                    "129"
                ]
            }
        }
    },
    {
        "name": "Name2",
        "address": {
            "office": {
                "street": "Street2",
                "city": "City2",
                "country": "Country2",
                "postcode": 200
            }
        },
        "location": {
            "tracker": {
                "id": [
                    "1",
                    "91",
                    "95",
                    "129"
                ]
            }
        }
    },
    {
        "name": "Name3",
        "address": {
            "office": {
                "street": "Street3",
                "city": "City3",
                "country": "Country3",
                "postcode": 300
            }
        },
        "location": {
            "tracker": {
                "id": [
                    "99",
                    "90",
                    "259"
                ]
            }
        }
    }
]

I can get the result above with the code below, but it becomes slow when the number of records (df.shape[0]) is large.

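import json
from collections import defaultdict
from copy import deepcopy

import pandas as pd

# Recursive defaultdict: accessing a missing key creates another nested dict
nested_dict = lambda: defaultdict(nested_dict)
result = nested_dict()

result_list = []
for cntr in range(df.shape[0]):
    # i is the column's (level0, level1, level2) tuple, j is the column Series
    for i, j in df.items():  # DataFrame.iteritems() was removed in pandas 2.0
        value = j.iloc[cntr]
        # Use the deepest non-null level as the leaf key
        if not pd.isnull(i[2]):
            result[i[0]][i[1]][i[2]] = value
        elif not pd.isnull(i[1]):
            result[i[0]][i[1]] = value
        elif not pd.isnull(i[0]):
            result[i[0]] = value

    # Snapshot the row, because result is reused and overwritten on the next pass
    result_list.append(deepcopy(result))

# default=int converts NumPy integers, which json cannot serialize directly
print(json.dumps(result_list, indent=4, default=int))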

I was hoping to simplify it to something like:

(df.groupby(level=['level0']).apply(lambda df: df.xs(df.name))).to_json()

However, this does not give the expected result.
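
For comparison, one possible direction is to resolve each column's key path once and then walk the rows as plain tuples, instead of looking up every cell by position and deep-copying a shared dict. The following is only a sketch under assumptions: `nest_rows` is a hypothetical helper name (not a pandas API), the shortened sample frame is illustrative, and the approach has not been benchmarked against the loop above.

```python
import json

import pandas as pd


def nest_rows(df):
    """Build one nested dict per row from a frame with MultiIndex columns."""
    # Drop the missing-value padding once per column rather than once per cell.
    paths = [tuple(p for p in col if not pd.isnull(p)) for col in df.columns]
    records = []
    # itertuples(name=None) yields plain tuples of Python scalars, row by row.
    for row in df.itertuples(index=False, name=None):
        record = {}
        for path, value in zip(paths, row):
            node = record
            # Walk (or create) the intermediate dicts, then set the leaf.
            for part in path[:-1]:
                node = node.setdefault(part, {})
            node[path[-1]] = value
        records.append(record)
    return records


# Illustrative sample with fewer columns than the frame in the question.
sample = pd.DataFrame({
    "name": ["Name1", "Name2"],
    "address.office.city": ["City1", "City2"],
    "address.office.postcode": [100, 200],
    "location.tracker.id": ["9,99,02,129", "1,91,95,129"],
})
sample.columns = sample.columns.str.split(".", expand=True)
id_col = ("location", "tracker", "id")
sample[id_col] = sample[id_col].str.split(",")

records = nest_rows(sample)
print(json.dumps(records, indent=4))
```

Because each row gets a fresh dict, no deepcopy is needed. The sketch assumes no column is both a leaf and a parent of another column (for example, both `address` and `address.office.city`).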

0 answers:

No answers