My DataFrame looks like this:
   name address.office.street address.office.city address.office.country address.office.postcode location.tracker.id
0 Name1               Street1               City1               Country1                     100         9,99,02,129
1 Name2               Street2               City2               Country2                     200         1,91,95,129
2 Name3               Street3               City3               Country3                     300           99,90,259
I split the column names on '.' and build a MultiIndex from them, as shown below:
idx = df.columns.str.split('.', expand=True)
df.columns = idx
df[('location', 'tracker', 'id')] = df[('location', 'tracker', 'id')].str.split(',')
print(df)
    name  address                                     location
     NaN   office                                      tracker
     NaN   street   city   country  postcode                id
0  Name1  Street1  City1  Country1       100  [9, 99, 02, 129]
1  Name2  Street2  City2  Country2       200  [1, 91, 95, 129]
2  Name3  Street3  City3  Country3       300     [99, 90, 259]
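To make the shape of the column keys explicit, here is a minimal sketch of what the split produces. Each column becomes a tuple, and shorter names such as `name` are padded with missing values at the lower levels. The names `cols` and `keys` are illustrative only, and only three of the columns are used.

```python
import pandas as pd

# Three of the flat column names from the table above
cols = pd.Index(["name", "address.office.street", "location.tracker.id"])

# expand=True turns the split parts into MultiIndex levels
idx = cols.str.split(".", expand=True)

# Drop the missing-value padding to see each column's real key path
keys = [tuple(k for k in col if not pd.isnull(k)) for col in idx]
print(keys)
```

This padding is why any code walking the columns has to check each level with `pd.isnull` before using it as a key.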
I want to convert this into nested JSON. Is there a pandas way to produce the JSON below?
[
    {
        "name": "Name1",
        "address": {
            "office": {
                "street": "Street1",
                "city": "City1",
                "country": "Country1",
                "postcode": 100
            }
        },
        "location": {
            "tracker": {
                "id": ["9", "99", "02", "129"]
            }
        }
    },
    {
        "name": "Name2",
        "address": {
            "office": {
                "street": "Street2",
                "city": "City2",
                "country": "Country2",
                "postcode": 200
            }
        },
        "location": {
            "tracker": {
                "id": ["1", "91", "95", "129"]
            }
        }
    },
    {
        "name": "Name3",
        "address": {
            "office": {
                "street": "Street3",
                "city": "City3",
                "country": "Country3",
                "postcode": 300
            }
        },
        "location": {
            "tracker": {
                "id": ["99", "90", "259"]
            }
        }
    }
]
I can get the result above with the code below, but it becomes slow when the number of records (df.shape[0]) is large.
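import json
from collections import defaultdict
from copy import deepcopy

import pandas as pd

# Recursive defaultdict: missing keys create nested dicts on access
nested_dict = lambda: defaultdict(nested_dict)
result = nested_dict()
result_list = []
for cntr in range(df.shape[0]):
    # DataFrame.iteritems() was removed in pandas 2.0; items() is the equivalent
    for i, j in df.items():
        value = j[cntr]
        # Use the deepest non-null column level as the innermost key
        if not pd.isnull(i[2]):
            result[i[0]][i[1]][i[2]] = value
        elif not pd.isnull(i[1]):
            result[i[0]][i[1]] = value
        elif not pd.isnull(i[0]):
            result[i[0]] = value
    # Snapshot the row, since `result` is reused and overwritten on the next pass
    result_list.append(deepcopy(result))
print(json.dumps(result_list, indent=4))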
I was hoping to simplify this to something like
(df.groupby(level=['level0']).apply(lambda df: df.xs(df.name))).to_json()
However, I could not get the expected result that way.
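One direction that may help, shown here only as a minimal sketch: compute each column's key path once, then walk the rows with `itertuples` instead of doing a label lookup per cell and a `deepcopy` per row. `nest_rows` is a hypothetical helper name, not a pandas function. The sketch assumes the columns are a MultiIndex padded with missing values, as above, and it has not been benchmarked against the loop.

```python
import json
import pandas as pd

def nest_rows(df):
    """Hypothetical helper: turn MultiIndex columns into nested dicts, one per row."""
    # Build each column's key path once, dropping the missing-value padding
    paths = [tuple(k for k in col if not pd.isnull(k)) for col in df.columns]
    records = []
    # name=None yields plain tuples in column order
    for row in df.itertuples(index=False, name=None):
        record = {}
        for path, value in zip(paths, row):
            node = record
            for key in path[:-1]:
                node = node.setdefault(key, {})  # descend, creating levels as needed
            node[path[-1]] = value
        records.append(record)
    return records

# Sample data copied from the table above (postcode assumed to be an integer column)
df = pd.DataFrame({
    "name": ["Name1", "Name2", "Name3"],
    "address.office.street": ["Street1", "Street2", "Street3"],
    "address.office.city": ["City1", "City2", "City3"],
    "address.office.country": ["Country1", "Country2", "Country3"],
    "address.office.postcode": [100, 200, 300],
    "location.tracker.id": ["9,99,02,129", "1,91,95,129", "99,90,259"],
})
df["location.tracker.id"] = df["location.tracker.id"].str.split(",")
df.columns = df.columns.str.split(".", expand=True)

records = nest_rows(df)
print(json.dumps(records, indent=4))
```

Because every row gets a fresh `dict`, no copying is needed, and the per-column path logic runs once rather than once per row.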