I have a multi-index DataFrame that looks like this:
             2019-01-08  2019-01-15  2019-01-22  2019-01-29  2019-02-05
6392 height           3           6           5           3           3
     length           3           3           5           9           3
6393 height           1           6           1           4           3
     length           5           3           2           3           3
I want to convert it to JSON that looks something like this:
{
  "6392": {
    "2019-01-08": [{
      "height": 3,
      "length": 3
    }],
    "2019-01-15": [{
      "height": 6,
      "length": 3
    }],
    "2019-01-22": [{
      "height": 5,
      "length": 5
    }],
    ...
  },
  "6393": {
    "2019-01-08": [{
      "height": 1,
      "length": 5
    }],
    "2019-01-15": [{
      "height": 6,
      "length": 3
    }],
    "2019-01-22": [{
      "height": 1,
      "length": 2
    }],
    ...
  }
}
I tried something like df.to_json(orient='index'), which returns an error. Using reset_index() does not give me the hierarchy I want either!
Thanks for your help.
Answer 0 (score: 2)
Following Quang's suggestion, this is how I would handle your actual dataset:
import numpy as np
import pandas as pd

# Sample frame with a two-level row index (outer: bar/baz/foo/qux, inner: one/two)
arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays, columns=['col1', 'col2', 'col3', 'col4'])

# For each outer label, drop that level with xs() and convert the remaining rows to a dict
D = df.groupby(level=0).apply(lambda g: g.xs(g.name).to_dict()).to_dict()
This produces the following dictionary:
{'bar': {'col1': {'one': -0.9687674292695906, 'two': -0.7892120308117504},
'col2': {'one': -0.08468610899521901, 'two': -0.8123345931126713},
'col3': {'one': 0.8136040202024982, 'two': 1.4254756109087028},
'col4': {'one': -0.5631944934736082, 'two': -1.0686604230467418}},
'baz': {'col1': {'one': -0.8329277599190955, 'two': -0.797572943803082},
'col2': {'one': -1.18912237734452, 'two': -0.6222985373781997},
'col3': {'one': -0.6307550007277682, 'two': -0.43423342334272047},
'col4': {'one': -0.8090341502048565, 'two': 1.7846384031629874}},
'foo': {'col1': {'one': 0.17441065807207026, 'two': -0.142104023898428},
'col2': {'one': 0.4865273350791687, 'two': 1.4119728392158484},
'col3': {'one': -1.7834681421564647, 'two': 0.9228194356473829},
'col4': {'one': -0.7426715146036388, 'two': 0.32663534732439187}},
'qux': {'col1': {'one': -0.32243916994536376, 'two': -0.4490530023512784},
'col2': {'one': 0.31957291028411916, 'two': -1.6707253441375334},
'col3': {'one': 0.2794431740425791, 'two': 1.0928413422340624},
'col4': {'one': -0.818204166504019, 'two': -1.2567773847741046}}}
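To see what the lambda does for a single group, the one-liner can be unpacked by hand. This is only a sketch: it rebuilds the same sample frame with a seeded generator (my assumption, so the run is repeatable; the values will differ from the output above) and looks at the 'bar' group alone.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
# Seeded generator is an assumption for reproducibility, not part of the answer
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=arrays,
                  columns=['col1', 'col2', 'col3', 'col4'])

# xs('bar') selects the rows whose outer label is 'bar' and drops that level,
# leaving a plain frame indexed by 'one' / 'two'
bar = df.xs('bar')

# to_dict() on that frame is column-oriented: {column: {inner_label: value}}
bar_dict = bar.to_dict()
print(bar_dict)
```

The groupby/apply version simply repeats this for every outer label and collects the results under that label.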
It can be written to a JSON file with:
import json

with open('/path/to/file.json', 'w') as json_file:
    json.dump(D, json_file)
That is:
{
"bar":{
"col1":{
"one":-0.9687674292695906,
"two":-0.7892120308117504
},
"col2":{
"one":-0.08468610899521901,
"two":-0.8123345931126713
},
"col3":{
"one":0.8136040202024982,
"two":1.4254756109087028
},
"col4":{
"one":-0.5631944934736082,
"two":-1.0686604230467418
}
},
"baz":{
"col1":{
"one":-0.8329277599190955,
"two":-0.797572943803082
},
"col2":{
"one":-1.18912237734452,
"two":-0.6222985373781997
},
"col3":{
"one":-0.6307550007277682,
"two":-0.43423342334272047
},
"col4":{
"one":-0.8090341502048565,
"two":1.7846384031629874
}
},
...
Is that close enough to what you need?
Answer 1 (score: 1)
Here is your dataset:
import pandas as pd

df = pd.DataFrame({'2019-01-08': [3, 3, 1, 5], '2019-01-15': [6, 3, 6, 3]},
                  index=[[6392, 6392, 6393, 6393], ['height', 'length', 'height', 'length']])
df
#              2019-01-08  2019-01-15
# 6392 height           3           6
#      length           3           3
# 6393 height           1           6
#      length           5           3
and this performs the required conversion, following Quang's suggestion:
D = (df
     .groupby(level=0)
     .apply(lambda g: g.xs(g.name).to_dict())
     .to_dict()
)
D
# {6392: {'2019-01-08': {'height': 3, 'length': 3},
#         '2019-01-15': {'height': 6, 'length': 3}},
#  6393: {'2019-01-08': {'height': 1, 'length': 5},
#         '2019-01-15': {'height': 6, 'length': 3}}}
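One detail worth noting when this dictionary is serialized: its outer keys are integers, while JSON object keys are always strings. The standard json module converts them on output, so a round trip gives back "6392" rather than 6392. A small sketch, using a hand-written literal that mirrors part of the D shown above:

```python
import json

# Hand-written stand-in for part of D, with an integer outer key
D = {6392: {'2019-01-08': {'height': 3, 'length': 3}}}

text = json.dumps(D, indent=2)   # the integer key is written as "6392"
restored = json.loads(text)      # and comes back as the string '6392'
print(list(restored))
```

This matches the quoted keys in the JSON the question asks for, but code that reads the file back should look entries up by string.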
If you insist on wrapping the inner dictionaries in lists, do this:
for k in D:
    for m in D[k]:
        D[k][m] = [D[k][m]]
D
# {6392: {'2019-01-08': [{'height': 3, 'length': 3}],
# '2019-01-15': [{'height': 6, 'length': 3}]},
# 6393: {'2019-01-08': [{'height': 1, 'length': 5}],
# '2019-01-15': [{'height': 6, 'length': 3}]}}
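For completeness, here is a sketch of the whole path from the frame to JSON text in the shape the question asks for. It uses an explicit loop with droplevel instead of groupby/apply, which is a swapped-in technique rather than the answer's own. The helper name to_nested is hypothetical, and converting the outer keys with str() is my assumption, based on the quoted keys in the desired output.

```python
import json
import pandas as pd

df = pd.DataFrame(
    {'2019-01-08': [3, 3, 1, 5], '2019-01-15': [6, 3, 6, 3]},
    index=[[6392, 6392, 6393, 6393], ['height', 'length', 'height', 'length']],
)

def to_nested(frame):
    """Hypothetical helper: {outer: {column: [{inner: value}]}}."""
    nested = {}
    for outer, group in frame.groupby(level=0):
        # Drop the outer level so the rows are indexed by height/length only
        inner = group.droplevel(0)
        nested[str(outer)] = {
            # tolist() yields plain Python scalars, which json can serialize
            date: [dict(zip(column.index, column.tolist()))]
            for date, column in inner.items()
        }
    return nested

result = to_nested(df)
print(json.dumps(result, indent=2))
```

Building the list wrapper inside the comprehension avoids the second pass over D, at the cost of a slightly longer function.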