I have a dataframe with a MultiIndex. I would like to know whether I created the dataframe the right way (see below).
            01.01  02.01  03.01  04.01
bar total1     40     52     18     11
    total2     36     85      5     92
baz total1     23     39     45     70
    total2     50     49     51     65
foo total1     23     97     17     97
    total2     64     56     94     45
qux total1     13     73     38      4
    total2     80      8     61     50
df.index.values
gives:
array([('bar', 'total1'), ('bar', 'total2'), ('baz', 'total1'),
('baz', 'total2'), ('foo', 'total1'), ('foo', 'total2'),
('qux', 'total1'), ('qux', 'total2')], dtype=object)
df.index.get_level_values
gives:
<bound method MultiIndex.get_level_values of MultiIndex(levels=[[u'bar', u'baz', u'foo', u'qux'], [u'total1', u'total2']],
labels=[[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 0, 1, 0, 1, 0, 1]],names=[]
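Ultimately I want to convert df into a dictionary of dictionaries, so that the outer keys are ['bar','baz','foo','qux'], each outer value is a dictionary keyed by date, and each innermost dictionary is keyed by 'total1' and 'total2' with the integers of df as values. Put another way, if dict1 is that dict, then calling: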
dict1['bar']
would produce the output:
{u'bar':{'01.01':{'total1':40,'total2':36},'02.01':{'total1':52,'total2':85},'03.01':{'total1':18,'total2':5},'04.01':{'total1':11,'total2':92} } }
What do I need to change, and how, to achieve this? Is this an indexing problem?
Answer 0 (score: 10)
To convert the whole dataframe to a dictionary, try:
df.groupby(level=0).apply(lambda df: df.xs(df.name).to_dict()).to_dict()
{'bar': {'01.01': {'total1': 40, 'total2': 36},
'02.01': {'total1': 52, 'total2': 85},
'03.01': {'total1': 18, 'total2': 5},
'04.01': {'total1': 11, 'total2': 92}},
'baz': {'01.01': {'total1': 23, 'total2': 50},
'02.01': {'total1': 39, 'total2': 49},
'03.01': {'total1': 45, 'total2': 51},
'04.01': {'total1': 70, 'total2': 65}},
'foo': {'01.01': {'total1': 23, 'total2': 64},
'02.01': {'total1': 97, 'total2': 56},
'03.01': {'total1': 17, 'total2': 94},
'04.01': {'total1': 97, 'total2': 45}},
'qux': {'01.01': {'total1': 13, 'total2': 80},
'02.01': {'total1': 73, 'total2': 8},
'03.01': {'total1': 38, 'total2': 61},
'04.01': {'total1': 4, 'total2': 50}}}
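For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It rebuilds the dataframe from the question with MultiIndex.from_product (an assumption about how the frame was created, since the question does not show that step) and then builds the same nested dictionary with a plain dict comprehension over the first index level instead of groupby/apply. The names index, columns, data and nested are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (construction method is an assumption).
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["total1", "total2"]]
)
columns = ["01.01", "02.01", "03.01", "04.01"]
data = [
    [40, 52, 18, 11],
    [36, 85, 5, 92],
    [23, 39, 45, 70],
    [50, 49, 51, 65],
    [23, 97, 17, 97],
    [64, 56, 94, 45],
    [13, 73, 38, 4],
    [80, 8, 61, 50],
]
df = pd.DataFrame(data, index=index, columns=columns)

# For each outer key, take the cross-section (this drops the outer level)
# and let to_dict() produce {column: {inner_key: value}}.
nested = {
    key: df.xs(key, level=0).to_dict()
    for key in df.index.get_level_values(0).unique()
}

print(nested["bar"]["01.01"])
```

Note that get_level_values is a method and has to be called with a level, e.g. df.index.get_level_values(0); referencing it without parentheses only shows the bound method, as in the question.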
To convert one specific column, select it before converting it to a dictionary, i.e.
df.groupby(level=0).apply(lambda df: df.xs(df.name)[colname].to_dict()).to_dict()
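As a rough illustration of the single-column case, the sketch below uses a dict comprehension rather than groupby/apply, which is a swapped-in but equivalent approach for this data. The frame is rebuilt from the question's values, and colname = "01.01" is just an example choice, not something given in the original.

```python
import pandas as pd

# Same data as in the question; how the frame was built is an assumption.
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["total1", "total2"]]
)
df = pd.DataFrame(
    [
        [40, 52, 18, 11],
        [36, 85, 5, 92],
        [23, 39, 45, 70],
        [50, 49, 51, 65],
        [23, 97, 17, 97],
        [64, 56, 94, 45],
        [13, 73, 38, 4],
        [80, 8, 61, 50],
    ],
    index=index,
    columns=["01.01", "02.01", "03.01", "04.01"],
)

colname = "01.01"  # hypothetical choice of column

# Select the column from each cross-section first, then convert the
# resulting Series to {inner_key: value}.
single = {
    key: df.xs(key, level=0)[colname].to_dict()
    for key in df.index.get_level_values(0).unique()
}

print(single["bar"])
```

Here each outer key maps directly to a {'total1': ..., 'total2': ...} dictionary for the chosen column, with no date level in between.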