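I'm at my wits' end... I have a three-column DataFrame (aff_id, mkt and bkgs), which I group by two of the columns (aff_id and mkt):
df_gb_aff = df.groupby(["affiliate_id", 'mkt']).sum()
# DataFrame.sort was removed in pandas 0.20; sort_values is its replacement
df_gb_aff.sort_values('bkgs', ascending=False, inplace=True)
This gives me a multi-index DataFrame that looks a bit like this:
                                  bkgs
aff_id       mkt
2508b863a1a4 bcab9d6ec630  1910.707124
6cc5f0e8c96b b7d0dbd38376  1374.924684
188e238326e4 446bb566f202  1206.589522
             dbe759c691eb  1203.979908
6cc5f0e8c96b 0e9013464c4c  1203.532310
What I'm now trying to do is iterate over each aff_id and build a dict of mkt (key) to bkgs (value) pairs. Because each aff_id has a different set of mkt values, Python throws an error whenever df_gb_aff.loc[index_1, index_2] doesn't exist.
I've been grabbing the index values with these:
aff_list = df_gb_aff.index.levels[0].values
mkt_list = df_gb_aff.index.levels[1].values
and trying to iterate over them.
Does anyone have a sensible way of doing this?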
Answer 0 (score: 1)
Another solution, using a dict comprehension:
# .loc replaces .ix, which was removed in pandas 1.0
d = {idx[1]: df_gb_aff.loc[idx, 'bkgs'] for idx in df_gb_aff.index}
print (d)
{'446bb566f202': 1206.589522,
'bcab9d6ec630': 1910.7071239999998,
'0e9013464c4c': 1203.5323100000001,
'dbe759c691eb': 1203.979908,
'b7d0dbd38376': 1374.9246840000001}
print (d['bcab9d6ec630'])
1910.707124
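Note that a dict keyed on mkt alone is flat, so it only works if each mkt appears under a single aff_id. If one dict per aff_id is what is wanted, a nested dict may be closer to the goal. The following is a minimal sketch, not a definitive implementation: the sample frame is made up from the rows shown in the question, and the column is assumed to be named aff_id, as in the printed table. It walks only the (aff_id, mkt) pairs that actually exist, so no lookup can fail.

```python
import pandas as pd

# Hypothetical sample data, copied from the table in the question.
df = pd.DataFrame({
    "aff_id": ["2508b863a1a4", "6cc5f0e8c96b", "188e238326e4",
               "188e238326e4", "6cc5f0e8c96b"],
    "mkt": ["bcab9d6ec630", "b7d0dbd38376", "446bb566f202",
            "dbe759c691eb", "0e9013464c4c"],
    "bkgs": [1910.707124, 1374.924684, 1206.589522,
             1203.979908, 1203.532310],
})

# Series of summed bkgs, indexed by the (aff_id, mkt) MultiIndex.
bkgs = df.groupby(["aff_id", "mkt"])["bkgs"].sum()

# Build {aff_id: {mkt: bkgs}} from the pairs that are present in the index.
nested = {}
for (aff_id, mkt), value in bkgs.items():
    nested.setdefault(aff_id, {})[mkt] = value

print(nested["188e238326e4"])
```

Because the loop iterates over the index itself rather than over every combination of the two levels, each aff_id ends up with only its own mkt keys.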
If you need to loop over the MultiIndex:
for idx in df_gb_aff.index:
    print(idx)
    # .loc replaces .ix, which was removed in pandas 1.0
    print(df_gb_aff.loc[idx])
('2508b863a1a4', 'bcab9d6ec630')
bkgs 1910.707124
Name: (2508b863a1a4, bcab9d6ec630), dtype: float64
('6cc5f0e8c96b', 'b7d0dbd38376')
bkgs 1374.924684
Name: (6cc5f0e8c96b, b7d0dbd38376), dtype: float64
('188e238326e4', '446bb566f202')
bkgs 1206.589522
Name: (188e238326e4, 446bb566f202), dtype: float64
('188e238326e4', 'dbe759c691eb')
bkgs 1203.979908
Name: (188e238326e4, dbe759c691eb), dtype: float64
('6cc5f0e8c96b', '0e9013464c4c')
bkgs 1203.53231
Name: (6cc5f0e8c96b, 0e9013464c4c), dtype: float64
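The error in the question comes from pairing every value of the first level with every value of the second, which produces combinations that are not in the index. If individual lookups are still needed, one option is to test for the key first. This is only a sketch under assumptions: the sample frame is made up from the rows in the question, the column is assumed to be named aff_id, and lookup is a hypothetical helper, not a pandas function.

```python
import pandas as pd

# Hypothetical sample data, copied from the table in the question.
df = pd.DataFrame({
    "aff_id": ["2508b863a1a4", "6cc5f0e8c96b", "188e238326e4",
               "188e238326e4", "6cc5f0e8c96b"],
    "mkt": ["bcab9d6ec630", "b7d0dbd38376", "446bb566f202",
            "dbe759c691eb", "0e9013464c4c"],
    "bkgs": [1910.707124, 1374.924684, 1206.589522,
             1203.979908, 1203.532310],
})
df_gb_aff = df.groupby(["aff_id", "mkt"]).sum()


def lookup(frame, aff_id, mkt, default=None):
    """Return bkgs for (aff_id, mkt), or default if the pair is absent."""
    key = (aff_id, mkt)
    # Membership on a MultiIndex accepts a full tuple key.
    if key in frame.index:
        return frame.loc[key, "bkgs"]
    return default


print(lookup(df_gb_aff, "188e238326e4", "dbe759c691eb"))
print(lookup(df_gb_aff, "188e238326e4", "bcab9d6ec630"))
```

The second call asks for a pair that does not exist in the grouped frame, so it falls back to the default instead of raising a KeyError.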