Pandas: converting a nested dictionary to MultiIndex rows and columns

Time: 2020-02-27 19:24:19

Tags: python pandas dataframe dictionary multi-index

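I have a nested dictionary that I want to turn into a DataFrame with MultiIndex rows and columns, as shown below. However, my data somehow gets lost in the resulting table.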

    import pandas as pd

    test = {12: {'Category 1': {'TestA': {'att_1': 1, 'att_2': 'whatever'}, 'TestB': {'att_1': 3, 'att_2': 'spring'}}, 'Category 2': {'TestA': {'att_1': 23, 'att_2': 'another'}, 'TestB': {'att_1': 9, 'att_2': 'summer'}}}, 15: {'Category 1': {'TestA': {'att_1': 10, 'att_2': 'foo'}, 'TestB': {'att_1': 29, 'att_2': 'fall'}}, 'Category 2': {'TestA': {'att_1': 30, 'att_2': 'bar'}, 'TestB': {'att_1': 36, 'att_2': 'winter'}}}}
    columns = pd.MultiIndex.from_arrays([['TestA', 'TestA', 'TestB', 'TestB'], ['att_1', 'att_2', 'att_1', 'att_2']])

The format I want:

              TestA       TestB      
              att_1 att_2 att_1 att_2
12 Category 1   NaN   NaN   NaN   NaN
   Category 2   NaN   NaN   NaN   NaN
15 Category 1   NaN   NaN   NaN   NaN
   Category 2   NaN   NaN   NaN   NaN

This is what I tried:

    pd.DataFrame(test,index=pd.MultiIndex.from_arrays([[12,12,15,15],['Category 1','Category 2','Category 1','Category 2']]),columns=pd.MultiIndex.from_arrays([['TestA','TestA','TestB','TestB'],['att_1','att_2','att_1','att_2']]))

My data is lost, as shown below:

              TestA       TestB
              att_1 att_2 att_1 att_2
12 Category 1   NaN   NaN   NaN   NaN
   Category 2   NaN   NaN   NaN   NaN
15 Category 1   NaN   NaN   NaN   NaN
   Category 2   NaN   NaN   NaN   NaN

It works if I only build MultiIndex rows, but I want MultiIndex rows and columns.

    pd.DataFrame.from_dict({(i, j): test[i][j]
                            for i in test.keys()
                            for j in test[i].keys()},
                           orient='index')

                                           TestA                             TestB
12 Category 1  {'att_1': 1, 'att_2': 'whatever'}   {'att_1': 3, 'att_2': 'spring'}
   Category 2  {'att_1': 23, 'att_2': 'another'}   {'att_1': 9, 'att_2': 'summer'}
15 Category 1      {'att_1': 10, 'att_2': 'foo'}    {'att_1': 29, 'att_2': 'fall'}
   Category 2      {'att_1': 30, 'att_2': 'bar'}  {'att_1': 36, 'att_2': 'winter'}

1 answer:

Answer 0 (score: 0)

You can get the desired DataFrame as follows:
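A minimal sketch of one possible approach, not necessarily the only or best one: flatten the four-level dictionary into a two-level dictionary whose outer keys are `(row_outer, row_inner)` tuples and whose inner keys are `(column_outer, column_inner)` tuples, then pass it to `pd.DataFrame.from_dict` with `orient='index'`. Pandas turns tuple keys into MultiIndex levels on both axes. The names `flat` and `df` are illustrative. The attempt in the question most likely returns only NaN because, when a dict is passed together with `index` and `columns`, pandas treats the top-level keys (12 and 15) as column labels and then reindexes to the requested labels, none of which match.

```python
import pandas as pd

test = {
    12: {'Category 1': {'TestA': {'att_1': 1, 'att_2': 'whatever'},
                        'TestB': {'att_1': 3, 'att_2': 'spring'}},
         'Category 2': {'TestA': {'att_1': 23, 'att_2': 'another'},
                        'TestB': {'att_1': 9, 'att_2': 'summer'}}},
    15: {'Category 1': {'TestA': {'att_1': 10, 'att_2': 'foo'},
                        'TestB': {'att_1': 29, 'att_2': 'fall'}},
         'Category 2': {'TestA': {'att_1': 30, 'att_2': 'bar'},
                        'TestB': {'att_1': 36, 'att_2': 'winter'}}},
}

# Flatten: outer keys become (id, category) row tuples,
# inner keys become (test name, attribute) column tuples.
flat = {
    (outer, category): {
        (name, att): value
        for name, attrs in tests.items()
        for att, value in attrs.items()
    }
    for outer, categories in test.items()
    for category, tests in categories.items()
}

# Tuple keys are converted to MultiIndex levels on both axes.
df = pd.DataFrame.from_dict(flat, orient='index')

# A single cell is addressed by a row tuple and a column tuple.
cell = df.loc[(12, 'Category 1'), ('TestA', 'att_1')]
```

Because the values are placed by label rather than by position, this sketch does not depend on every category containing the same tests or attributes; missing combinations would simply appear as NaN.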
