Correcting the sort order of a pandas index

Time: 2018-09-10 22:46:13

Tags: python pandas numpy dataframe

I have a DataFrame like the one below. The Date index has dtype datetime64[ns].

           symbol        high         low
Date                                      
2018-08-16     spy  285.040009  283.359985
2018-08-17     spy  285.559998  283.369995
2018-08-16    nflx  331.170013  321.209991
2018-08-17    nflx  324.369995  312.959991
2017-07-17     spy  245.910004  245.330002
2017-07-18     spy  245.720001  244.669998

My goal is to index the frame first by symbol and then by Date, like this:

                          high         low
symbol Date
spy     2017-07-17  245.910004  245.330002
        2017-07-18  245.720001  244.669998
        2018-08-16  285.040009  283.359985
        2018-08-17  285.559998  283.369995
nflx    2018-08-16  331.170013  321.209991
        2018-08-17  324.369995  312.959991

Here is what I tried. First I reset the Date index, which gives the following output:

df.reset_index(level=['Date'], inplace=True)

        Date symbol        high         low
0 2018-08-16     spy  285.040009  283.359985
1 2018-08-17     spy  285.559998  283.369995
2 2018-08-16    nflx  331.170013  321.209991
3 2018-08-17    nflx  324.369995  312.959991
4 2017-07-17     spy  245.910004  245.330002
5 2017-07-18     spy  245.720001  244.669998

Finally I set the index on symbol and Date, which returns output I don't want:

df.set_index(['symbol', 'Date'], inplace=True)

                          high         low
symbol Date                              
spy     2018-08-16  285.040009  283.359985
        2018-08-17  285.559998  283.369995
nflx    2018-08-16  331.170013  321.209991
        2018-08-17  324.369995  312.959991
spy     2017-07-17  245.910004  245.330002
        2017-07-18  245.720001  244.669998
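
A minimal sketch of why this happens, assuming the frame is rebuilt from the values printed above (the construction code is not part of the original post): set_index only moves columns into the index and keeps the rows in their existing order, so the resulting MultiIndex is not sorted until sort_index is called.

```python
import pandas as pd

# Sample frame rebuilt from the values shown in the question (assumed construction).
df = pd.DataFrame(
    {
        "symbol": ["spy", "spy", "nflx", "nflx", "spy", "spy"],
        "high": [285.040009, 285.559998, 331.170013, 324.369995, 245.910004, 245.720001],
        "low": [283.359985, 283.369995, 321.209991, 312.959991, 245.330002, 244.669998],
    },
    index=pd.DatetimeIndex(
        ["2018-08-16", "2018-08-17", "2018-08-16", "2018-08-17", "2017-07-17", "2017-07-18"],
        name="Date",
    ),
)

# Same two steps as in the question, written without inplace.
indexed = df.reset_index().set_index(["symbol", "Date"])

# set_index keeps the original row order; it does not sort.
print(indexed.index.is_monotonic_increasing)               # False
print(indexed.sort_index().index.is_monotonic_increasing)  # True
```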

2 answers:

Answer 0 (score: 1)

IIUC, you can try swaplevel followed by sort_index:

df.set_index('symbol', append=True).swaplevel().sort_index(level=[0,1],ascending=[False,True])

                         high         low
symbol Date                              
spy    2017-07-17  245.910004  245.330002
       2017-07-18  245.720001  244.669998
       2018-08-16  285.040009  283.359985
       2018-08-17  285.559998  283.369995
nflx   2018-08-16  331.170013  321.209991
       2018-08-17  324.369995  312.959991
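
For a self-contained run of this approach, here is a sketch that rebuilds the sample frame from the values in the question (the construction code is an assumption, not part of the original post) and applies the same chain. Sorting symbol in descending order puts spy before nflx only because "spy" happens to sort after "nflx" alphabetically.

```python
import pandas as pd

# Sample frame rebuilt from the values shown in the question (assumed construction).
df = pd.DataFrame(
    {
        "symbol": ["spy", "spy", "nflx", "nflx", "spy", "spy"],
        "high": [285.040009, 285.559998, 331.170013, 324.369995, 245.910004, 245.720001],
        "low": [283.359985, 283.369995, 321.209991, 312.959991, 245.330002, 244.669998],
    },
    index=pd.DatetimeIndex(
        ["2018-08-16", "2018-08-17", "2018-08-16", "2018-08-17", "2017-07-17", "2017-07-18"],
        name="Date",
    ),
)

result = (
    df.set_index("symbol", append=True)  # index levels become (Date, symbol)
      .swaplevel()                       # swap them to (symbol, Date)
      .sort_index(level=[0, 1], ascending=[False, True])  # symbol descending, Date ascending
)
print(result)
```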

Answer 1 (score: 1)

I'm not a fan of inplace. Try df.sort_index():

df.reset_index(level=['Date'], inplace=True)
df.set_index(['symbol', 'Date'], inplace=True)
print(df.sort_index())

Output (symbol is sorted in ascending order, so nflx comes before spy):

                        high         low
symbol Date                              
nflx   2018-08-16  331.170013  321.209991
       2018-08-17  324.369995  312.959991
spy    2017-07-17  245.910004  245.330002
       2017-07-18  245.720001  244.669998
       2018-08-16  285.040009  283.359985
       2018-08-17  285.559998  283.369995
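
The same steps can be chained so that nothing is modified in place. The sketch below also includes a variant that is not taken from either answer: it keeps the symbols in their order of first appearance (spy, then nflx) instead of relying on alphabetical order. The frame construction and the helper column name `_order` are assumptions made for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the values shown in the question (assumed construction).
df = pd.DataFrame(
    {
        "symbol": ["spy", "spy", "nflx", "nflx", "spy", "spy"],
        "high": [285.040009, 285.559998, 331.170013, 324.369995, 245.910004, 245.720001],
        "low": [283.359985, 283.369995, 321.209991, 312.959991, 245.330002, 244.669998],
    },
    index=pd.DatetimeIndex(
        ["2018-08-16", "2018-08-17", "2018-08-16", "2018-08-17", "2017-07-17", "2017-07-18"],
        name="Date",
    ),
)

# Chained version of the steps above; df itself is left unchanged.
alphabetical = df.reset_index().set_index(["symbol", "Date"]).sort_index()

# Hypothetical variant: rank each symbol by where it first appears,
# sort on that rank and then on Date, and only then build the index.
flat = df.reset_index()
first_seen = {sym: pos for pos, sym in enumerate(flat["symbol"].unique())}
as_requested = (
    flat.assign(_order=flat["symbol"].map(first_seen))  # temporary helper column
        .sort_values(["_order", "Date"])
        .drop(columns="_order")
        .set_index(["symbol", "Date"])
)
print(as_requested)
```

The variant sorts before calling set_index, so calling sort_index on the result afterwards would put the symbols back into alphabetical order.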