Pandas: sorting two DataFrames with sort_values, then sub-sorting by date

Asked: 2016-07-11 05:41:11

Tags: python sorting pandas

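I have two data frames that contain similar types of information. I am trying to combine them and reorganize the result. Here are samples of the data frames: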

df1 = 
Member Nbr    Name-First      Name-Last      Date-Join 
20                    Zoe        Soumas     2011-08-01   
3128               Julien        Bougie     2011-07-22   
3535               Michel        Bibeau     2015-02-18   
4116          Christopher        Duthie     2014-12-02   
4700                Manoj       Chauhan     2014-11-11   
4802                 Anna        Balian     2014-07-26   
5004             Abdullah         Cekic     2012-03-12   
5130             Raymonde        Girard     2011-01-04  



df2 =      
Member Nbr    Name-First      Name-Last      Date-Join 
3762              Robert        Ortopan     2010-01-31   
3762              Robert        Ortopan     2010-02-28   
3892           Christian         Burnet     2010-03-24   
3892           Christian         Burnet     2010-04-24   
5022              Robert      Ngabirano     2010-06-25   
5022              Robert      Ngabirano     2010-07-28 

What I want is a single data frame sorted by Member Nbr, where a member who appears more than once is further ordered by join date. So I would have:

df12 =  
Member Nbr    Name-First      Name-Last      Date-Join 
20                   Zoe         Soumas     2011-08-01   
3128              Julien         Bougie     2011-07-22   
3535              Michel         Bibeau     2015-02-18  
3762              Robert        Ortopan     2010-01-31   
3762              Robert        Ortopan     2010-02-28
3892           Christian         Burnet     2010-03-24   
3892           Christian         Burnet     2010-04-24     
4116         Christopher         Duthie     2014-12-02   
4700               Manoj        Chauhan     2014-11-11   
4802                Anna         Balian     2014-07-26   
5004            Abdullah          Cekic     2012-03-12
5022              Robert      Ngabirano     2010-06-25   
5022              Robert      Ngabirano     2010-07-28    
5130            Raymonde         Girard     2011-01-04 

I managed to concatenate the two data frames with df12 = pd.concat([df1, df2], ignore_index=True), which places df2 below df1. After then running

df12.sort_values(by='Member Nbr', axis=0, inplace=True)

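the members are in ascending order, but members that appear more than once (with different join dates) come out in descending date order. That is: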

Member Nbr    Name-First      Name-Last      Date-Join 
20                   Zoe         Soumas     2011-08-01   
3128              Julien         Bougie     2011-07-22   
3535              Michel         Bibeau     2015-02-18  
3762              Robert        Ortopan     2010-02-28  # Wrongly sorted 
3762              Robert        Ortopan     2010-01-31
3892           Christian         Burnet     2010-04-24  # Wrongly sorted  
3892           Christian         Burnet     2010-03-24     
4116         Christopher         Duthie     2014-12-02   
4700               Manoj        Chauhan     2014-11-11   
4802                Anna         Balian     2014-07-26   
5004            Abdullah          Cekic     2012-03-12
5022              Robert      Ngabirano     2010-07-28 # Wrongly sorted   
5022              Robert      Ngabirano     2010-06-25    
5130            Raymonde         Girard     2011-01-04

Is there a way to have the members with multiple join dates sorted by date in ascending order?

1 answer:

Answer 0 (score: 1)

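The by parameter can be a list of columns, so that the data frame is sorted by the first column first, with ties broken by the second column, remaining ties by the third, and so on.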

df12.sort_values(by=['Member Nbr', 'Date-Join'], inplace=True)

yields

    Member Nbr   Name-First  Name-Last  Date-Join
0           20          Zoe     Soumas 2011-08-01
1         3128       Julien     Bougie 2011-07-22
2         3535       Michel     Bibeau 2015-02-18
4         3762       Robert    Ortopan 2010-01-31
3         3762       Robert    Ortopan 2010-02-28
6         3892    Christian     Burnet 2010-03-24
5         3892    Christian     Burnet 2010-04-24
7         4116  Christopher     Duthie 2014-12-02
8         4700        Manoj    Chauhan 2014-11-11
9         4802         Anna     Balian 2014-07-26
10        5004     Abdullah      Cekic 2012-03-12
12        5022       Robert  Ngabirano 2010-06-25
11        5022       Robert  Ngabirano 2010-07-28
13        5130     Raymonde     Girard 2011-01-04
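
For anyone who wants to try this end to end, here is a minimal sketch using a few rows from the question. It assumes Date-Join is already a datetime column, and the two 3762 rows are deliberately listed newest-first (unlike the original df2) so that the second sort key has something to do. The reset_index(drop=True) call is an optional extra, not part of the answer above; it only renumbers the rows so the index no longer shows the pre-sort order.

```python
import pandas as pd

# A few rows taken from the question. The 3762 rows are deliberately
# listed newest-first so the secondary sort key has a visible effect.
df1 = pd.DataFrame({
    "Member Nbr": [20, 3128, 5130],
    "Name-First": ["Zoe", "Julien", "Raymonde"],
    "Name-Last": ["Soumas", "Bougie", "Girard"],
    "Date-Join": pd.to_datetime(["2011-08-01", "2011-07-22", "2011-01-04"]),
})
df2 = pd.DataFrame({
    "Member Nbr": [3762, 3762],
    "Name-First": ["Robert", "Robert"],
    "Name-Last": ["Ortopan", "Ortopan"],
    "Date-Join": pd.to_datetime(["2010-02-28", "2010-01-31"]),
})

# Stack df2 underneath df1 with a fresh 0..n-1 index.
df12 = pd.concat([df1, df2], ignore_index=True)

# Sort by member number, break ties by join date, then renumber the rows.
df12 = df12.sort_values(by=["Member Nbr", "Date-Join"]).reset_index(drop=True)
print(df12)
```

Using the returned frame instead of inplace=True is only a style choice here; both forms apply the same ordering.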

Note that for this to work correctly, the Date-Join column should be of datetime type.
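
To illustrate why the column type matters, here is a small sketch with made-up month/day/year strings (the format and the values are assumptions for illustration, not data from the question). Dates written as YYYY-MM-DD text happen to sort chronologically even as strings, but most other formats do not, so converting with pd.to_datetime before sorting is the safer route.

```python
import pandas as pd

# Hypothetical input: join dates stored as month/day/year strings.
df = pd.DataFrame({
    "Member Nbr": [3762, 3762],
    "Date-Join": ["10/01/2010", "02/28/2011"],
})

# Sorted as text, "02/28/2011" comes before "10/01/2010",
# which is not chronological order.
as_text = df.sort_values(by=["Member Nbr", "Date-Join"])["Date-Join"].tolist()

# Parse the strings into datetimes, then sort again.
df["Date-Join"] = pd.to_datetime(df["Date-Join"], format="%m/%d/%Y")
as_dates = df.sort_values(by=["Member Nbr", "Date-Join"])["Date-Join"].tolist()

print(as_text)
print(as_dates)
```

Passing an explicit format avoids any ambiguity between day-first and month-first strings; adjust it to match the actual data.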