Pandas: merging two multi-level Series

Time: 2016-07-21 05:39:52

Tags: pandas dataframe merge series multi-index

I have two multi-level Series and want to merge them on both index levels. The first Series looks like this:

                                              # of restaurants    
BORO           CUISINE      
BRONX          American                                425
               Chinese                                 330
               Pizza                                   206 
BROOKLYN       American                               1254
               Chinese                                 750
               Cafe/Coffee/Tea                         350

The second one has more rows, like this:

                                                # of votes    
BORO           CUISINE      
BRONX          American                                2425
               Caribbean                               320
               Chinese                                 3130
               Pizza                                   3336 
BROOKLYN       American                               21254
               Caribbean                               2320
               Chinese                                 7250
               Cafe/Coffee/Tea                         3350
               Pizza                                   13336 

1 answer:

Answer 0 (score: 2)

Setup:

import pandas as pd

# Tuple keys become the two levels of a MultiIndex
s1 = pd.Series({('BRONX', 'American'): 425, ('BROOKLYN', 'Chinese'): 750, ('BROOKLYN', 'Cafe/Coffee/Tea'): 350, ('BRONX', 'Pizza'): 206, ('BROOKLYN', 'American'): 1254, ('BRONX', 'Chinese'): 330})
s2 = pd.Series({('BRONX', 'Caribbean'): 320, ('BRONX', 'American'): 2425, ('BROOKLYN', 'Chinese'): 7250, ('BROOKLYN', 'Cafe/Coffee/Tea'): 3350, ('BRONX', 'Pizza'): 3336, ('BROOKLYN', 'American'): 21254, ('BROOKLYN', 'Pizza'): 13336, ('BRONX', 'Chinese'): 3130, ('BROOKLYN', 'Caribbean'): 2320})
# Name the index levels and the Series; the Series names become column names later
s1 = s1.rename_axis(['BORO','CUISINE']).rename('restaurants')
s2 = s2.rename_axis(['BORO','CUISINE']).rename('votes')


print (s1)
BORO      CUISINE        
BRONX     American            425
          Chinese             330
          Pizza               206
BROOKLYN  American           1254
          Chinese             750
          Cafe/Coffee/Tea     350
Name: restaurants, dtype: int64

print (s2)
BORO      CUISINE        
BRONX     American            2425
          Caribbean            320
          Chinese             3130
          Pizza               3336
BROOKLYN  American           21254
          Caribbean           2320
          Chinese             7250
          Cafe/Coffee/Tea     3350
          Pizza              13336
Name: votes, dtype: int64

If you need an inner join, use concat with the parameter join='inner':

print (pd.concat([s1,s2], axis=1, join='inner'))
                          restaurants  votes
BORO     CUISINE                            
BRONX    American                 425   2425
         Chinese                  330   3130
         Pizza                    206   3336
BROOKLYN American                1254  21254
         Cafe/Coffee/Tea          350   3350
         Chinese                  750   7250
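To make it clearer which rows survive the inner join, here is a minimal sketch on a smaller, made-up subset of the data (the values are taken from the question, but the reduced Series and the names inner, common and same_keys are only for illustration). It compares the index of the concat result with the intersection of the two MultiIndexes:

```python
import pandas as pd

# Reduced versions of s1 and s2, for illustration only
s1 = pd.Series(
    {('BRONX', 'American'): 425, ('BRONX', 'Pizza'): 206, ('BROOKLYN', 'Chinese'): 750},
    name='restaurants',
).rename_axis(['BORO', 'CUISINE'])
s2 = pd.Series(
    {('BRONX', 'American'): 2425, ('BRONX', 'Caribbean'): 320,
     ('BRONX', 'Pizza'): 3336, ('BROOKLYN', 'Chinese'): 7250},
    name='votes',
).rename_axis(['BORO', 'CUISINE'])

# Inner join: only (BORO, CUISINE) pairs present in both Series are kept
inner = pd.concat([s1, s2], axis=1, join='inner')

# The surviving keys should match the intersection of the two indexes
common = s1.index.intersection(s2.index)
same_keys = set(inner.index) == set(common)
print(same_keys)
```

Here ('BRONX', 'Caribbean') exists only in s2, so it does not appear in the inner result.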

# join='outer' is the default, so it can be omitted
print (pd.concat([s1,s2], axis=1))
                          restaurants  votes
BORO     CUISINE                            
BRONX    American               425.0   2425
         Caribbean                NaN    320
         Chinese                330.0   3130
         Pizza                  206.0   3336
BROOKLYN American              1254.0  21254
         Cafe/Coffee/Tea        350.0   3350
         Caribbean                NaN   2320
         Chinese                750.0   7250
         Pizza                    NaN  13336
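Note that restaurants became float in the outer join because NaN cannot be stored in an int64 column. If integer counts are needed, one possible approach is sketched below on a reduced, illustrative subset of the data. It assumes that a missing entry can be treated as zero restaurants, which may or may not be appropriate for your data:

```python
import pandas as pd

# Reduced versions of s1 and s2, for illustration only
s1 = pd.Series(
    {('BRONX', 'American'): 425, ('BRONX', 'Chinese'): 330},
    name='restaurants',
).rename_axis(['BORO', 'CUISINE'])
s2 = pd.Series(
    {('BRONX', 'American'): 2425, ('BRONX', 'Caribbean'): 320, ('BRONX', 'Chinese'): 3130},
    name='votes',
).rename_axis(['BORO', 'CUISINE'])

outer = pd.concat([s1, s2], axis=1)
print(outer['restaurants'].dtype)  # float64, because of the NaN for Caribbean

# Assumption: a missing count means 0 restaurants
filled = outer.fillna({'restaurants': 0}).astype({'restaurants': 'int64'})
print(filled)
```

If zero is not a sensible stand-in, leaving the NaN values in place keeps the distinction between "no data" and "zero" visible.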

Another solution is merge together with reset_index:

# how='inner' is the default, so it can be omitted
print (pd.merge(s1.reset_index(), s2.reset_index(), on=['BORO','CUISINE']))
       BORO          CUISINE  restaurants  votes
0     BRONX         American          425   2425
1     BRONX          Chinese          330   3130
2     BRONX            Pizza          206   3336
3  BROOKLYN         American         1254  21254
4  BROOKLYN          Chinese          750   7250
5  BROOKLYN  Cafe/Coffee/Tea          350   3350

# outer join
print (pd.merge(s1.reset_index(), s2.reset_index(), on=['BORO','CUISINE'], how='outer'))
       BORO          CUISINE  restaurants  votes
0     BRONX         American        425.0   2425
1     BRONX          Chinese        330.0   3130
2     BRONX            Pizza        206.0   3336
3  BROOKLYN         American       1254.0  21254
4  BROOKLYN          Chinese        750.0   7250
5  BROOKLYN  Cafe/Coffee/Tea        350.0   3350
6     BRONX        Caribbean          NaN    320
7  BROOKLYN        Caribbean          NaN   2320
8  BROOKLYN            Pizza          NaN  13336
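The merge result has a plain integer index instead of the original MultiIndex. A minimal sketch of how it could be restored, again on a reduced, illustrative subset of the data, is shown here. It also passes indicator=True, which adds a _merge column telling where each row came from; that part is optional and not something the answer above uses:

```python
import pandas as pd

# Reduced versions of s1 and s2, for illustration only
s1 = pd.Series(
    {('BRONX', 'American'): 425, ('BROOKLYN', 'Chinese'): 750},
    name='restaurants',
).rename_axis(['BORO', 'CUISINE'])
s2 = pd.Series(
    {('BRONX', 'American'): 2425, ('BRONX', 'Caribbean'): 320, ('BROOKLYN', 'Chinese'): 7250},
    name='votes',
).rename_axis(['BORO', 'CUISINE'])

merged = pd.merge(
    s1.reset_index(), s2.reset_index(),
    on=['BORO', 'CUISINE'], how='outer',
    indicator=True,  # adds a '_merge' column: 'both', 'left_only' or 'right_only'
)

# Move the key columns back into a two-level index and sort it
result = merged.set_index(['BORO', 'CUISINE']).sort_index()
print(result)
```

Rows marked right_only exist only in s2, which is where the NaN values in restaurants come from.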