Pandas: adding a value to an inner level of a hierarchical index

Date: 2017-03-04 19:13:13

Tags: python pandas dataframe multi-index

I have a Pandas DataFrame with a hierarchical index (MultiIndex). I created this DataFrame by grouping on the values of "cousub" and "year".

annualMed = df.groupby(["cousub", "year"])[["ratio", "sr_val_transfer"]].median().round(2)
print(annualMed.head(8))

                      ratio  sr_val_transfer
cousub          year                        
Allen Park city 2013   0.51          75000.0
                2014   0.47          85950.0
                2015   0.47          95030.0
                2016   0.45         102500.0
Belleville city 2013   0.49         113900.0
                2014   0.55         114750.0
                2015   0.53         149000.0
                2016   0.48         121500.0    

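I would like to add an "Overall" value to the "year" level, which I could then populate with values from a separate grouping on "cousub" alone, i.e. excluding "year". I would like the result to look like the following: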

                      ratio  sr_val_transfer
cousub          year                        
Allen Park city 2013   0.51          75000.0
                2014   0.47          85950.0
                2015   0.47          95030.0
                2016   0.45         102500.0
             Overall   0.50          90000.0
Belleville city 2013   0.49         113900.0
                2014   0.55         114750.0
                2015   0.53         149000.0
                2016   0.48         121500.0 
             Overall   0.50         135000.0

How can I add this new item to the "year" level of the MultiIndex?

1 answer:

Answer 0 (score: 1)

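If you only want to add these two rows explicitly, you can use loc and specify every MultiIndex level. Here df is the grouped DataFrame (annualMed in the question).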

df.loc[('Allen Park city', 'Overall'), :] = (0.50, 90000.)
df.loc[('Belleville city', 'Overall'), :] = (0.50, 135000.)
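
For readers who want to try this without the original data, here is a minimal, self-contained sketch of the same loc assignment. The toy frame reuses a few values from the question; storing the year labels as strings is an assumption made here so that they sort unambiguously next to "Overall" (in the question they appear to be integers).

```python
import pandas as pd

# Toy stand-in for the grouped frame in the question (values copied from it).
# Assumption: year labels are strings so they sort alongside "Overall".
index = pd.MultiIndex.from_product(
    [["Allen Park city", "Belleville city"], ["2013", "2014"]],
    names=["cousub", "year"],
)
annual = pd.DataFrame(
    {
        "ratio": [0.51, 0.47, 0.49, 0.55],
        "sr_val_transfer": [75000.0, 85950.0, 113900.0, 114750.0],
    },
    index=index,
)

# Assigning to a (cousub, year) key that does not exist yet adds a new row.
annual.loc[("Allen Park city", "Overall"), :] = (0.50, 90000.0)
annual.loc[("Belleville city", "Overall"), :] = (0.50, 135000.0)

# The new rows land at the end, so sort to group them under each cousub.
annual = annual.sort_index()
print(annual)
```

The sort step matters because rows added this way are placed at the bottom of the frame rather than next to their cousub.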

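If you have a whole list of cities that each need this row, that gets tedious. Instead, you could append another DataFrame holding the overall values, with a bit of index manipulation.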

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
overall = pd.DataFrame([['Allen Park city', 'Overall', 0.5, 90000.],
                        ['Belleville city', 'Overall', 0.5, 135000.]],
                       columns=list(df.index.names) + list(df.columns))

(pd.concat([df.reset_index(), overall])
   .set_index(df.index.names)
   .sort_index())
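
The question mentions filling the "Overall" rows from a grouping on "cousub" alone. One way that could be sketched, rather than typing the values by hand, is shown below. The raw data, the city names "A city" and "B city", and the variable names are all hypothetical, and casting year to strings is an assumption made so the labels sort together with "Overall".

```python
import pandas as pd

# Hypothetical raw data standing in for the question's ungrouped `df`.
raw = pd.DataFrame({
    "cousub": ["A city"] * 4 + ["B city"] * 4,
    "year": [2013, 2013, 2014, 2014] * 2,
    "ratio": [0.50, 0.52, 0.46, 0.48, 0.40, 0.60, 0.54, 0.56],
    "sr_val_transfer": [70000.0, 80000.0, 85000.0, 87000.0,
                        110000.0, 118000.0, 114000.0, 116000.0],
})
cols = ["ratio", "sr_val_transfer"]

# Assumption: cast year to str so the labels sort together with "Overall".
raw["year"] = raw["year"].astype(str)

# Per-cousub, per-year medians, as in the question.
annual = raw.groupby(["cousub", "year"])[cols].median().round(2)

# Medians per cousub across all years, tagged with a constant "Overall" label
# that is then moved into the index as the second level.
overall = raw.groupby("cousub")[cols].median().round(2)
overall = overall.assign(year="Overall").set_index("year", append=True)

# Both frames now share the (cousub, year) index, so they can be stacked.
combined = pd.concat([annual, overall]).sort_index()
print(combined)
```

Because both frames carry the same index names and columns, pd.concat stacks them directly and no reset_index round trip is needed.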

Demo

Method 1 (small case)

>>> df.loc[('Allen Park city', 'Overall'), :] = (0.50, 90000.)
>>> df.loc[('Belleville city', 'Overall'), :] = (0.50, 135000.)
>>> df.sort_index()

                         ratio  sr_val_transfer
cousub          year                           
Allen Park city 2013      0.51          75000.0
                2014      0.47          85950.0
                2015      0.47          95030.0
                2016      0.45         102500.0
                Overall   0.50          90000.0
Belleville city 2013      0.49         113900.0
                2014      0.55         114750.0
                2015      0.53         149000.0
                2016      0.48         121500.0
                Overall   0.50         135000.0

Method 2 (larger case)

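>>> overall = pd.DataFrame([['Allen Park city', 'Overall', 0.5, 90000.],
...                         ['Belleville city', 'Overall', 0.5, 135000.]],
...                        columns=list(df.index.names) + list(df.columns))
>>> # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
>>> (pd.concat([df.reset_index(), overall])
...    .set_index(df.index.names)
...    .sort_index())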

                         ratio  sr_val_transfer
cousub          year                           
Allen Park city 2013      0.51          75000.0
                2014      0.47          85950.0
                2015      0.47          95030.0
                2016      0.45         102500.0
                Overall   0.50          90000.0
Belleville city 2013      0.49         113900.0
                2014      0.55         114750.0
                2015      0.53         149000.0
                2016      0.48         121500.0
                Overall   0.50         135000.0