Renaming columns after grouping by two columns

Time: 2019-03-31 11:57:36

Tags: python pandas dataframe data-science

I am trying to rename two columns after a groupby on two columns.

fun = {'Age':{'mean_age':'mean', 'median_age':'median'}}
groupbyClass2 = mc_response.groupby(['Country','GenderSelect']).agg(fun).reset_index()
groupbyClass2.columns = groupbyClass2.columns.droplevel(0) 

The resulting DataFrame looks like this:

                        mean_age    median_age
0   Argentina   Female  33.000000   33.0
1   Argentina   Male    33.294118   32.0
2   Australia   Female  35.000000   34.0
3   Australia   Male    37.158416   36.0

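Now I want to rename the first column to 'Country' and the second column to 'Gender'. I tried the following code, but both columns end up renamed to 'Gender'. How can I fix this?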

groupbyClass2.rename(columns = {groupbyClass2.columns[0]:'Country', groupbyClass2.columns[1]:'Gender'},inplace = True)

2 answers:

Answer 0: (score: 1)

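You can select the column to aggregate right after the groupby, which lets you pass a list of tuples pairing each new column name with its aggregation function. Then use DataFrame.rename_axis to set the names of the MultiIndex levels; these become the column names after reset_index: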

print (mc_response)
     Country GenderSelect  Age
0  Argentina       Female   10
1  Australia         Male   20
2  Australia       Female   30
3  Australia         Male   43

fun = [('mean_age', 'mean'), ('median_age','median')]
groupbyClass2 = (mc_response.groupby(['Country','GenderSelect'])['Age']
                            .agg(fun)
                            .rename_axis(['Country','Gender'])
                            .reset_index())
print (groupbyClass2)
     Country  Gender  mean_age  median_age
0  Argentina  Female      10.0        10.0
1  Australia  Female      30.0        30.0
2  Australia    Male      31.5        31.5
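As a side note, the same result can probably be reached with named aggregation, which is not part of the original answer and is swapped in here only as an alternative sketch. It assumes pandas 0.25 or later and rebuilds the sample mc_response frame shown above:

```python
import pandas as pd

# Sample data copied from the answer above.
mc_response = pd.DataFrame({
    "Country": ["Argentina", "Australia", "Australia", "Australia"],
    "GenderSelect": ["Female", "Male", "Female", "Male"],
    "Age": [10, 20, 30, 43],
})

# Named aggregation: each keyword is the output column name,
# each value is a (source column, aggregation function) pair.
out = (
    mc_response.groupby(["Country", "GenderSelect"], as_index=False)
    .agg(mean_age=("Age", "mean"), median_age=("Age", "median"))
    .rename(columns={"GenderSelect": "Gender"})
)
print(out)
```

Because as_index=False keeps the group keys as ordinary columns, no droplevel or reset_index step is needed, and only GenderSelect has to be renamed.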

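Your own approach renames both columns to 'Gender' because, after droplevel(0), the first two columns share the same empty-string name, so the dict passed to rename ends up with a single key. Instead, assign a new list of column names: the two new names first, followed by the remaining columns selected by slicing: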

df.columns = ['Country','Gender'] + df.columns[2:].tolist()
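A minimal sketch of why the rename in the question collapses, assuming the two group columns are both named with an empty string after droplevel(0). The frame below is a hypothetical stand-in with two rows taken from the question's output:

```python
import pandas as pd

# Hypothetical frame mimicking the question's state after droplevel(0):
# both group columns carry the empty string as their name.
frame = pd.DataFrame(
    [["Argentina", "Female", 33.0, 33.0],
     ["Australia", "Male", 37.158416, 36.0]],
    columns=["", "", "mean_age", "median_age"],
)

# Both keys are "", so the dict literal keeps only the last entry.
mapping = {frame.columns[0]: "Country", frame.columns[1]: "Gender"}
print(mapping)

# rename applies that single entry to every column named "".
renamed = frame.rename(columns=mapping)
print(renamed.columns.tolist())

# Positional assignment does not depend on the old names at all.
frame.columns = ["Country", "Gender"] + frame.columns[2:].tolist()
print(frame.columns.tolist())
```

Assigning the full list works because it replaces labels by position, so duplicate old names do not matter.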

Answer 1: (score: 0)

This might be easier:

# assumes df already has a 'Gender' column (rename 'GenderSelect' first if needed)
df.groupby(['Country','Gender'])['Age'].agg(['mean','median']).add_suffix('_age').reset_index()

Output:

     Country  Gender  mean_age  median_age
0  Argentina  Female      10.0        10.0
1  Australia  Female      30.0        30.0
2  Australia    Male      31.5        31.5