Renaming two columns after a groupby on two columns
fun = {'Age':{'mean_age':'mean', 'median_age':'median'}}
groupbyClass2 = mc_response.groupby(['Country','GenderSelect']).agg(fun).reset_index()
groupbyClass2.columns = groupbyClass2.columns.droplevel(0)
The resulting DataFrame looks like this:
mean_age median_age
0 Argentina Female 33.000000 33.0
1 Argentina Male 33.294118 32.0
2 Australia Female 35.000000 34.0
3 Australia Male 37.158416 36.0
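Now I want to rename the first column to 'Country' and the second column to 'Gender'. I tried the code below, but both columns end up renamed to 'Gender'. How can I fix this?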
groupbyClass2.rename(columns = {groupbyClass2.columns[0]:'Country', groupbyClass2.columns[1]:'Gender'},inplace = True)
Answer 0 (score: 1)
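You can select the column to aggregate right after groupby, which lets you pass agg a list of (new_column_name, aggregate_function) tuples to name the output columns. Then add DataFrame.rename_axis to rename the MultiIndex levels, which become the column names after reset_index: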
print (mc_response)
Country GenderSelect Age
0 Argentina Female 10
1 Australia Male 20
2 Australia Female 30
3 Australia Male 43
fun = [('mean_age', 'mean'), ('median_age','median')]
groupbyClass2 = (mc_response.groupby(['Country','GenderSelect'])['Age']
.agg(fun)
.rename_axis(['Country','Gender'])
.reset_index())
print (groupbyClass2)
Country Gender mean_age median_age
0 Argentina Female 10.0 10.0
1 Australia Female 30.0 30.0
2 Australia Male 31.5 31.5
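As a variation on the tuple list above, here is a minimal sketch using keyword-style "named aggregation", assuming pandas 0.25 or later. The sample data is copied from the answer. The dict-of-dicts form used in the question is, as far as I know, no longer accepted by recent pandas releases, so this spelling may be the safer one on a current install.

```python
import pandas as pd

# Sample data copied from the answer above.
mc_response = pd.DataFrame({
    "Country": ["Argentina", "Australia", "Australia", "Australia"],
    "GenderSelect": ["Female", "Male", "Female", "Male"],
    "Age": [10, 20, 30, 43],
})

result = (
    mc_response.groupby(["Country", "GenderSelect"])["Age"]
    # Each keyword is the output column name, each value the function to apply.
    .agg(mean_age="mean", median_age="median")
    # Rename the index levels so reset_index yields the desired column names.
    .rename_axis(["Country", "Gender"])
    .reset_index()
)
print(result)
```

The result should have the same four columns as the output shown above: Country, Gender, mean_age and median_age.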
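Your rename call fails because, after droplevel(0), the first two columns both have the empty string '' as their label, so the dict passed to rename contains two identical keys and only the last one ('Gender') survives. Instead, assign a new list of column names: the two new names first, followed by the remaining columns selected by position and converted to a list: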
df.columns = ['Country','Gender'] + df.columns[2:].tolist()
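To make the failure mode concrete, here is a minimal sketch. It assumes a small hand-built frame whose labels mimic what droplevel(0) leaves behind (two empty-string labels followed by the aggregate names). The values are taken from the question's output and are illustrative only.

```python
import pandas as pd

# Hypothetical frame mimicking the labels left after droplevel(0):
# both group-key columns carry the empty-string label ''.
df = pd.DataFrame(
    [["Argentina", "Female", 33.0, 33.0],
     ["Argentina", "Male", 33.294118, 32.0]],
    columns=["", "", "mean_age", "median_age"],
)

# Both keys are '', so the dict literal keeps only the last entry.
mapping = {df.columns[0]: "Country", df.columns[1]: "Gender"}
print(mapping)

# rename is label-based, so every '' label is mapped to 'Gender'.
renamed = df.rename(columns=mapping)
print(renamed.columns.tolist())

# Assigning a full list is positional, so duplicate labels are not a problem.
df.columns = ["Country", "Gender"] + df.columns[2:].tolist()
print(df.columns.tolist())
```

Because the list assignment works by position rather than by label, it does not care that the two old labels were identical.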
Answer 1 (score: 0)
This may be easier:
# Assumes df already has a 'Gender' column; string function names avoid the numpy import
df.groupby(['Country','Gender'])['Age'].agg(['mean','median']).add_suffix('_age').reset_index()
Output:
Country Gender mean_age median_age
0 Argentina Female 10.0 10.0
1 Australia Female 30.0 30.0
2 Australia Male 31.5 31.5
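The one-liner in this answer groups on a 'Gender' column, while the sample frame in the first answer calls it 'GenderSelect'. Here is a minimal runnable sketch, assuming that sample data and a rename step added here for illustration:

```python
import pandas as pd

# Sample data copied from the first answer; the rename step is an added assumption.
mc_response = pd.DataFrame({
    "Country": ["Argentina", "Australia", "Australia", "Australia"],
    "GenderSelect": ["Female", "Male", "Female", "Male"],
    "Age": [10, 20, 30, 43],
})
df = mc_response.rename(columns={"GenderSelect": "Gender"})

out = (
    df.groupby(["Country", "Gender"])["Age"]
    .agg(["mean", "median"])   # output columns are named 'mean' and 'median'
    .add_suffix("_age")        # becomes 'mean_age' and 'median_age'
    .reset_index()
)
print(out)
```

Renaming the key column before grouping means the index levels already carry the final names, so no rename_axis call is needed.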