Calculating the average of a non-numeric column in pandas

Time: 2020-07-30 07:59:48

Tags: pandas

I have a DataFrame "data" like the one below:

Name    Quality city
Tom     High    A
nick    Medium  B
krish   Low     A
Jack    High    A
Kevin   High    B
Phil    Medium  B

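I want to group it by city, create new columns from the values in the "Quality" column, and calculate the averages (each value's percentage share within the city) as shown below: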

city  High  Medium  Low  High_Avg  Medium_AVG  Low_avg
A     2     0       1    66.66     0           33.33
B     1     1       0    50        50          0

I tried the following script, but I know it is completely wrong:

data_average = data_df.groupby(['city'], as_index=False).count()

1 answer:

Answer 0 (score: 1):

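Get the frequency counts, divide each row by its row total (the sum across the columns), and finally concatenate the two DataFrames into one: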

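import pandas as pd

# df is the DataFrame from the question
# count how often each Quality value occurs in each city
result = pd.crosstab(df.city, df.Quality)
# divide each row by its total and scale to percentages
averages = result.div(result.sum(axis=1), axis=0).mul(100).round(2).add_suffix("_Avg")
# combine the dataframes
pd.concat((result, averages), axis=1)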

Quality  High  Low  Medium  High_Avg  Low_Avg  Medium_Avg
city
A           2    1       0     66.67    33.33        0.00
B           1    0       2     33.33     0.00       66.67
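
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It makes two assumptions that are not in the answer above: the DataFrame is rebuilt by hand from the table in the question, and the explicit division is swapped for `pd.crosstab`'s `normalize="index"` option, which should give the same row-wise shares.

```python
import pandas as pd

# Sample data typed in by hand from the question's table (an assumption;
# the real "data" DataFrame may have more rows or columns).
df = pd.DataFrame({
    "Name": ["Tom", "nick", "krish", "Jack", "Kevin", "Phil"],
    "Quality": ["High", "Medium", "Low", "High", "High", "Medium"],
    "city": ["A", "B", "A", "A", "B", "B"],
})

# Number of rows for each Quality value within each city.
counts = pd.crosstab(df["city"], df["Quality"])

# normalize="index" scales each row so it sums to 1;
# multiplying by 100 turns those shares into percentages.
percentages = (
    pd.crosstab(df["city"], df["Quality"], normalize="index")
    .mul(100)
    .round(2)
    .add_suffix("_Avg")
)

# Put the counts and the percentages side by side.
result = pd.concat([counts, percentages], axis=1)
print(result)
```

The columns come out in alphabetical order (High, Low, Medium). If the order in the question matters, something like `counts[["High", "Medium", "Low"]]` before concatenating should do it.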