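I have a data frame with the columns Sector, Property_ID, Unit_ID, Unit_Area, Unit_usage and Rent_Unit_Status.
From that data frame, I want to prepare a summary data frame with one row per Sector: the number of properties, the number of units, the total area, and the percentage of area by usage and by rental status.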
Answer 0 (score: 2)
Use GroupBy.agg with a dictionary of aggregation functions; here DataFrameGroupBy.nunique and DataFrameGroupBy.size do the counting:
import pandas as pd

# aggregate sum per 2 columns, Sector and Unit_usage
df1 = df.groupby(['Sector', 'Unit_usage'])['Unit_Area'].sum()
# percentage by division of the total per Sector
# (groupby(level=0).sum() replaces sum(level=0), which pandas 2.0 removed)
df1 = df1.div(df1.groupby(level=0).sum(), level=0).unstack(fill_value=0).mul(100).add_prefix('%_')
# aggregate sum per 2 columns, Sector and Rent_Unit_Status
df2 = df.groupby(['Sector', 'Rent_Unit_Status'])['Unit_Area'].sum()
df2 = df2.div(df2.groupby(level=0).sum(), level=0).unstack(fill_value=0).mul(100).add_prefix('%_')
# per-Sector aggregations: distinct properties, unit count, total area
s = df.groupby('Sector').agg({'Property_ID': 'nunique', 'Unit_ID': 'size', 'Unit_Area': 'sum'})
s = s.rename(columns={'Property_ID': 'No_of_Properties',
                      'Unit_ID': 'No_of_Units',
                      'Unit_Area': 'Total_area'})
# join all together
df = pd.concat([s, df1, df2], axis=1).reset_index()
print(df)
  Sector  No_of_Properties  No_of_Units  Total_area  %_Apartment  %_Resid  \
0    SE1                 2            5         800         12.5     25.0
1    SE2                 2            3        1000         50.0     40.0

   %_Shop  %_Rented  %_Vacant
0    62.5      62.5      37.5
1    10.0      40.0      60.0
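The percentage step is the least obvious part, so here is a minimal sketch of it in isolation. The sample frame below is hypothetical: it is not the asker's original data, only values chosen so that the per-Sector totals agree with the output printed above. The names `area`, `sector_total` and `usage_pct` are likewise illustrative.

```python
import pandas as pd

# Hypothetical sample data (NOT the asker's original frame), chosen so the
# per-Sector totals agree with the printed output above.
df = pd.DataFrame({
    'Sector': ['SE1'] * 5 + ['SE2'] * 3,
    'Property_ID': ['P1', 'P1', 'P2', 'P2', 'P2', 'P3', 'P3', 'P4'],
    'Unit_ID': ['U1', 'U2', 'U3', 'U4', 'U5', 'U6', 'U7', 'U8'],
    'Unit_Area': [100, 200, 250, 150, 100, 500, 400, 100],
    'Unit_usage': ['Apartment', 'Resid', 'Shop', 'Shop', 'Shop',
                   'Apartment', 'Resid', 'Shop'],
    'Rent_Unit_Status': ['Vacant', 'Vacant', 'Rented', 'Rented', 'Rented',
                         'Vacant', 'Rented', 'Vacant'],
})

# Area per (Sector, Unit_usage) pair: a Series with a two-level MultiIndex.
area = df.groupby(['Sector', 'Unit_usage'])['Unit_Area'].sum()

# Total area per Sector, indexed by the first level only.
sector_total = area.groupby(level=0).sum()

# div(..., level=0) matches every (Sector, usage) row with its Sector total;
# unstack then moves the usage labels into columns.
usage_pct = (area.div(sector_total, level=0)
                 .unstack(fill_value=0)
                 .mul(100)
                 .add_prefix('%_'))
print(usage_pct)
```

Because each row is divided by its own Sector total, the percentage columns in every row add up to 100.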
Solution for pandas 0.25+, using named aggregation:
# aggregate sum per 2 columns, Sector and Unit_usage
df1 = df.groupby(['Sector', 'Unit_usage'])['Unit_Area'].sum()
# percentage by division of the total per Sector
# (groupby(level=0).sum() replaces sum(level=0), which pandas 2.0 removed)
df1 = df1.div(df1.groupby(level=0).sum(), level=0).unstack(fill_value=0).mul(100).add_prefix('%_')
# aggregate sum per 2 columns, Sector and Rent_Unit_Status
df2 = df.groupby(['Sector', 'Rent_Unit_Status'])['Unit_Area'].sum()
df2 = df2.div(df2.groupby(level=0).sum(), level=0).unstack(fill_value=0).mul(100).add_prefix('%_')
# named aggregations: output column = (input column, function)
s = df.groupby('Sector').agg(No_of_Properties=('Property_ID', 'nunique'),
                             No_of_Units=('Unit_ID', 'size'),
                             Total_area=('Unit_Area', 'sum'))
# join all together
df = pd.concat([s, df1, df2], axis=1).reset_index()
print(df)
  Sector  No_of_Properties  No_of_Units  Total_area  %_Apartment  %_Resid  \
0    SE1                 2            5         800         12.5     25.0
1    SE2                 2            3        1000         50.0     40.0

   %_Shop  %_Rented  %_Vacant
0    62.5      62.5      37.5
1    10.0      40.0      60.0
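The two spellings of the aggregation step should give the same table. The sketch below compares them on a small hypothetical frame (`toy`, `by_dict` and `by_name` are made-up names for illustration, not part of the original answer).

```python
import pandas as pd

# Hypothetical toy data, used only to compare the two spellings.
toy = pd.DataFrame({
    'Sector': ['SE1', 'SE1', 'SE2'],
    'Property_ID': ['P1', 'P1', 'P2'],
    'Unit_ID': ['U1', 'U2', 'U3'],
    'Unit_Area': [100, 200, 300],
})

# Dictionary form: aggregate first, then rename the columns.
by_dict = (toy.groupby('Sector')
              .agg({'Property_ID': 'nunique', 'Unit_ID': 'size', 'Unit_Area': 'sum'})
              .rename(columns={'Property_ID': 'No_of_Properties',
                               'Unit_ID': 'No_of_Units',
                               'Unit_Area': 'Total_area'}))

# Named aggregation (pandas 0.25+): the keyword is the output column name,
# the tuple is (input column, aggregation function).
by_name = toy.groupby('Sector').agg(
    No_of_Properties=('Property_ID', 'nunique'),
    No_of_Units=('Unit_ID', 'size'),
    Total_area=('Unit_Area', 'sum'),
)

print(by_name)
```

Named aggregation mainly saves the separate rename step and keeps each output name next to the function that produces it.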
Answer 1 (score: 0)
Update: the percentages are now computed as a share of the total area.
You can use DataFrame.groupby(...).apply for this.
import pandas as pd

def summarise(group):
    # explicit dtype avoids the empty-Series default-dtype warning in pandas 1.x
    output = pd.Series(dtype='float64')
    area = group['Unit_Area']
    output['No_of_Properties'] = group['Property_ID'].nunique()
    output['No_of_Units'] = group['Unit_ID'].size
    output['Total_area'] = area.sum()
    # share of the Sector's total area, in percent
    output['%_Rented'] = area.loc[group['Rent_Unit_Status'] == 'Rented'].sum() / output['Total_area'] * 100
    output['%_Shop'] = area.loc[group['Unit_usage'] == 'Shop'].sum() / output['Total_area'] * 100
    output['%_Apartment'] = area.loc[group['Unit_usage'] == 'Apartment'].sum() / output['Total_area'] * 100
    return output

print(df.groupby('Sector').apply(summarise))
Output:
        No_of_Properties  No_of_Units  Total_area  %_Rented  %_Shop  \
Sector
SE1                  2.0          5.0       800.0      62.5    62.5
SE2                  2.0          3.0      1000.0      40.0    10.0

        %_Apartment
Sector
SE1            12.5
SE2            50.0
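A variant of the same idea, sketched under assumptions: the repeated mask-and-sum expression is pulled into a helper, and the row is built from a dict in one step instead of growing an empty Series. The helper name `area_share`, the frame `toy` and its values are hypothetical and not part of the original answer.

```python
import pandas as pd

# Hypothetical toy data with the same columns as in the question.
toy = pd.DataFrame({
    'Sector': ['A', 'A', 'B', 'B'],
    'Property_ID': ['P1', 'P1', 'P2', 'P3'],
    'Unit_ID': ['U1', 'U2', 'U3', 'U4'],
    'Unit_Area': [100, 300, 200, 200],
    'Unit_usage': ['Shop', 'Apartment', 'Shop', 'Resid'],
    'Rent_Unit_Status': ['Rented', 'Vacant', 'Rented', 'Rented'],
})

def area_share(group, column, label):
    """Percentage of the group's area in rows where `column` equals `label`."""
    mask = group[column] == label
    return group.loc[mask, 'Unit_Area'].sum() / group['Unit_Area'].sum() * 100

def summarise(group):
    # Build the whole summary row at once from a dict.
    return pd.Series({
        'No_of_Properties': group['Property_ID'].nunique(),
        'No_of_Units': len(group),
        'Total_area': group['Unit_Area'].sum(),
        '%_Rented': area_share(group, 'Rent_Unit_Status', 'Rented'),
        '%_Shop': area_share(group, 'Unit_usage', 'Shop'),
        '%_Apartment': area_share(group, 'Unit_usage', 'Apartment'),
    }, dtype='float64')

# Selecting the value columns explicitly keeps the grouping column out of
# the frames passed to summarise.
cols = ['Property_ID', 'Unit_ID', 'Unit_Area', 'Unit_usage', 'Rent_Unit_Status']
result = toy.groupby('Sector')[cols].apply(summarise)
print(result)
```

As in the original function, the category labels are listed by hand, so any label not named in `summarise` (for example Resid or Vacant) gets no column.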