Pandas groupby sum and divide

Date: 2019-12-30 15:59:22

Tags: pandas pandas-groupby

I have a DataFrame that looks like this:

Place       Occupancy     Number
Bangalore   Occupied      80    
Bangalore   Vacant       20
Chennai     Occupied      90
Chennai     Vacant       60
Delhi       Occupied      20
Delhi       Vacant       20

I am trying to produce the following:

Place         Occupancy_%     Total_Number   Number_vacant  Number_occupied
Bangalore     80              100            20             80
Chennai       60              150            60             90
Delhi         50              40             20             20

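3 answers:

Answer 0 (score: 2)

You can pivot the data so each occupancy status becomes a column, then compute the total and the percentage: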

new_df = df.pivot(index='Place', columns='Occupancy', values='Number')
new_df['Total_Number'] = new_df.sum(axis=1)  # row-wise total of Occupied + Vacant
new_df['Occupancy_%'] = new_df['Occupied'] / new_df['Total_Number'] * 100

Output:

Occupancy  Occupied  Vacant  Total_Number  Occupancy_%
Place                                                 
Bangalore        80      20           100         80.0
Chennai          90      60           150         60.0
Delhi            20      20            40         50.0
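
A self-contained sketch of the same idea, rebuilt from the question's sample data. It swaps `pivot` for `pivot_table` with `aggfunc="sum"`, on the assumption that real data might contain more than one row per (Place, Occupancy) pair, a case in which `pivot` raises an error. The variable name `wide` is illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Place": ["Bangalore", "Bangalore", "Chennai", "Chennai", "Delhi", "Delhi"],
    "Occupancy": ["Occupied", "Vacant"] * 3,
    "Number": [80, 20, 90, 60, 20, 20],
})

# pivot_table sums duplicate (Place, Occupancy) pairs instead of raising.
wide = df.pivot_table(index="Place", columns="Occupancy", values="Number",
                      aggfunc="sum", fill_value=0)

# Sum only the two status columns so the total is not affected by later columns.
wide["Total_Number"] = wide[["Occupied", "Vacant"]].sum(axis=1)
wide["Occupancy_%"] = wide["Occupied"] / wide["Total_Number"] * 100

print(wide)
```

With the sample data this should give the same numbers as the output above.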

Answer 1 (score: 2)

Let's do it in one statement with crosstab and assign:

df=pd.crosstab(index=df.Place,
               columns=df.Occupancy,
               values=df.Number,
               aggfunc='sum',
               margins = True,
               margins_name ='Total_number' ).drop('Total_number').\
       assign(Occupancy=lambda  x : x['Occupied']*100/x['Total_number'] )
Out[128]: 
Occupancy  Occupied  Vacant  Total_number  Occupancy
Place                                               
Bangalore        80      20           100       80.0
Chennai          90      60           150       60.0
Delhi            20      20            40       50.0
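
Since the question is tagged pandas-groupby, here is a rough equivalent using `groupby` and `unstack` instead of `crosstab`. This is an alternative sketch, not the answerer's code; the name `wide` and the column name `Occupancy_%` are chosen for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Place": ["Bangalore", "Bangalore", "Chennai", "Chennai", "Delhi", "Delhi"],
    "Occupancy": ["Occupied", "Vacant"] * 3,
    "Number": [80, 20, 90, 60, 20, 20],
})

# Sum Number per (Place, Occupancy), then move Occupancy into the columns.
wide = df.groupby(["Place", "Occupancy"])["Number"].sum().unstack(fill_value=0)

# At this point the only columns are Occupied and Vacant, so a row sum is the total.
wide["Total_number"] = wide.sum(axis=1)
wide["Occupancy_%"] = wide["Occupied"] * 100 / wide["Total_number"]

print(wide)
```

Unlike the `crosstab` version, no margins row has to be dropped afterwards, because the total is computed per row directly.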

Answer 2 (score: 1)

I would do this in two steps. First, build the pivot table:

df_pivot = df.pivot(index='Place',columns='Occupancy',values='Number')

Occupancy  Occupied  Vacant
Place                      
Bangalore        80      20
Chennai          90      60
Delhi            20      20

Second, compute the total and the share of each status (as fractions between 0 and 1, not percentages):

df_pivot['Total_Number'] = df_pivot[['Occupied','Vacant']].sum(axis=1)
df_pivot['Occupied_Pct'] = df_pivot['Occupied'] / df_pivot['Total_Number'] 
df_pivot['Vacant_Pct'] = df_pivot['Vacant'] / df_pivot['Total_Number'] 

Occupancy  Occupied  Vacant  Total_Number  Occupied_Pct  Vacant_Pct
Place                                                              
Bangalore        80      20           100           0.8         0.2
Chennai          90      60           150           0.6         0.4
Delhi            20      20            40           0.5         0.5
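
None of the answers reproduces the exact column names and order asked for in the question. A possible finishing step, sketched under the assumption that the layout `Place, Occupancy_%, Total_Number, Number_vacant, Number_occupied` is what is wanted; the name `result` is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Place": ["Bangalore", "Bangalore", "Chennai", "Chennai", "Delhi", "Delhi"],
    "Occupancy": ["Occupied", "Vacant"] * 3,
    "Number": [80, 20, 90, 60, 20, 20],
})

df_pivot = df.pivot(index="Place", columns="Occupancy", values="Number")
df_pivot["Total_Number"] = df_pivot[["Occupied", "Vacant"]].sum(axis=1)

# Rename the status columns to the names used in the question.
result = df_pivot.rename(columns={"Occupied": "Number_occupied",
                                  "Vacant": "Number_vacant"})
result["Occupancy_%"] = result["Number_occupied"] / result["Total_Number"] * 100

# Drop the leftover "Occupancy" label on the column index, then make Place a column.
result.columns.name = None
result = result.reset_index()[
    ["Place", "Occupancy_%", "Total_Number", "Number_vacant", "Number_occupied"]
]

print(result)
```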