Percent-of-total column after groupby

Time: 2019-07-18 18:32:42

Tags: pandas pandas-groupby

I'm trying to summarize a pandas DataFrame and compute a "percent of total" column from the groupby result of the original df.

Original df:

        Shape_Area                       LU
0  91254232.781776          Fallow Cropland
1    522096.071094  Mixed Wetland Hardwoods
2     87795.467187  Mixed Wetland Hardwoods
3       440.528367  Mixed Wetland Hardwoods
4    778952.154436         Dikes and Levees

Groupby result:

                              Shape_Area
LU                                      
Dikes and Levees           778952.154436
Fallow Cropland          91254232.781776
Mixed Wetland Hardwoods    610332.066649

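I want to add an extra "PCT of total" column for each LU type. I'm not sure whether I'm accessing the groupby result correctly, and I may not understand what it is (a Series?).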

import pandas as pd

# narr holds the raw records (its definition is not shown here)
df = pd.DataFrame(narr, columns=['LU','Shape_Area'])
df = df.groupby(['LU'])[['Shape_Area']].sum()

# print the grouped result shown above
print(df)

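1 Answer:

Answer 0: (score: 1)

You can simply compute the sum of the Shape_Area Series (which returns a scalar) and then divide each row of Shape_Area in the grouped DataFrame by that value.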

grouped = df.groupby(['LU'])[['Shape_Area']].sum()
grouped['pct'] = grouped['Shape_Area'] / grouped['Shape_Area'].sum()
                           Shape_Area       pct
LU                                             
Dikes and Levees         7.789522e+05  0.008408
Fallow Cropland          9.125423e+07  0.985004
Mixed Wetland Hardwoods  6.103321e+05  0.006588
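
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the five sample rows from the question stand in for the full `narr` data, and the `pct_of_total` column name is only an illustrative choice. It also checks the type of the groupby result: selecting with double brackets (`[['Shape_Area']]`) should give a DataFrame rather than a Series.

```python
import pandas as pd

# Sample rows copied from the question; the real data came from `narr`,
# which is not shown, so this is only a stand-in.
df = pd.DataFrame({
    "Shape_Area": [91254232.781776, 522096.071094, 87795.467187,
                   440.528367, 778952.154436],
    "LU": ["Fallow Cropland", "Mixed Wetland Hardwoods",
           "Mixed Wetland Hardwoods", "Mixed Wetland Hardwoods",
           "Dikes and Levees"],
})

# Double brackets select a one-column DataFrame, so the aggregated
# result is a DataFrame indexed by LU (single brackets would give a Series).
grouped = df.groupby("LU")[["Shape_Area"]].sum()
print(type(grouped))

# Summing the Shape_Area column yields a scalar total.
total = grouped["Shape_Area"].sum()

# Fraction of the total for each LU type, as in the answer above.
grouped["pct"] = grouped["Shape_Area"] / total

# Same value expressed as a percentage (hypothetical column name).
grouped["pct_of_total"] = grouped["pct"] * 100

print(grouped)
```

The `pct` values are fractions that sum to 1. Multiply by 100, as in the last column, if you want percentages.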