Question

I have done a pandas groupby:

grouped = df.groupby(['name','type'])['count'].count().reset_index()

The result looks like this:

name  type    count
x     a       32
x     b       1111
x     c       4214

What I need to do is compute percentages, so that I end up with something like this (I realise these percentages are not correct):

name  type  count
x     a     1%
x     b     49%
x     c     50%

I can think of some pseudocode that might work, but I haven't managed to get anything that actually runs...

Something like:

def getPercentage(df):
    for name in df: 
        total = 0
        where df['name'] = name:
            total = total + df['count'] 
            type_percent = (df['type'] / total) * 100
            return type_percent

df.apply(getPercentage)

Is there a good way to do this in pandas?

Answer 1

Try:

<Text />
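For the per-name percentages the question asks for, one common pattern is to divide each count by its group total using groupby(...).transform('sum'). The following is a minimal sketch of that pattern, not necessarily the code this answer originally contained. The rows for name "y" and the column name "percent" are made up for illustration.

```python
import pandas as pd

# Counts shaped like the question's `grouped` frame; the rows for name "y"
# are invented so that more than one group is present.
grouped = pd.DataFrame({
    "name": ["x", "x", "x", "y", "y"],
    "type": ["a", "b", "c", "a", "b"],
    "count": [32, 1111, 4214, 30, 10],
})

# transform('sum') returns each group's total aligned with the original rows,
# so every count is divided by the total of its own name.
group_total = grouped.groupby("name")["count"].transform("sum")
grouped["percent"] = grouped["count"] / group_total * 100

print(grouped)
```

Because transform keeps the original row order and length, the result can be assigned straight back as a new column without a merge.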

Answer 2

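You can normalize any Series by passing the argument "normalize=True" to value_counts, as shown below (this is cleaner than dividing by the count yourself):

Series.value_counts(normalize=True, sort=True, ascending=False)

So it would look something like this (note that the result is a Series, not a DataFrame):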

df['type'].value_counts(normalize=True) * 100
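The same normalize flag is also available on a grouped Series, which gives the share of each type within each name rather than over the whole column. A short sketch, assuming a raw (un-aggregated) frame df with one row per observation; the data here is invented for illustration:

```python
import pandas as pd

# Hypothetical raw data: one row per observation, before any aggregation.
df = pd.DataFrame({
    "name": ["x", "x", "x", "x", "y", "y"],
    "type": ["a", "b", "b", "b", "a", "c"],
})

# Share of each type over the whole column, as a percentage.
overall = df["type"].value_counts(normalize=True) * 100

# Share of each type within each name; the result is a Series
# indexed by (name, type).
per_name = df.groupby("name")["type"].value_counts(normalize=True) * 100

print(overall)
print(per_name)
```

Calling reset_index on per_name would turn it back into a DataFrame if a tabular layout is preferred.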

Or, if you are using groupby, you can simply do:

total = grouped['count'].sum()
grouped['count'] = grouped['count']/total * 100
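As a worked example, here is that calculation applied to the three rows from the question. Note that it divides by the grand total, so it matches the per-name percentages only when the frame contains a single name, as it does here. The rounding and the "%" suffix are an optional extra to mimic the layout the question asked for.

```python
import pandas as pd

# The three rows shown in the question.
grouped = pd.DataFrame({
    "name": ["x", "x", "x"],
    "type": ["a", "b", "c"],
    "count": [32, 1111, 4214],
})

total = grouped["count"].sum()  # 32 + 1111 + 4214 = 5357
pct = grouped["count"] / total * 100

# Optional: round to one decimal and format as strings such as "20.7%".
grouped["count"] = pct.round(1).astype(str) + "%"

print(grouped)
```

Formatting as strings is best left to the very end, since the column can no longer be used for arithmetic afterwards.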

Answer 3

Use crosstab + normalize
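pandas.crosstab accepts a normalize argument, so the counting and the percentage step can be done in one call. A minimal sketch, assuming a raw frame df with one row per observation (the data is invented for illustration):

```python
import pandas as pd

# Hypothetical raw data, one row per observation.
df = pd.DataFrame({
    "name": ["x", "x", "x", "x", "y", "y"],
    "type": ["a", "b", "b", "b", "a", "c"],
})

# normalize='index' divides each row of the table by its row total,
# so every name's percentages sum to 100.
table = pd.crosstab(df["name"], df["type"], normalize="index") * 100

print(table)
```

The result is a wide table with one row per name and one column per type; table.stack() would give the long (name, type) layout used in the question.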

Pandas: get the percentage value of a groupby
