Pandas: how do I get a group count that fills every row of that group?

Time: 2017-05-06 15:46:04

Tags: python pandas

I can successfully populate my new column with the group count, but I suspect there is a simpler way:

# How do I simplify this?
def f(gr):
    return pd.Series([gr['class_name'].count()] * gr.shape[0], index=gr.index)

df['class_size'] = df.groupby("class_name").apply(f).reset_index(level=0, drop=True)
column_list = ['class_name', 'class_size']
df[column_list].head(5)

Result:

This is just the first few rows of data - see how the same class name has the same class count?

2 answers:

Answer 0 (score: 1)

I think you need transform, which returns a result aligned to the original rows rather than one row per group:

df['class_size'] = df.groupby('class_name')['class_name'].transform('size')

Or:

df['class_size'] = df.groupby('class_name')['class_name'].transform('count')

What is the difference between size and count in pandas?
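As a minimal sketch of that difference: size counts every row in the group, while count skips missing values. The DataFrame and the score column below are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: class "a" has one missing score, class "b" has one too.
df = pd.DataFrame({
    "class_name": ["a", "a", "b", "b", "b"],
    "score": [1.0, np.nan, 3.0, 4.0, np.nan],
})

# 'size' counts all rows per group, including rows where score is NaN.
df["class_size"] = df.groupby("class_name")["score"].transform("size")

# 'count' counts only the non-NaN scores per group.
df["score_count"] = df.groupby("class_name")["score"].transform("count")

print(df)
```

When the counted column is the grouping key itself, as in the two lines above, the results only differ if the key contains missing values, since groupby drops those rows by default.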

Answer 1 (score: 0)

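Depending on the shape of your DataFrame, you can also just count on the groupby directly. Note that this returns one row per group rather than one value per original row: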

import pandas as pd
df = pd.DataFrame({'class names':list('abracadabra'),'class count':1})
df.groupby('class names').count().reset_index()
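If the per-group counts are needed on every original row, as the question asks, one possible way is to map them back through the key column. This is a sketch under assumptions: the column name class_name is borrowed from the question, and the data is the same toy string used in this answer.

```python
import pandas as pd

# Toy data: one row per letter of "abracadabra".
df = pd.DataFrame({"class_name": list("abracadabra")})

# One count per group, as a Series indexed by class_name.
counts = df.groupby("class_name").size()

# Look up each row's class_name in that Series to broadcast the count back.
df["class_size"] = df["class_name"].map(counts)

print(df)
```

This gives the same column as the transform approach in the other answer; transform simply does the grouping and the broadcast in one step.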