I want to group by Account Id, count the values within each group, and get each count as a new column. How can I do this in pandas?
For example:
Account Id  Values
1           Open
2           Closed
1           Open
3           Closed
2           Open
The output must be:
Account Id  Open  Closed
1           2     0
2           1     1
3           0     1
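Answer 0 (score: 0)
Use groupby and value_counts to get the initial counts you want. Then unstack the MultiIndex to get a DataFrame, and set the null values to 0 to get the final result: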
import pandas as pd
# Defining DataFrame
df = pd.DataFrame(index=range(5))
df['Account Id'] = [1, 2, 1, 3, 2]
df['Values'] = ['Open', 'Closed', 'Open', 'Closed', 'Open']
grouped = df.groupby('Account Id')['Values'].value_counts()
# Remove the multiindex present
grouped = grouped.unstack()
# Set null values to 0 and cast the counts back to integers
# (unstack introduces NaN, which turns the columns into floats)
result = grouped.fillna(0).astype(int)
Resulting output:
            Closed  Open
Account Id
1                0     2
2                1     1
3                1     0
(Sorry, I'm not sure how to represent a DataFrame properly.)
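For what it's worth, the same steps can probably be condensed into a single chain. This is only a sketch, assuming a pandas version where `unstack` accepts `fill_value`; the names `counts` and `result` are just illustrative. It also reorders the columns and turns `Account Id` back into a regular column so the layout matches the one asked for in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Account Id": [1, 2, 1, 3, 2],
    "Values": ["Open", "Closed", "Open", "Closed", "Open"],
})

# Count each value per account, then move the inner index level into columns.
# fill_value=0 fills the combinations that never occur, so no NaN step is needed.
counts = df.groupby("Account Id")["Values"].value_counts().unstack(fill_value=0)

# Optional: use the column order from the question and make
# "Account Id" a regular column instead of the index.
result = counts[["Open", "Closed"]].reset_index()
print(result)
```

If the set of values is not known in advance, dropping the `[["Open", "Closed"]]` selection keeps whatever columns `unstack` produces.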
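Answer 1 (score: 0)
This also returns a DataFrame from the groupby object, in long form with one row per (Account Id, Values) pair that occurs: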
grouped_df = df.groupby(["Account Id","Values"])
grouped_df.size().reset_index(name = "Count")
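Note that the long form only lists combinations that actually occur, so account 1 gets no "Closed" row at all. If the wide layout from the question is needed, one possible follow-up (a sketch, not necessarily what the answerer had in mind; `long_counts` and `wide_counts` are made-up names) is to pivot the counts and fill the gaps with 0:

```python
import pandas as pd

df = pd.DataFrame({
    "Account Id": [1, 2, 1, 3, 2],
    "Values": ["Open", "Closed", "Open", "Closed", "Open"],
})

# Long form: one row per (Account Id, Values) pair that actually occurs
long_counts = df.groupby(["Account Id", "Values"]).size().reset_index(name="Count")

# Wide form: one column per value; pairs that never occur come out as NaN,
# so fill them with 0 and cast the counts back to integers
wide_counts = (
    long_counts
    .pivot(index="Account Id", columns="Values", values="Count")
    .fillna(0)
    .astype(int)
)
print(wide_counts)
```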