I have a DataFrame that looks like this:
SK_ID_CURR CREDIT_ACTIVE
0 215354 Closed
1 215354 Active
2 215354 Active
3 215354 Active
4 215354 Active
5 215354 Active
6 215354 Active
7 162297 Closed
8 162297 Closed
9 162297 Active
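I want to count the number of active and closed credits for each ID, then create two new columns, Active_credits and Closed_credits, that hold the number of active and closed credits for the corresponding ID.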
Answer 0: (score: 2)
You can use pandas.crosstab, which avoids the suggested intermediate step:
res = pd.crosstab(df['SK_ID_CURR'], df['CREDIT_ACTIVE'])
print(res)
CREDIT_ACTIVE Active Closed
SK_ID_CURR
162297 1 2
215354 6 1
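The question asks for columns named Active_credits and Closed_credits, so here is a minimal sketch of one way to get from the crosstab result to that shape. The sample frame is rebuilt from the rows shown in the question; the names counts and out are only illustrative, and joining the counts back onto every row is an assumption about what "create new columns" means here.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "SK_ID_CURR": [215354] * 7 + [162297] * 3,
    "CREDIT_ACTIVE": ["Closed"] + ["Active"] * 6 + ["Closed", "Closed", "Active"],
})

# One row per ID, one column per credit status.
counts = pd.crosstab(df["SK_ID_CURR"], df["CREDIT_ACTIVE"])

# Rename the status columns to the names requested in the question.
counts = counts.rename(columns={"Active": "Active_credits",
                                "Closed": "Closed_credits"})
counts.columns.name = None  # drop the leftover "CREDIT_ACTIVE" axis label

# Assumption: the counts should be repeated on every row of the original frame.
out = df.join(counts, on="SK_ID_CURR")
print(out)
```

If one row per ID is enough, counts.reset_index() gives that directly and the join can be skipped.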
Answer 1: (score: 1)
You can use pd.DataFrame.groupby:
df.groupby(['SK_ID_CURR','CREDIT_ACTIVE']).size()
Output:
SK_ID_CURR CREDIT_ACTIVE
162297 Active 1
Closed 2
215354 Active 6
Closed 1
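The groupby result above is a Series with a two-level index rather than the two columns the question asks for. As a rough sketch, unstack can pivot the inner level into columns; the frame is called df here and is rebuilt from the question's rows, and the names sizes and wide are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "SK_ID_CURR": [215354] * 7 + [162297] * 3,
    "CREDIT_ACTIVE": ["Closed"] + ["Active"] * 6 + ["Closed", "Closed", "Active"],
})

# Count rows per (ID, status) pair; the result is a Series with a MultiIndex.
sizes = df.groupby(["SK_ID_CURR", "CREDIT_ACTIVE"]).size()

# Move the status level into columns; fill_value=0 covers IDs that lack a status.
wide = sizes.unstack(fill_value=0).add_suffix("_credits")
print(wide)
```

The fill_value=0 argument matters only when some ID has no rows for one of the statuses; without it those cells would be NaN and the counts would become floats.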