I have a DataFrame:
recurring_credits =
    narration  values     month_year
4    bupi upi    True      Sept-2019
5    bupi upi   False       Oct-2019
9    bupi upi   False  December-2019
11  merv visi    True  December-2019
12   neft pad    True  December-2019
17   bupi upi   False  December-2019
22   bupi upi   False       Oct-2019
27   bupi upi   False  December-2019
31   bupi upi   False  December-2019
32   bupi upi    True      Sept-2019
36   neft pad    True      Sept-2019
40   bupi upi   False  December-2019
44   bupi upi   False  December-2019
48   bupi upi   False  December-2019
49   bupi upi   False  December-2019
51   bupi upi   False  December-2019
53   imps bok    True  December-2019
58   imps bok    True  December-2019
60   bupi upi   False  December-2019
67   neft pad    True   January-2020
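I need to add a column holding the number of transactions per month for each unique narration, counting only the rows where values is True; all other rows should get zero. My output DataFrame should look like this: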
df_out =
    narration  values     month_year  tran/month
4    bupi upi    True      Sept-2019           2
5    bupi upi   False       Oct-2019           0
9    bupi upi   False  December-2019           0
11  merv visi    True  December-2019           1
12   neft pad    True  December-2019           1
17   bupi upi   False  December-2019           0
22   bupi upi   False       Oct-2019           0
27   bupi upi   False  December-2019           0
31   bupi upi   False  December-2019           0
32   bupi upi    True      Sept-2019           2
36   neft pad    True      Sept-2019           1
40   bupi upi   False  December-2019           0
44   bupi upi   False  December-2019           0
48   bupi upi   False  December-2019           0
49   bupi upi   False  December-2019           0
51   bupi upi   False  December-2019           0
53   imps bok    True  December-2019           2
58   imps bok    True  December-2019           2
60   bupi upi   False  December-2019           0
67   neft pad    True   January-2020           1
I tried the following, but could not get the correct output:
unique_narration = list(recurring_credits['narration'].unique())
for narration in unique_narration:
    d = recurring_credits.loc[(recurring_credits['narration']==narration)&(recurring_credits['values']==True)]
    rec_pat = d.groupby('month_year', as_index=True).agg({'narration':'nunique'}).reset_index()
    rec_pat.columns = ['month_year','recurrance_number']
    recurring_credits['recurrance_pattern']=np.nan
    for i,j in zip(rec_pat.transaction_month_year,rec_pat.recurrance_number):
        recurring_credits['recurrance_pattern'].loc[(recurring_credits['narration']==narration)&(recurring_credits['month_year']==i)&(recurring_credits['values']==True)]=j
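Answer 0 (score: 2)
You need to replace narration with NaN in the rows where values is False, using Series.where, then use GroupBy.transform with count for the new column, which counts the non-missing values: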
s = (df.assign(new = df['narration'].where(df['values']))
       .groupby(['month_year','narration'])['new']
       .transform('count'))
df['tran/month'] = s
print(df)
    narration  values     month_year  tran/month
4    bupi upi    True      Sept-2019           2
5    bupi upi   False       Oct-2019           0
9    bupi upi   False  December-2019           0
11  merv visi    True  December-2019           1
12   neft pad    True  December-2019           1
17   bupi upi   False  December-2019           0
22   bupi upi   False       Oct-2019           0
27   bupi upi   False  December-2019           0
31   bupi upi   False  December-2019           0
32   bupi upi    True      Sept-2019           2
36   neft pad    True      Sept-2019           1
40   bupi upi   False  December-2019           0
44   bupi upi   False  December-2019           0
48   bupi upi   False  December-2019           0
49   bupi upi   False  December-2019           0
51   bupi upi   False  December-2019           0
53   imps bok    True  December-2019           2
58   imps bok    True  December-2019           2
60   bupi upi   False  December-2019           0
67   neft pad    True   January-2020           1
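One caveat worth noting: transform('count') broadcasts the group count to every row of the group, so a False row would also receive a non-zero count if the same narration had a True row in the same month. The sample data has no such mixed group, which is why the output above matches. Below is a minimal sketch of forcing those rows to zero. The small frame and the names masked and counts are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data: 'bupi upi' has both True and False rows in Sept-2019.
df = pd.DataFrame({
    "narration": ["bupi upi", "bupi upi", "bupi upi", "neft pad"],
    "values": [True, True, False, True],
    "month_year": ["Sept-2019", "Sept-2019", "Sept-2019", "Sept-2019"],
})

# Keep the narration only where 'values' is True; other rows become NaN.
masked = df["narration"].where(df["values"])

# Count non-missing entries per (month, narration), broadcast to each row.
counts = (df.assign(new=masked)
            .groupby(["month_year", "narration"])["new"]
            .transform("count"))

# Zero out the rows whose own 'values' flag is False.
df["tran/month"] = counts.where(df["values"], 0).astype(int)
print(df)
```

Whether the extra where step is needed depends on whether a narration can have both True and False rows within one month in the real data.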
Answer 1 (score: 1)
You can break this into a few steps:
usable_counts = (
    df.loc[df["values"]]
    .groupby(["month_year"])["narration"]
    .value_counts()
    .rename("tran/month")
)
print(usable_counts)
month_year     narration
December-2019  imps bok     2
               merv visi    1
               neft pad     1
January-2020   neft pad     1
Sept-2019      bupi upi     2
               neft pad     1
Name: tran/month, dtype: int64
Now that we have the counts per month and narration, we can merge them back into the original DataFrame and clean up the final result:
final_df = (
    df.merge(
        usable_counts,
        left_on=["month_year", "narration"],
        right_index=True,
        how="left")
    .fillna(0)
    .astype({"tran/month": int})
)
print(final_df)
    narration  values     month_year  tran/month
4    bupi upi    True      Sept-2019           2
5    bupi upi   False       Oct-2019           0
9    bupi upi   False  December-2019           0
11  merv visi    True  December-2019           1
12   neft pad    True  December-2019           1
17   bupi upi   False  December-2019           0
22   bupi upi   False       Oct-2019           0
27   bupi upi   False  December-2019           0
31   bupi upi   False  December-2019           0
32   bupi upi    True      Sept-2019           2
36   neft pad    True      Sept-2019           1
40   bupi upi   False  December-2019           0
44   bupi upi   False  December-2019           0
48   bupi upi   False  December-2019           0
49   bupi upi   False  December-2019           0
51   bupi upi   False  December-2019           0
53   imps bok    True  December-2019           2
58   imps bok    True  December-2019           2
60   bupi upi   False  December-2019           0
67   neft pad    True   January-2020           1
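Note that .fillna(0) in the merge step is applied to the whole merged frame, so a NaN that already existed in any other column would be replaced with 0 as well. Below is a hedged sketch that fills only the new column. It assumes a frame with an extra column containing a missing value; the amount column and all the data are hypothetical, chosen only to show the difference.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "narration": ["bupi upi", "bupi upi", "neft pad", "bupi upi"],
    "values": [True, True, True, False],
    "month_year": ["Sept-2019", "Sept-2019", "Sept-2019", "Oct-2019"],
    "amount": [100.0, np.nan, 250.0, 75.0],  # hypothetical extra column
})

# Count the True rows per (month, narration), as in the first step.
usable_counts = (
    df.loc[df["values"]]
    .groupby(["month_year"])["narration"]
    .value_counts()
    .rename("tran/month")
)

# Left-merge the counts back; groups with no True rows get NaN here.
final_df = df.merge(
    usable_counts,
    left_on=["month_year", "narration"],
    right_index=True,
    how="left",
)

# Fill and cast only the new column, leaving 'amount' untouched.
final_df["tran/month"] = final_df["tran/month"].fillna(0).astype(int)
print(final_df)
```

With the question's three-column frame, both variants should give the same result, since only the new column can contain NaN after the merge.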