Aging bucket analysis

Time: 2019-12-24 08:24:54

Tags: python pandas

I am trying to create date aging buckets (0, 30, 180, 360, 720 and above), grouped by G/L, with a cutoff date of 2019-11-03.

Here is my dataset:

G/L          DocDate    Amount_in_local_cur
22161002    11/30/2019  106788990.47
22161002    10/31/2019  75813682.86
22161003    11/30/2019  64342488.02
22161002    9/30/2019   45439306.00
22161003    10/31/2019  42692553.02
22161002    9/30/2019   39513086.49
22161003    10/31/2019  27789087.03
22161003    11/30/2019  25070257.05
22161003    9/30/2019   24139365.38
22161002    8/31/2019   23271726.99
22161002    11/30/2019  22915726.16
22161002    8/31/2019   21424057.20
22161003    9/30/2019   16399392.20
22161002    11/30/2019  12237506.03

I would like a result like the table below:

G/L           <0   0-29     30-89    90-179   180-364   365-720   >720
22161003      XX     XX      XX       xx       xx         xx      xx   
22161002      xx     xx      xx       xx       xx         xx      xx

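1 answer:

Answer 0: (score: 0)

I'm not sure how you want to use the cutoff date or what your aggregation function is, but it looks to me like you need pd.cut and pd.pivot_table.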

import pandas as pd
import numpy as np

df["DocDate"] = pd.to_datetime(df["DocDate"])

cutoff_date = '2019-11-03'
# signed days from cutoff_date to DocDate (negative means DocDate is before the cutoff)
df["days"] = (df["DocDate"] - pd.Timestamp(cutoff_date)).dt.days

bins = [-np.inf, 0, 30, 180, 360, 720, np.inf]

df["bins"] = pd.cut(df['days'], bins)

out = pd.pivot_table(df,
                     index=["G/L"],
                     columns=["bins"],
                     values=["Amount_in_local_cur"],
                     aggfunc="sum")

# this is just to get rid of multiindex in columns
out.columns = [o[1] for o in out.columns]
print(out)

           (-inf, 0.0]   (0.0, 30.0]
G/L                                 
22161002  2.054619e+08  1.419422e+08
22161003  1.110204e+08  8.941275e+07
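
If the goal is the labeled layout from the question, one possible variation is sketched below. It rests on assumptions that are not confirmed in the question: age is taken as cutoff date minus DocDate (so documents dated after the cutoff land in "<0"), the buckets are closed on the left (0-29 means 0 <= age < 30), and the sum is the wanted aggregation. The names `age_days`, `bucket`, `edges` and `labels` are made up for illustration.

```python
import io

import numpy as np
import pandas as pd

# Sample data from the question, inlined so the sketch runs on its own.
raw = """G/L,DocDate,Amount_in_local_cur
22161002,11/30/2019,106788990.47
22161002,10/31/2019,75813682.86
22161003,11/30/2019,64342488.02
22161002,9/30/2019,45439306.00
22161003,10/31/2019,42692553.02
22161002,9/30/2019,39513086.49
22161003,10/31/2019,27789087.03
22161003,11/30/2019,25070257.05
22161003,9/30/2019,24139365.38
22161002,8/31/2019,23271726.99
22161002,11/30/2019,22915726.16
22161002,8/31/2019,21424057.20
22161003,9/30/2019,16399392.20
22161002,11/30/2019,12237506.03
"""
df = pd.read_csv(io.StringIO(raw))
df["DocDate"] = pd.to_datetime(df["DocDate"], format="%m/%d/%Y")

cutoff = pd.Timestamp("2019-11-03")
# Assumption: age = cutoff - DocDate, so future-dated documents are negative.
df["age_days"] = (cutoff - df["DocDate"]).dt.days

# Bucket edges and labels taken from the desired table in the question.
edges = [-np.inf, 0, 30, 90, 180, 365, 721, np.inf]
labels = ["<0", "0-29", "30-89", "90-179", "180-364", "365-720", ">720"]
# right=False gives left-closed intervals: [0, 30), [30, 90), ...
df["bucket"] = pd.cut(df["age_days"], bins=edges, labels=labels, right=False)

out = df.pivot_table(index="G/L",
                     columns="bucket",
                     values="Amount_in_local_cur",
                     aggfunc="sum",
                     fill_value=0,
                     observed=True)

# Use plain string column names, then add the buckets with no rows as zeros.
out.columns = out.columns.astype(str)
out = out.reindex(columns=labels, fill_value=0)
print(out)
```

With the sample data every document is between 27 days after and 64 days before the cutoff, so only the "<0", "0-29" and "30-89" columns hold non-zero amounts; the remaining columns are there just to keep the layout fixed. If your buckets should be closed on the right instead, drop right=False and adjust the edges.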