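I have the following dataframe (I had some trouble getting the table to display properly, so please refer to the dictionary in the last part of this post):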
   account_id  contract_id  date_activated  term_months  2021-01-01 00:00:00  2021-02-01 00:00:00  2021-03-01 00:00:00  2021-04-01 00:00:00  2021-05-01 00:00:00  2021-06-01 00:00:00  2021-07-01 00:00:00  2021-08-01 00:00:00  2021-09-01 00:00:00  2021-10-01 00:00:00  2021-11-01 00:00:00  2021-12-01 00:00:00
0           1            A      2021-01-01            1                200.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0
1           1            B      2021-02-13           12                  0.0                300.0                300.0                300.0                300.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0
2           1            C      2021-04-06           12                  0.0                  0.0                  0.0                400.0                400.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0
3           1            I      2020-10-23            6                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                150.0                150.0                  0.0
4           1            N      2021-11-11            6                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                100.0                100.0
5           2            K      2021-01-01           12                100.0                100.0                100.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0
6           2            F      2021-03-23            6                  0.0                  0.0                 50.0                 50.0                 50.0                 50.0                 50.0                 50.0                  0.0                  0.0                  0.0                  0.0
I want a result like the one shown below, with the new columns contract_type and renewal_type:
   account_id  contract_id  date_activated  term_months  contract_type  renewal_type  2021-01-01 00:00:00  2021-02-01 00:00:00  2021-03-01 00:00:00  2021-04-01 00:00:00  2021-05-01 00:00:00  2021-06-01 00:00:00  2021-07-01 00:00:00  2021-08-01 00:00:00  2021-09-01 00:00:00  2021-10-01 00:00:00  2021-11-01 00:00:00  2021-12-01 00:00:00
0           1            A      2021-01-01            1       Original       Regular                200.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0
1           1            B      2021-02-13           12        Upgrade       Regular                  0.0                300.0                300.0                300.0                300.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0
2           1            C      2021-04-06           12        Upgrade         Early                  0.0                  0.0                  0.0                400.0                400.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0
3           1            I      2020-10-23            6        Winback       Regular                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                150.0                150.0                  0.0
4           1            N      2021-11-11            6        Renewal         Early                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                100.0                100.0
5           2            K      2021-01-01           12       Original       Regular                100.0                100.0                100.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0                  0.0
6           2            F      2021-03-23            6        Renewal         Early                  0.0                  0.0                 50.0                 50.0                 50.0                 50.0                 50.0                 50.0                  0.0                  0.0                  0.0                  0.0
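You can download an Excel file with a sample copy of the result at this link: https://drive.google.com/file/d/16BLoSugMaDdB8Qac2ATJRBLRvx3HCIus/view?usp=sharing
Each account has multiple contracts. I want to add the two columns based on the monthly transactions (the columns from the fifth one onward).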
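To make "based on the monthly transactions" concrete, the monthly columns could first be reduced to a per-contract summary. This is only a minimal sketch on a two-row stand-in for the data. The names summary, first_paid, last_paid and amount are hypothetical, and it assumes every contract has at least one non-zero month (for an all-zero row, idxmax would simply return the first column).

```python
import pandas as pd

# Two-contract stand-in for the real frame: four info columns, then monthly payments.
df = pd.DataFrame({
    "account_id": [1, 1],
    "contract_id": ["A", "B"],
    "date_activated": pd.to_datetime(["2021-01-01", "2021-02-13"]),
    "term_months": [1, 12],
    pd.Timestamp("2021-01-01"): [200.0, 0.0],
    pd.Timestamp("2021-02-01"): [0.0, 300.0],
    pd.Timestamp("2021-03-01"): [0.0, 300.0],
})

# Monthly payments start at the fifth column.
payments = df.iloc[:, 4:].copy()
payments.columns = pd.to_datetime(payments.columns)
paid = payments > 0  # True where a payment was made in that month

# Hypothetical summary columns, one row per contract.
summary = df.iloc[:, :4].copy()
summary["first_paid"] = paid.idxmax(axis=1)               # first month with a payment
summary["last_paid"] = paid.iloc[:, ::-1].idxmax(axis=1)  # last month with a payment
summary["amount"] = payments.max(axis=1)                  # largest monthly payment

print(summary)
```

With the first and last paid month available per contract, both new columns can be derived by comparing each contract with the one before it in the same account.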
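renewal_type should be either "Regular" or "Early". It is "Regular" when the contract is an "Original" or a "Winback". It is also "Regular" if the previous contract ended in the prior month and payments on the new contract start in the following month. It is "Early" when a new contract is made for the same account while the previous contract has not yet expired or completed its term (based on term_months).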
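The renewal rule above could be sketched as a small pure-Python function. This is a sketch under stated assumptions, not a finished solution: the helper names are hypothetical, and the previous contract's term is counted from its first paid month rather than from date_activated, because contract I has date_activated 2020-10-23 while its payments only begin in October 2021.

```python
def month_index(year, month):
    """Map a calendar month to a running integer so months can be compared."""
    return year * 12 + (month - 1)


def renewal_type(contract_type, prev_first_paid, prev_term_months, new_first_paid):
    """Return 'Regular' or 'Early' for a contract (hypothetical helper).

    prev_first_paid and new_first_paid are (year, month) tuples giving the
    first paid month of the previous and the new contract.
    """
    # Original and Winback contracts are always Regular.
    if contract_type in ("Original", "Winback"):
        return "Regular"
    # First month no longer covered by the previous contract's term
    # (assumption: the term starts in the first paid month).
    prev_term_end = month_index(*prev_first_paid) + prev_term_months
    if month_index(*new_first_paid) < prev_term_end:
        return "Early"
    return "Regular"


print(renewal_type("Upgrade", (2021, 1), 1, (2021, 2)))    # B after A -> Regular
print(renewal_type("Upgrade", (2021, 2), 12, (2021, 4)))   # C after B -> Early
print(renewal_type("Renewal", (2021, 1), 12, (2021, 3)))   # F after K -> Early
```

On the sample data this reproduces the renewal_type values I expect, but I am not sure the "term counted from the first paid month" assumption holds in general.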
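contract_type should be "Original", "Renewal", "Upgrade", or "Winback". It is "Original" if it is the first contract for the account. It is "Winback" if the new contract is made after 4 months with no payments/transactions on the previous contract. If it is not a "Winback" and the new contract has a higher payment than the previous contract, it is an "Upgrade". Anything that is not "Original", "Upgrade", or "Winback" is considered a "Renewal".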
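The order in which these contract_type rules are checked matters, so here is a minimal sketch of the decision chain. The function and parameter names are hypothetical, and it assumes that "4 months" means four or more empty months between the last earlier payment on the account and the first payment of the new contract.

```python
def months_without_payment(prev_last_paid, new_first_paid):
    """Count the empty months strictly between two (year, month) tuples."""
    prev_idx = prev_last_paid[0] * 12 + prev_last_paid[1]
    new_idx = new_first_paid[0] * 12 + new_first_paid[1]
    # Overlapping contracts give a negative difference; treat that as no gap.
    return max(new_idx - prev_idx - 1, 0)


def contract_type(is_first, gap_months, new_amount, prev_amount):
    """Return the contract type, checking the rules in priority order."""
    if is_first:
        return "Original"
    if gap_months >= 4:  # assumption: four or more months without payment
        return "Winback"
    if new_amount > prev_amount:
        return "Upgrade"
    return "Renewal"


# Contract I: earlier payments stop in May 2021, its own payments start in October 2021.
gap = months_without_payment((2021, 5), (2021, 10))
print(gap)                                      # 4 (June to September)
print(contract_type(False, gap, 150.0, 400.0))  # Winback
print(contract_type(False, 0, 300.0, 200.0))    # Upgrade (B after A)
print(contract_type(False, 0, 100.0, 150.0))    # Renewal (N after I)
```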
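I tried to do this with the code below, but it has some problems: it classifies some "Original" contracts as "Winback" (for contract_type) and some "Early" contracts as "Regular" (for renewal_type):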
import pandas as pd


def get_types(monthly_payments):
    def f(s):
        check = monthly_payments.loc[
            (s.date_activated.year == monthly_payments.index.year) &
            (s.date_activated.month == monthly_payments.index.month)
        ].iloc[0]
        if check.wb == 0:
            # If rolling sum of 4 months prior is 0
            s['contract_type'] = 'Winback'
        elif check.og_upg == 0:
            # If Prior Month is 0
            s['contract_type'] = 'Original'
        elif check.max_pmt > check.og_upg:
            # If Prior Month is not missing and current month is more
            s['contract_type'] = 'Upgrade'
        else:
            s['contract_type'] = 'Renewal'
        if check.early:
            # If Early
            s['renewal_type'] = 'Early'
        else:
            s['renewal_type'] = 'Regular'
        return s
    return f


def apply_types(g):
    # Get Non Payment Info
    account_info = g[g.columns[:4]]
    # Transpose Monthly Payments To Rows
    monthly_payments = g.loc[:, g.columns[4:]].T
    # Make Sure Index is DT
    monthly_payments.index = pd.to_datetime(monthly_payments.index)
    # Get Check for is early based on number of payments
    monthly_payments['early'] = monthly_payments.astype(bool).sum(axis=1) > 1
    # Max Payment In Month
    monthly_payments['max_pmt'] = monthly_payments.max(axis=1)
    # 1 Month Prior
    monthly_payments['og_upg'] = monthly_payments.max_pmt.shift().fillna(0)
    # Rolling Sum of 4 Months Prior
    monthly_payments['wb'] = monthly_payments.max_pmt \
        .rolling(min_periods=0, window=4).sum().shift()
    # Concat New Columns With Original Payment Information
    return pd.concat((
        account_info.apply(get_types(monthly_payments), axis=1),
        g[g.columns[4:]]
    ), axis=1)


df = df.groupby('account_id', as_index=False).apply(apply_types).reset_index(drop=True)
Here is the dictionary of the dataframe:
{'account_id': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2},
'contract_id': {0: 'A', 1: 'B', 2: 'C', 3: 'I', 4: 'N', 5: 'K', 6: 'F'},
'date_activated': {0: Timestamp('2021-01-01 00:00:00'),
1: Timestamp('2021-02-13 00:00:00'),
2: Timestamp('2021-04-06 00:00:00'),
3: Timestamp('2020-10-23 00:00:00'),
4: Timestamp('2021-11-11 00:00:00'),
5: Timestamp('2021-01-01 00:00:00'),
6: Timestamp('2021-03-23 00:00:00')},
'term_months': {0: 1, 1: 12, 2: 12, 3: 6, 4: 6, 5: 12, 6: 6},
datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 100.0,
6: 0.0},
datetime.datetime(2021, 2, 1, 0, 0): {0: 0.0,
1: 300.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 100.0,
6: 0.0},
datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0,
1: 300.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 100.0,
6: 50.0},
datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0,
1: 300.0,
2: 400.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0,
1: 300.0,
2: 400.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 7, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 8, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 9, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0},
datetime.datetime(2021, 10, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 150.0,
4: 0.0,
5: 0.0,
6: 0.0},
datetime.datetime(2021, 11, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 150.0,
4: 100.0,
5: 0.0,
6: 0.0},
datetime.datetime(2021, 12, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 100.0,
5: 0.0,
6: 0.0}}
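The dictionary above is the printed form of df.to_dict(), so it refers to the names Timestamp and datetime. A minimal sketch of loading such a dictionary back into a dataframe, shown here on a truncated two-row copy (the variable name data is hypothetical):

```python
import datetime

import pandas as pd
from pandas import Timestamp  # needed because the dict uses the bare name Timestamp

# Truncated copy of the dictionary: two contracts, two monthly columns.
data = {
    'account_id': {0: 1, 1: 1},
    'contract_id': {0: 'A', 1: 'B'},
    'date_activated': {0: Timestamp('2021-01-01 00:00:00'),
                       1: Timestamp('2021-02-13 00:00:00')},
    'term_months': {0: 1, 1: 12},
    datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0, 1: 0.0},
    datetime.datetime(2021, 2, 1, 0, 0): {0: 0.0, 1: 300.0},
}

# Outer keys become columns, inner keys become the row index.
df = pd.DataFrame(data)
print(df)
```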
Here is the dictionary of the result:
{'account_id': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2},
'contract_id': {0: 'A', 1: 'B', 2: 'C', 3: 'I', 4: 'N', 5: 'K', 6: 'F'},
'date_activated': {0: Timestamp('2021-01-01 00:00:00'),
1: Timestamp('2021-02-13 00:00:00'),
2: Timestamp('2021-04-06 00:00:00'),
3: Timestamp('2020-10-23 00:00:00'),
4: Timestamp('2021-11-11 00:00:00'),
5: Timestamp('2021-01-01 00:00:00'),
6: Timestamp('2021-03-23 00:00:00')},
'term_months': {0: 1, 1: 12, 2: 12, 3: 6, 4: 6, 5: 12, 6: 6},
'contract_type': {0: 'Original',
1: 'Upgrade',
2: 'Upgrade',
3: 'Winback',
4: 'Renewal',
5: 'Original',
6: 'Renewal'},
'renewal_type': {0: 'Regular',
1: 'Regular',
2: 'Early',
3: 'Regular',
4: 'Early',
5: 'Regular',
6: 'Early'},
datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 100.0,
6: 0.0},
datetime.datetime(2021, 2, 1, 0, 0): {0: 0.0,
1: 300.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 100.0,
6: 0.0},
datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0,
1: 300.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 100.0,
6: 50.0},
datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0,
1: 300.0,
2: 400.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0,
1: 300.0,
2: 400.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 7, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 8, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 50.0},
datetime.datetime(2021, 9, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0},
datetime.datetime(2021, 10, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 150.0,
4: 0.0,
5: 0.0,
6: 0.0},
datetime.datetime(2021, 11, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 150.0,
4: 100.0,
5: 0.0,
6: 0.0},
datetime.datetime(2021, 12, 1, 0, 0): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 100.0,
5: 0.0,
6: 0.0}}