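I have a data frame containing loans (loan_id) assigned to accounts (account_id), along with each loan's duration (loan_duration) and monthly loan payment (monthly_loan_payment).
Ultimately, I want to extract the total monthly payment for each customer in each month. To get there, I am trying to build a data frame that gives me the account_id, the loan_id, and the monthly payment for every month of each loan's duration. Suppose a loan was issued in 07/1993 with a monthly payment of $1,000 and a duration of 12 months: I want to return one row for each of those 12 months, each containing the account_id, loan_id, and monthly payment. The same applies to every loan in the df.
I tried df.groupby('account_id').apply(lambda x: x['date'] + pd.DateOffset(months = x['loan_duration'], axis=1)['monthly_payment']
, but without success. How can I apply a date offset to each row while also copying the contents of the other columns?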
Answer 0 (score: 1)
You can create a pd.date_range for each loan and then use df.explode to get one row per individual payment.
import pandas as pd

# sample data
# please always provide a callable line of code with your data
# you can get it with `df.head().to_dict('split')`
df = pd.DataFrame({
    'account_id': [1, 1, 2, 3, 3],
    'loan_id': [1, 2, 3, 4, 5],
    'date': ['1993-07-01', '1993-08-01', '1993-09-01', '1993-09-01', '1993-09-01'],
    'loan_duration_months': [12, 6, 5, 10, 10],
    'monthly_payment': [1000, 500, 1000, 1000, 1000]
})
df['date'] = pd.to_datetime(df['date'])

# one month-end date per payment of each loan
# ('M' is the month-end alias; newer pandas versions prefer 'ME')
df['payment_date'] = [
    pd.date_range(start, periods=duration, freq='M')
    for start, duration in zip(df['date'], df['loan_duration_months'])
]

# turn each list of payment dates into one row per payment,
# repeating the other columns
df = df.explode('payment_date', ignore_index=True)
Output
account_id loan_id date loan_duration_months monthly_payment payment_date
0 1 1 1993-07-01 12 1000 1993-07-31
1 1 1 1993-07-01 12 1000 1993-08-31
2 1 1 1993-07-01 12 1000 1993-09-30
3 1 1 1993-07-01 12 1000 1993-10-31
4 1 1 1993-07-01 12 1000 1993-11-30
5 1 1 1993-07-01 12 1000 1993-12-31
6 1 1 1993-07-01 12 1000 1994-01-31
7 1 1 1993-07-01 12 1000 1994-02-28
8 1 1 1993-07-01 12 1000 1994-03-31
9 1 1 1993-07-01 12 1000 1994-04-30
10 1 1 1993-07-01 12 1000 1994-05-31
11 1 1 1993-07-01 12 1000 1994-06-30
12 1 2 1993-08-01 6 500 1993-08-31
13 1 2 1993-08-01 6 500 1993-09-30
14 1 2 1993-08-01 6 500 1993-10-31
15 1 2 1993-08-01 6 500 1993-11-30
16 1 2 1993-08-01 6 500 1993-12-31
17 1 2 1993-08-01 6 500 1994-01-31
18 2 3 1993-09-01 5 1000 1993-09-30
19 2 3 1993-09-01 5 1000 1993-10-31
20 2 3 1993-09-01 5 1000 1993-11-30
21 2 3 1993-09-01 5 1000 1993-12-31
22 2 3 1993-09-01 5 1000 1994-01-31
23 3 4 1993-09-01 10 1000 1993-09-30
24 3 4 1993-09-01 10 1000 1993-10-31
25 3 4 1993-09-01 10 1000 1993-11-30
26 3 4 1993-09-01 10 1000 1993-12-31
27 3 4 1993-09-01 10 1000 1994-01-31
28 3 4 1993-09-01 10 1000 1994-02-28
29 3 4 1993-09-01 10 1000 1994-03-31
30 3 4 1993-09-01 10 1000 1994-04-30
31 3 4 1993-09-01 10 1000 1994-05-31
32 3 4 1993-09-01 10 1000 1994-06-30
33 3 5 1993-09-01 10 1000 1993-09-30
34 3 5 1993-09-01 10 1000 1993-10-31
35 3 5 1993-09-01 10 1000 1993-11-30
36 3 5 1993-09-01 10 1000 1993-12-31
37 3 5 1993-09-01 10 1000 1994-01-31
38 3 5 1993-09-01 10 1000 1994-02-28
39 3 5 1993-09-01 10 1000 1994-03-31
40 3 5 1993-09-01 10 1000 1994-04-30
41 3 5 1993-09-01 10 1000 1994-05-31
42 3 5 1993-09-01 10 1000 1994-06-30
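From the exploded frame, the total payment per account per month that the question ultimately asks for could be obtained with a groupby. The following is a minimal sketch, not part of the original answer: it uses a smaller, made-up sample, the column name `month` and the variables `loans`, `schedule`, and `monthly_totals` are hypothetical, and it assumes every loan starts on the first day of a month (so `freq='MS'` keeps each payment in the month it is due).

```python
import pandas as pd

# Small illustrative sample (assumed data, shorter than the one above).
loans = pd.DataFrame({
    'account_id': [1, 1, 2],
    'loan_id': [1, 2, 3],
    'date': pd.to_datetime(['1993-07-01', '1993-08-01', '1993-09-01']),
    'loan_duration_months': [3, 2, 2],
    'monthly_payment': [1000, 500, 1000],
})

# One month-start date per payment; assumes loans begin on the 1st of a month.
loans['payment_date'] = [
    list(pd.date_range(start, periods=duration, freq='MS'))
    for start, duration in zip(loans['date'], loans['loan_duration_months'])
]

# One row per (loan, payment date), with the other columns repeated.
schedule = loans.explode('payment_date', ignore_index=True)

# Label each payment with its calendar month as a 'YYYY-MM' string.
schedule['month'] = pd.to_datetime(schedule['payment_date']).dt.strftime('%Y-%m')

# Sum the payments of all loans that belong to the same account and month.
monthly_totals = (
    schedule
    .groupby(['account_id', 'month'], as_index=False)['monthly_payment']
    .sum()
)
print(monthly_totals)
```

In this sample, account 1 owes 1000 in 1993-07 and 1500 in both 1993-08 and 1993-09, because its two loans overlap in those months.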