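I have several customers, and I bill each of them on the 25th of every month. I want to find the last billing date before each customer's contract terminates. Here is a sample DataFrame: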
> data = [['Arthur','2019-03-01'],['Bart','2019-02-26'],['Cindy','2019-02-18'],['Douglas','2019-03-31']]
> df = pd.DataFrame(data, columns = ['Name','Termination Date'])
> df
And here is the expected output:
> df['Last Billing Date'] =['2019-02-25','2019-02-25','2019-01-25','2019-03-25']
> df
Answer 0 (score: 3)
Here is one way to do it:
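import numpy as np
# convert the strings to datetimes so that .replace(day=...) and the comparison work
df['Termination Date'] = pd.to_datetime(df['Termination Date'])
s = df['Termination Date'].apply(lambda x: x.replace(day=25))
df['New'] = np.where(df['Termination Date'] >= s, s, s - pd.DateOffset(months=1))
df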
      Name Termination Date         New
0   Arthur       2019-03-01  2019-02-25
1     Bart       2019-02-26  2019-02-25
2    Cindy       2019-02-18  2019-01-25
3  Douglas       2019-03-31  2019-03-25
Answer 1 (score: 1)
A simple solution is to step back one month whenever the day falls before the 25th:
import datetime

def last_billing(termination_dt):
    if isinstance(termination_dt, str):  # parse strings into datetime objects
        termination_dt = datetime.datetime.strptime(termination_dt, '%Y-%m-%d')
    if termination_dt.day < 25:
        # step back one month; January rolls back to December of the previous year
        if termination_dt.month == 1:
            return termination_dt.replace(year=termination_dt.year - 1, month=12, day=25)
        return termination_dt.replace(month=termination_dt.month - 1, day=25)
    return termination_dt.replace(day=25)

df['Last Billing Date'] = df['Termination Date'].apply(last_billing)
      Name Termination Date Last Billing Date
0   Arthur       2019-03-01        2019-02-25
1     Bart       2019-02-26        2019-02-25
2    Cindy       2019-02-18        2019-01-25
3  Douglas       2019-03-31        2019-03-25
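One edge case worth checking with this kind of month arithmetic is a termination date in early January, where the previous billing date falls in December of the prior year. The following is a minimal standalone sketch using only the standard library; the name `previous_billing_date` and the `billing_day` parameter are hypothetical and not part of the answer, and the sketch assumes a billing day of 28 or less so that it exists in every month.

```python
import datetime

def previous_billing_date(termination, billing_day=25):
    """Return the latest billing date on or before `termination`.

    Assumes billing_day <= 28 so the day is valid in every month.
    """
    if isinstance(termination, str):
        termination = datetime.datetime.strptime(termination, '%Y-%m-%d')
    year, month = termination.year, termination.month
    if termination.day < billing_day:
        # The billing day has not been reached yet this month: go back one month.
        month -= 1
        if month == 0:
            # January rolls back to December of the previous year.
            year, month = year - 1, 12
    return termination.replace(year=year, month=month, day=billing_day)

print(previous_billing_date('2019-02-18'))  # 2019-01-25 00:00:00
print(previous_billing_date('2019-01-10'))  # 2018-12-25 00:00:00
```

It can be passed to `Series.apply` in the same way as the function above.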
If performance is a concern, you can wrap the function with np.vectorize:
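import numpy as np

@np.vectorize
def last_billing(termination_dt):
    if isinstance(termination_dt, str):  # parse strings into datetime objects
        termination_dt = datetime.datetime.strptime(termination_dt, '%Y-%m-%d')
    if termination_dt.day < 25:
        # step back one month; January rolls back to December of the previous year
        if termination_dt.month == 1:
            return termination_dt.replace(year=termination_dt.year - 1, month=12, day=25)
        return termination_dt.replace(month=termination_dt.month - 1, day=25)
    return termination_dt.replace(day=25)

df['Last Billing Date'] = last_billing(df['Termination Date'])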
Timing comparison (np.vectorize is essentially a loop under the hood, so the gain is small):
%timeit df['Last Billing Date'] = df['Termination Date'].apply(last_billing)
## 113 ms ± 365 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit df['Last Billing Date'] = last_billing(df['Termination Date'])
## 108 ms ± 397 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Answer 2 (score: 1)
If you want to do this in a vectorized way:
df['Termination Date'] = pd.to_datetime(df['Termination Date'])
# work on a copy so the original termination dates are not overwritten
billing = df['Termination Date'].copy()
before_25 = billing.dt.day < 25
billing[before_25] = billing[before_25] + pd.DateOffset(months=-1)
df['Last Billing Date'] = billing.apply(lambda dt: dt.replace(day=25))
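For reference, the same idea can probably be written without any per-row `apply` by doing the month arithmetic on integers and letting `pd.to_datetime` assemble the dates from year/month/day columns. This is only a sketch under stated assumptions: the function name `last_billing_date` and the `billing_day` parameter are hypothetical, and the billing day is assumed to be 28 or less so that it is valid in every month.

```python
import pandas as pd

def last_billing_date(dates, billing_day=25):
    """Vectorized sketch: latest billing date on or before each date.

    Assumes billing_day <= 28 so the day exists in every month.
    """
    dates = pd.to_datetime(pd.Series(dates))
    # Count months since year 0, then go back one month where the
    # billing day has not yet been reached in the termination month.
    total_months = dates.dt.year * 12 + (dates.dt.month - 1)
    total_months = total_months - (dates.dt.day < billing_day).astype(int)
    parts = pd.DataFrame(
        {
            'year': total_months // 12,
            'month': total_months % 12 + 1,
            'day': billing_day,
        },
        index=dates.index,
    )
    # pd.to_datetime can assemble datetimes from year/month/day columns.
    return pd.to_datetime(parts)

sample = ['2019-03-01', '2019-02-26', '2019-02-18', '2019-03-31', '2019-01-10']
result = last_billing_date(sample)
print(result)
```

Because the year is derived from the month count, a termination date in early January maps to December 25 of the previous year without a special case.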