I have a function that computes the number of days between two dates using a 360-day year (if only it were just a plain 365-day difference, lol).
def day_count_30_360(start_date, end_date):
    """Returns number of days between start_date and end_date, using Thirty/360 convention"""
    d1 = min(30, start_date.day)
    d2 = min(d1, end_date.day) if d1 == 30 else end_date.day
    return (360 * (end_date.year - start_date.year)
            + 30 * (end_date.month - start_date.month)
            + d2 - d1)
I am currently using a nested for loop to apply it to every cell, but this is terribly slow.
for col in range(len(df_start_dt.columns)):
    for row in range(len(df_start_dt.index)):
        df_out.iloc[row, col] = day_count_30_360(df_start_dt.iloc[row, col], df_end_dt.iloc[row, col])
Is there any way to run both dataframes through the same function without looping? Thanks!
Example dataframes, created as dummy data for testing:
import pandas as pd  # pd.datetime was removed in pandas 2.0, so pd.Timestamp is used here
df_start_dt = pd.DataFrame([[pd.Timestamp(2004, 1, 1), pd.Timestamp(2004, 1, 1), pd.Timestamp(2004, 1, 1)], [pd.Timestamp(2004, 2, 2), pd.Timestamp(2004, 2, 2), pd.Timestamp(2004, 2, 2)]])
df_end_dt = pd.DataFrame([[pd.Timestamp(2005, 1, 1), pd.Timestamp(2005, 1, 1), pd.Timestamp(2005, 1, 1)], [pd.Timestamp(2005, 2, 2), pd.Timestamp(2005, 2, 2), pd.Timestamp(2006, 2, 2)]])
Both dataframes have the same index, headers, and dimensions.
Answer 0 (score: 1)
df = pd.concat([df_start_dt, df_end_dt], keys=['a','b'])
df = df.groupby(level=1).agg(lambda x: day_count_30_360(x.iat[0], x.iat[-1]))
print(df)
     0    1    2
0  360  360  360
1  360  360  720
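To see why this works, it can help to look at what `pd.concat` with `keys` produces before the `groupby`. The following is only an illustrative sketch on the dummy data from the question; the names `stacked` and `pairs` are made up for this example.

```python
import pandas as pd

# Dummy frames from the question, built with pd.Timestamp.
df_start_dt = pd.DataFrame([[pd.Timestamp(2004, 1, 1)] * 3,
                            [pd.Timestamp(2004, 2, 2)] * 3])
df_end_dt = pd.DataFrame([[pd.Timestamp(2005, 1, 1)] * 3,
                          [pd.Timestamp(2005, 2, 2), pd.Timestamp(2005, 2, 2),
                           pd.Timestamp(2006, 2, 2)]])

# Stacking with keys gives a two-level index: ('a', row) for the start
# dates and ('b', row) for the end dates.
stacked = pd.concat([df_start_dt, df_end_dt], keys=['a', 'b'])

# Grouping on level 1 (the original row label) puts each start row together
# with its end row, so every group has exactly two rows per column.
pairs = {label: group for label, group in stacked.groupby(level=1)}
print(pairs[1])
```

Within each group, `x.iat[0]` is therefore the start date and `x.iat[-1]` the end date of one cell. The function is still called once per cell, so this removes the explicit loop rather than the per-cell Python call.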
Another solution, changing the function itself:
def day_count_30_360(x):
    """Returns number of days between the first and last value of x, using Thirty/360 convention"""
    start_date = x.iat[0]   # first row of the group: start date
    end_date = x.iat[-1]    # last row of the group: end date
    d1 = min(30, start_date.day)
    d2 = min(d1, end_date.day) if d1 == 30 else end_date.day
    return (360 * (end_date.year - start_date.year)
            + 30 * (end_date.month - start_date.month)
            + d2 - d1)
df = pd.concat([df_start_dt, df_end_dt], keys=['a','b'])
df = df.groupby(level=1).agg(day_count_30_360)
print(df)
     0    1    2
0  360  360  360
1  360  360  720
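If the per-cell function call is still the bottleneck, the same arithmetic can probably be done on whole frames at once. This is a minimal sketch, not part of the original answer, and it assumes every column in both frames has a datetime64 dtype; `day_count_30_360_frame` and `_part` are hypothetical names.

```python
import pandas as pd


def _part(df, name):
    """Extract one date component ('year', 'month' or 'day') from every column."""
    return df.apply(lambda col: getattr(col.dt, name))


def day_count_30_360_frame(start_df, end_df):
    """Element-wise Thirty/360 day count for two equally shaped datetime frames."""
    # d1 = min(30, start day)
    d1 = _part(start_df, "day").clip(upper=30)
    end_day = _part(end_df, "day")
    # d2 = min(30, end day) where d1 == 30, otherwise the end day unchanged
    d2 = end_day.where(d1 < 30, end_day.clip(upper=30))
    return (360 * (_part(end_df, "year") - _part(start_df, "year"))
            + 30 * (_part(end_df, "month") - _part(start_df, "month"))
            + d2 - d1)


df_start_dt = pd.DataFrame([[pd.Timestamp(2004, 1, 1)] * 3,
                            [pd.Timestamp(2004, 2, 2)] * 3])
df_end_dt = pd.DataFrame([[pd.Timestamp(2005, 1, 1)] * 3,
                          [pd.Timestamp(2005, 2, 2), pd.Timestamp(2005, 2, 2),
                           pd.Timestamp(2006, 2, 2)]])

result = day_count_30_360_frame(df_start_dt, df_end_dt)
print(result)
```

On the dummy data this gives the same values as the groupby version. The work is done by the `.dt` accessors and integer arithmetic on whole columns, so the number of Python-level calls grows with the number of columns instead of the number of cells.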