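I want a date range in which every date falls on the same day of the month as the start date. For example, if the start date is 2018-05-16, I want to get ['2018-09-15', '2018-10-15', ...]
I have the following code in Python 3: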
import pandas as pd
from datetime import datetime
(pd.date_range(start=date, periods=12, freq='M') \
+ pd.DateOffset(days=datetime.strptime(date, '%Y-%m-%d').day)).strftime('%d-%m-%Y')
It works fine when the day of the month is less than 29, but not for later days. For example, with date = '2018-08-31'
Output:
array(['01-10-2018', '31-10-2018', '01-12-2018',
'31-12-2018', '31-01-2019', '03-03-2019',
'31-03-2019', '01-05-2019', '31-05-2019',
'01-07-2019', '31-07-2019', '31-08-2019'], dtype='|S10')
However, I want the output to be:
array(['30-09-2018', '31-10-2018', '30-11-2018',
'31-12-2018', '31-01-2019', '28-02-2019',
'31-03-2019', '30-04-2019', '31-05-2019',
'30-06-2019', '31-07-2019', '31-08-2019'], dtype='|S10')
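Answer 0 (score: 0)
For a monthly date range that falls on the day of the month given by the start date (or on the last valid day of the month when that day does not exist, allowing for different month lengths and leap years), this function should work, at least for a monthly frequency: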
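import pandas as pd
def month_range_day(start=None, periods=None):
    start_date = pd.Timestamp(start).date()
    # Month-end dates, one per month, beginning with the start date's month.
    # 'M' is the month-end alias; pandas 2.2 and later name it 'ME'.
    month_range = pd.date_range(start=start_date, periods=periods, freq='M')
    # Copy so the in-place assignment below does not touch the index's own data.
    month_day = month_range.day.values.copy()
    # Cap each day at the start day; shorter months keep their last day.
    month_day[start_date.day < month_day] = start_date.day
    return pd.to_datetime(month_range.year*10000+month_range.month*100+month_day, format='%Y%m%d')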
Example 1:
start_date = '2020-01-31'
month_range_day(start=start_date, periods=12)
Output:
DatetimeIndex(['2020-01-31', '2020-02-29', '2020-03-31', '2020-04-30',
'2020-05-31', '2020-06-30', '2020-07-31', '2020-08-31',
'2020-09-30', '2020-10-31', '2020-11-30', '2020-12-31'],
dtype='datetime64[ns]', freq=None)
Example 2:
start_date = '2019-01-29'
month_range_day(start=start_date, periods=12)
Output:
DatetimeIndex(['2019-01-29', '2019-02-28', '2019-03-29', '2019-04-29',
'2019-05-29', '2019-06-29', '2019-07-29', '2019-08-29',
'2019-09-29', '2019-10-29', '2019-11-29', '2019-12-29'],
dtype='datetime64[ns]', freq=None)
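As a point of comparison, the same "start day, or the month's last day if it does not exist" rule can be sketched by adding `pd.DateOffset(months=i)` to the start date, since a month-based `DateOffset` clamps the day to the end of a shorter month. The helper name `same_day_monthly` is made up for this illustration and is not part of pandas. Each offset is applied to the original start date rather than to the previous result, so a short month does not pull later dates down.

```python
import pandas as pd

def same_day_monthly(start, periods):
    """Hypothetical helper: one date per month on the start date's day of month.

    Days that do not exist in a month (for example 31 April) fall back to
    that month's last day, because DateOffset(months=...) clamps the day.
    """
    start_ts = pd.Timestamp(start)
    # Offset from the original start each time so clamping never accumulates.
    dates = [start_ts if i == 0 else start_ts + pd.DateOffset(months=i)
             for i in range(periods)]
    return pd.DatetimeIndex(dates)

print(same_day_monthly('2020-01-31', 4).strftime('%Y-%m-%d').tolist())
# ['2020-01-31', '2020-02-29', '2020-03-31', '2020-04-30']
```

Unlike `month_range_day`, this sketch builds the dates in a Python loop, so it is likely slower for very long ranges, but it does not depend on the spelling of the month-end frequency alias.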
If you only need a month-end frequency, there is no need for pd.DateOffset:
import pandas as pd
start_date = '2018-09-01'
pd.date_range(start=start_date, periods=12, freq='M').strftime('%d-%m-%Y')
Output:
Index(['30-09-2018', '31-10-2018', '30-11-2018', '31-12-2018', '31-01-2019',
'28-02-2019', '31-03-2019', '30-04-2019', '31-05-2019', '30-06-2019',
'31-07-2019', '31-08-2019'],
dtype='object')
For more details, see the offset aliases in pandas. From here it should be straightforward to change the date format and type if necessary.
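Each alias string is shorthand for an offset object, so the month-end range can also be requested by passing the object itself as `freq`. The sketch below assumes only `pd.offsets.MonthEnd` and `pd.offsets.MonthBegin`; the variable names are illustrative. Passing the object avoids depending on the alias spelling, which pandas 2.2 changed from 'M' to 'ME'.

```python
import pandas as pd

# MonthEnd() is the offset object behind the month-end alias.
month_ends = pd.date_range(start='2018-09-01', periods=3,
                           freq=pd.offsets.MonthEnd())
print(month_ends.strftime('%d-%m-%Y').tolist())
# ['30-09-2018', '31-10-2018', '30-11-2018']

# MonthBegin() gives the first day of each month instead.
month_starts = pd.date_range(start='2018-09-01', periods=3,
                             freq=pd.offsets.MonthBegin())
print(month_starts.strftime('%d-%m-%Y').tolist())
# ['01-09-2018', '01-10-2018', '01-11-2018']
```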
Answer 1 (score: 0)
Why not just drop the 0th element?
import pandas as pd
date = '2018-08-31'
# Ask for one extra period, then slice off the start month itself.
(pd.date_range(
    start=date,
    periods=12+1,
    freq='M')
).strftime('%d-%m-%Y')[1:]
Output:
Index(['30-09-2018', '31-10-2018', '30-11-2018', '31-12-2018', '31-01-2019',
'28-02-2019', '31-03-2019', '30-04-2019', '31-05-2019', '30-06-2019',
'31-07-2019', '31-08-2019'],
dtype='object')