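I have a series of weekly dates, and I want to work out the missing dates between them. The series starts in 1992, so the dates cannot be filled in by hand. The dates are provided in Excel in the following format.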
1992-12-18
1992-12-25
1993-01-08
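When I run the code below with the first two dates hard-coded as start and end, I get the correct result.
I also tried converting the dates with pd.to_datetime(dates[0]).dt.date and with pd.to_datetime(dates[0]).dt.normalize().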
import datetime
import pandas as pd

data = pd.read_excel("-------------------", header=None)
for t in data[0]:
    start = datetime.datetime.strptime(str(t), "%Y-%m-%d %H:%M:%S")
    # t is a single Timestamp, so t + 1 is the expression that raises the error
    end = datetime.datetime.strptime(str(t + 1), "%Y-%m-%d %H:%M:%S")
    date = (start + datetime.timedelta(days=x) for x in range(0, (end - start).days))
    for data_ob in date:
        print(data_ob.strftime("%Y-%m-%d"))
ValueError: Cannot add integral value to Timestamp without freq.
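The error points at `t + 1`: `t` is one timestamp from the column, not a position in it, so adding 1 does not give "the next row". One way to get the next date is to pair each date with its successor. The following is a minimal sketch using only the standard library, under two assumptions: the dates are available as `YYYY-MM-DD` strings (the three sample values stand in for the Excel column), and "missing" means every calendar day strictly between two consecutive entries. The name `days_between` is made up for illustration.

```python
from datetime import date, timedelta


def days_between(dates):
    """Yield every calendar day strictly between consecutive input dates."""
    parsed = sorted(date.fromisoformat(d) for d in dates)
    # zip pairs each date with its successor, replacing the failing `t + 1`.
    for start, end in zip(parsed, parsed[1:]):
        for offset in range(1, (end - start).days):
            yield start + timedelta(days=offset)


# Sample values copied from the question; a real run would read them from Excel.
sample = ["1992-12-18", "1992-12-25", "1993-01-08"]
missing = [d.isoformat() for d in days_between(sample)]
print(len(missing))   # 19
print(missing[:3])    # ['1992-12-19', '1992-12-20', '1992-12-21']
```

Unlike the loop in the question, which starts its range at 0 and therefore prints the start date too, this sketch leaves out both endpoints.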
Answer 0 (score: 1)
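Here is a solution that uses the datetime and calendar modules. For every month that contains an input date, it collects the days that are missing, without returning duplicates and without returning dates that already appear in the input list: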
from datetime import datetime
from calendar import monthrange
from pprint import pprint


def get_missing_dates(dates: list) -> list:
    """Find the days missing from each month that contains an input date."""
    known = set(dates)
    out = set()
    for date in dates:
        _date = datetime.strptime(date, '%Y-%m-%d')
        year, month = _date.year, _date.month
        # monthrange returns (weekday of the 1st, number of days in the month),
        # so only the second value is a valid bound for the day loop
        days_in_month = monthrange(year, month)[1]
        for day in range(1, days_in_month + 1):
            to_add = datetime(year, month, day).strftime('%Y-%m-%d')
            if to_add not in known:
                out.add(to_add)
    return sorted(out)


dates = ['1992-12-18', '1992-12-25']
missing_dates = get_missing_dates(dates)
pprint(missing_dates)

Output:

['1992-12-01',
 '1992-12-02',
 '1992-12-03',
 '1992-12-04',
 '1992-12-05',
 '1992-12-06',
 '1992-12-07',
 '1992-12-08',
 '1992-12-09',
 '1992-12-10',
 '1992-12-11',
 '1992-12-12',
 '1992-12-13',
 '1992-12-14',
 '1992-12-15',
 '1992-12-16',
 '1992-12-17',
 '1992-12-19',
 '1992-12-20',
 '1992-12-21',
 '1992-12-22',
 '1992-12-23',
 '1992-12-24',
 '1992-12-26',
 '1992-12-27',
 '1992-12-28',
 '1992-12-29',
 '1992-12-30',
 '1992-12-31']
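get_missing_dates reports absent days month by month. If the goal is instead to spot skipped entries in the weekly series itself (an assumption about what the question is after, since the sample jumps from 1992-12-25 to 1993-01-08), a fixed seven-day walk from the first to the last date may be enough. This is only a sketch: `missing_weeks` and its `step` parameter are hypothetical names, and it assumes every entry falls on the same weekday as the first one.

```python
from datetime import date, timedelta


def missing_weeks(dates, step=timedelta(weeks=1)):
    """Return the expected weekly dates that are absent from `dates`."""
    parsed = sorted(date.fromisoformat(d) for d in dates)
    if not parsed:
        return []
    present = set(parsed)
    gaps = []
    current = parsed[0]
    # Walk from the first to the last date in fixed weekly steps.
    while current <= parsed[-1]:
        if current not in present:
            gaps.append(current.isoformat())
        current += step
    return gaps


print(missing_weeks(["1992-12-18", "1992-12-25", "1993-01-08"]))
# ['1993-01-01']
```

The set lookup keeps each membership check cheap, which helps for a series that runs from 1992 onward.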