Trying to find the missing dates within a date range

Time: 2019-08-27 14:30:29

Tags: python-3.x

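I have a range of weekly dates, and I want to work out which dates are missing within that period. The range starts in 1992, so this can't be done by hand. The dates are provided in Excel in the following format.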

1992-12-18

1992-12-25

1993-01-08

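When I run the following code with the first two dates as the start and end, I get the correct result.

I also tried converting the dates with pd.to_datetime(dates[0]).dt.date and with pd.to_datetime(dates[0]).dt.normalize().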

import datetime
import pandas as pd
data = pd.read_excel("-------------------", header=None)
for t in data[0]:
    start = datetime.datetime.strptime(str(t), "%Y-%m-%d %H:%M:%S")
    end = datetime.datetime.strptime(str(t+1), "%Y-%m-%d %H:%M:%S")
    date = (start + datetime.timedelta(days=x) for x in range(0, (end - start).days))
for data_ob in date:
    print(data_ob.strftime("%Y-%m-%d"))

ValueError: Cannot add integral value to Timestamp without freq.
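
The error appears to come from `t + 1`: each `t` is a pandas Timestamp, so `t + 1` tries to add an integer to the timestamp rather than select the next row. A minimal sketch of pairing each date with the one after it, using only the standard library and the three sample dates above in place of the Excel column (the helper name `days_from` is made up for illustration):

```python
from datetime import datetime, timedelta


def days_from(start, end):
    """Return each day from start up to, but not including, end."""
    return [start + timedelta(days=x) for x in range((end - start).days)]


# Sample values from the question, parsed as plain datetime objects.
raw = ["1992-12-18", "1992-12-25", "1993-01-08"]
dates = [datetime.strptime(s, "%Y-%m-%d") for s in raw]

# zip(dates, dates[1:]) pairs every row with the row that follows it,
# which is presumably what `t + 1` in the loop was meant to express.
spans = [days_from(start, end) for start, end in zip(dates, dates[1:])]

for span in spans:
    print([d.strftime("%Y-%m-%d") for d in span])
```

With a real Excel column, the same pairing should work on `list(data[0])`, since pandas Timestamps support subtraction and `timedelta` addition.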

1 Answer:

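Answer 0 (score: 1)

Here is a solution that uses the datetime and calendar modules to list the missing dates in each month covered by the input, without returning duplicates or any date already present in the input list: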

from datetime import datetime
from calendar import monthrange
from pprint import pprint


def get_missing_dates(dates: list) -> list:
    """Return every date in the months of `dates` that is not in `dates`."""
    known = set(dates)
    out = set()
    for date in dates:
        _date = datetime.strptime(date, '%Y-%m-%d')
        year, month = _date.year, _date.month
        # monthrange() returns (weekday of the 1st, number of days in the month),
        # so only the second value is a valid bound for the day loop.
        _, days_in_month = monthrange(year, month)
        for day in range(1, days_in_month + 1):
            to_add = datetime(year, month, day).strftime('%Y-%m-%d')
            if to_add not in known:
                out.add(to_add)
    return sorted(out)


dates = ['1992-12-18', '1992-12-25']
missing_dates = get_missing_dates(dates)
pprint(missing_dates)

Output:

['1992-12-01',
 '1992-12-02',
 '1992-12-03',
 '1992-12-04',
 '1992-12-05',
 '1992-12-06',
 '1992-12-07',
 '1992-12-08',
 '1992-12-09',
 '1992-12-10',
 '1992-12-11',
 '1992-12-12',
 '1992-12-13',
 '1992-12-14',
 '1992-12-15',
 '1992-12-16',
 '1992-12-17',
 '1992-12-19',
 '1992-12-20',
 '1992-12-21',
 '1992-12-22',
 '1992-12-23',
 '1992-12-24',
 '1992-12-26',
 '1992-12-27',
 '1992-12-28',
 '1992-12-29',
 '1992-12-30',
 '1992-12-31']
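
If the goal is instead to find which weekly entries are absent between the first and last date, a different approach may fit better. The following is only a sketch under the assumption that the dates are meant to fall exactly 7 days apart; the function name `missing_weekly_dates` and the `step_days` parameter are made up for illustration and do not come from the question.

```python
from datetime import date, timedelta


def missing_weekly_dates(present, step_days=7):
    """Return the dates expected every `step_days` days, from the earliest
    to the latest entry of `present`, that do not appear in `present`."""
    have = {date.fromisoformat(s) for s in present}
    if not have:
        return []
    current, last = min(have), max(have)
    missing = []
    while current <= last:
        if current not in have:
            missing.append(current.isoformat())
        # Assumption: the series advances in fixed 7-day steps.
        current += timedelta(days=step_days)
    return missing


dates = ["1992-12-18", "1992-12-25", "1993-01-08"]
print(missing_weekly_dates(dates))  # prints ['1993-01-01']
```

Walking from the earliest date in fixed steps keeps the check independent of month boundaries, so a gap that spans the new year is reported the same way as any other.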