Python: How to find the first day of every month between two dates

Asked: 2018-07-28 21:42:40

Tags: python datetime

I wrote some code that builds a list of the first day of every month between two dates. Can you think of a better way to do this?


4 answers:

Answer 0 (score: 1)

The calendar module provides the monthrange function, which, combined with timedelta objects, makes this easy and efficient.

import datetime
import calendar

end_date = datetime.date(2018, 3, 28)
start_date = datetime.date(2017, 10, 25)

# include start_date if it is the first
firsts = [start_date] if start_date.day == 1 else []

# normalize start and end date to be the first of the month
start_date = start_date.replace(day=1)
end_date = end_date.replace(day=1)

# stop at the first of end_date's month; <= would append one month too many
while start_date < end_date:
    # add the number of days in the month for this month/year
    start_date += datetime.timedelta(calendar.monthrange(start_date.year, start_date.month)[1])
    firsts.append(start_date)

This gives you the following list of first days:

[datetime.date(2017, 11, 1), datetime.date(2017, 12, 1), datetime.date(2018, 1, 1), datetime.date(2018, 2, 1), datetime.date(2018, 3, 1)]
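
To make the step in the loop above concrete, here is a minimal sketch of what calendar.monthrange() returns and why adding its second element to the first of a month lands on the first of the next month. The variable names are only illustrative.

```python
import calendar
import datetime

# monthrange() returns a pair: (weekday of the 1st, number of days in the month).
# Weekdays are numbered from 0 (Monday).
first_weekday, days_in_month = calendar.monthrange(2018, 2)

# Adding the month's length to its first day gives the first day of the next month.
first_of_feb = datetime.date(2018, 2, 1)
first_of_mar = first_of_feb + datetime.timedelta(days=days_in_month)

print(first_weekday, days_in_month)  # 3 28 (1 February 2018 was a Thursday)
print(first_of_mar)                  # 2018-03-01
```

Only the second element of the pair is needed here; the weekday is computed and discarded on every iteration.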

Answer 1 (score: 0)

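Month arithmetic is much easier to reason about if you convert the year and month to a single number with years * 12 + (month - 1); floor division and a modulo operation convert that number back to a year and month pair. For example, 2017-10 (October) is 24213 months counted from year zero: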

>>> 2017 * 12 + (10 - 1)
24213

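You can simply add or subtract any number of months from that figure. Floor division by 12 gives you the year again, and the % modulus plus 1 gives you the month: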

>>> 24213 // 12  # year
2017
>>> (24213 % 12) + 1  # month
10
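
The round trip described above can be wrapped in two small helpers. This is only a sketch; the names to_month_ordinal and from_month_ordinal are made up for illustration and are not part of any library.

```python
def to_month_ordinal(year, month):
    """Collapse a (year, month) pair into a single month count (hypothetical helper)."""
    return year * 12 + (month - 1)


def from_month_ordinal(ordinal):
    """Expand a month count back into a (year, month) pair (hypothetical helper)."""
    year, month_index = divmod(ordinal, 12)  # floor division and modulo in one call
    return year, month_index + 1


# Shifting by months is plain integer arithmetic, including across year boundaries.
print(from_month_ordinal(to_month_ordinal(2017, 10) + 5))  # (2018, 3)
print(from_month_ordinal(to_month_ordinal(2018, 1) - 1))   # (2017, 12)
```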

With that in mind, you can use range() to generate any number of months:

from datetime import date

def months(start_date, end_date, day=1):
    """Produce a date for every month from start until end (end_date itself is excluded)"""
    start = start_date.year * 12 + (start_date.month - 1)
    if start_date.day > day:
        # already in this month, so start counting at the next
        start += 1
    end = end_date.year * 12 + (end_date.month - 1)
    if end_date.day > day:
        # end date is past the reference day, include the reference
        # date in the output
        end += 1
    # generate the months, just a range from start to end
    for ordinal in range(start, end):
        yield date(ordinal // 12, (ordinal % 12) + 1, day)

The above is a generator function that yields the months one at a time; call list() on it if you need the complete sequence:

>>> start_date = date(2017, 10, 25)
>>> end_date = date(2018, 3, 28)
>>> list(months(start_date, end_date))
[datetime.date(2017, 11, 1), datetime.date(2017, 12, 1), datetime.date(2018, 1, 1), datetime.date(2018, 2, 1), datetime.date(2018, 3, 1)]

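Note that you don't need to convert the dates to strings at all! You can read the month value straight from the instance with the .month attribute.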
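
One detail of months() worth spelling out: the end of the range is exclusive, so a reference day that falls exactly on end_date is not produced. If you want it included, a variant along these lines might work. This is a sketch; months_inclusive is a hypothetical name, and the only change from months() is >= in place of > when adjusting the end.

```python
from datetime import date


def months_inclusive(start_date, end_date, day=1):
    """Hypothetical variant of months() that also yields a date equal to end_date."""
    start = start_date.year * 12 + (start_date.month - 1)
    if start_date.day > day:
        # the reference day of the start month is already in the past
        start += 1
    end = end_date.year * 12 + (end_date.month - 1)
    if end_date.day >= day:
        # >= instead of >: the reference day of the end month is within range
        end += 1
    for ordinal in range(start, end):
        yield date(ordinal // 12, (ordinal % 12) + 1, day)


# 1 January 2018 is now part of the output.
print(list(months_inclusive(date(2017, 10, 1), date(2018, 1, 1))))
```

For the example dates used earlier (2017-10-25 to 2018-03-28) both versions produce the same five dates.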

For comparison, I also converted the other solutions into generators:

from calendar import monthrange
from datetime import timedelta
from dateutil import rrule

def andray_timedelta_one(start_date, end_date):
    delta = end_date - start_date
    for i in range(delta.days + 1):
        d = start_date + timedelta(i)
        if d.day == 1:
            yield d

def matthew_timedelta_monthrange(start_date, end_date):
    if start_date.day == 1:
        yield start_date

    # normalize both dates to the first of their month
    start_date = start_date.replace(day=1)
    end_date = end_date.replace(day=1)

    # stop at the first of end_date's month; <= would yield one month too
    # many (and overflow when end_date is close to date.max)
    while start_date < end_date:
        # add the number of days in the month for this month/year
        start_date += timedelta(monthrange(start_date.year, start_date.month)[1])
        yield start_date

def sunitha_rrule(start_date, end_date):
    # already an iterable
    return rrule.rrule(rrule.MONTHLY, bymonthday=1, dtstart=start_date, until=end_date)

# for completion's sake, I renamed mine to martijn_months

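This makes it possible to compare their performance fairly, and we can use the deque(..., maxlen=0) trick to consume their output quickly without using a lot of memory. We can then run each function over the range from date.min to date.max (the largest possible date range), which produces nearly 120 thousand date objects: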

>>> sum(1 for _ in months(date.min, date.max))
119988
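
For readers who have not seen the deque(..., maxlen=0) trick: a deque with a maximum length of zero pulls every item from the iterable but stores none of them. A small sketch with a made-up generator (noisy and seen are illustrative names) shows the effect:

```python
from collections import deque

seen = []


def noisy(n):
    """Hypothetical generator that records every value it produces."""
    for i in range(n):
        seen.append(i)
        yield i


# maxlen=0 means each item is discarded as soon as it is taken from the generator.
sink = deque(noisy(5), maxlen=0)

print(len(sink))  # 0
print(seen)       # [0, 1, 2, 3, 4]
```

The generator runs to completion, so the timing covers all of its work, while memory use stays constant.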

These are the results:

>>> from timeit import Timer
>>> from collections import deque
>>> bootstrap = 'from __main__ import date, deque, {} as test'
>>> test = 'deque(test(date.min, date.max), maxlen=0)'
>>> for f in (
...         andray_timedelta_one,
...         sunitha_rrule,
...         matthew_timedelta_monthrange,
...         martijn_months):
...     loop_count, total_time = Timer(test, bootstrap.format(f.__name__)).autorange()
...     print(f'{f.__name__:<30}: {total_time/loop_count*1000:.5f}ms')
...
andray_timedelta_one          : 2001.27048ms
sunitha_rrule                 : 1517.70081ms
matthew_timedelta_monthrange  : 154.68727ms
martijn_months                : 38.86803ms

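As you can see, my approach is considerably faster: about 4 times faster than Matthew's, and roughly 40 to 50 times faster than the other two.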

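  • Andray's approach wastes a lot of time creating every single date in the calendar, adding one day at a time.
  • The rrule approach that Sunitha uses is clean and concise, but that function has to handle far more complex date arithmetic and is not optimised for this simple case, which makes rrule() comparatively slow here.
  • Matthew's approach is much more efficient, but calendar.monthrange() still does far too much work for what is a simple add-one operation on a year and month combination. We don't need to know whether the current month has 31, 30, 29 or 28 days to do this calculation!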

Answer 2 (score: 0)

The rrule submodule of the dateutil module makes it easier to work with recurring dates and times. You can install it with pip install python-dateutil:

>>> from dateutil import rrule, parser
>>> start = parser.parse('Jan 10 2017')
>>> end   = parser.parse('Mar 5 2018')
>>> list(rrule.rrule(rrule.MONTHLY, bymonthday=1, dtstart=start, until=end))
[datetime.datetime(2017, 2, 1, 0, 0), datetime.datetime(2017, 3, 1, 0, 0), datetime.datetime(2017, 4, 1, 0, 0), datetime.datetime(2017, 5, 1, 0, 0), datetime.datetime(2017, 6, 1, 0, 0), datetime.datetime(2017, 7, 1, 0, 0), datetime.datetime(2017, 8, 1, 0, 0), datetime.datetime(2017, 9, 1, 0, 0), datetime.datetime(2017, 10, 1, 0, 0), datetime.datetime(2017, 11, 1, 0, 0), datetime.datetime(2017, 12, 1, 0, 0), datetime.datetime(2018, 1, 1, 0, 0), datetime.datetime(2018, 2, 1, 0, 0), datetime.datetime(2018, 3, 1, 0, 0)]

Answer 3 (score: -1)

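The datetime classes support arithmetic (you can use +, - and so on). Combined with timedelta, that lets you step through every day between start_date and end_date. Finding the first day of each month is then easy: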

import datetime
start_date= datetime.datetime.strptime('2017-10-25', "%Y-%m-%d").date()
end_date= datetime.datetime.strptime('2018-03-28', "%Y-%m-%d").date()

delta = end_date - start_date

first_days_of_month = []
for i in range(delta.days + 1):
    d = start_date + datetime.timedelta(i)
    if d.day == 1:
        first_days_of_month.append(d)

print('start date =', start_date)
print('end date =', end_date)
for d in first_days_of_month:
    print(d, end=' ')
print()

Prints:

start date = 2017-10-25
end date = 2018-03-28
2017-11-01 2017-12-01 2018-01-01 2018-02-01 2018-03-01
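
To see how much work the day-by-day loop above does, this short sketch counts the dates it has to inspect for the example range. It uses only the standard library; the variable names are illustrative.

```python
import datetime

start_date = datetime.date(2017, 10, 25)
end_date = datetime.date(2018, 3, 28)

# Subtracting two dates gives a timedelta; .days is the whole-day difference.
delta = end_date - start_date

# The loop visits delta.days + 1 dates (both endpoints included).
visited = [start_date + datetime.timedelta(days=i) for i in range(delta.days + 1)]
firsts = [d for d in visited if d.day == 1]

print(delta.days)    # 154
print(len(visited))  # 155
print(len(firsts))   # 5
```

So 155 date objects are created to find 5 results, which is fine for short ranges but grows linearly with the number of days covered.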