I have a list of dates, for example:
['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08']
How can I find the ranges of consecutive dates contained in this list? For the example above, the ranges should be:
[{"start_date": '2011-02-27', "end_date": '2011-03-01'},
{"start_date": '2011-04-12', "end_date": '2011-04-13'},
{"start_date": '2011-06-08', "end_date": '2011-06-08'}
]
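Thanks.
Answer 0 (score: 7)
This worked, but I wasn't happy with it and planned to edit the answer with a cleaner solution. That's done now; here is a clean, working solution: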
import datetime
import pprint

def parse(date):
    # Convert a 'YYYY-MM-DD' string into a datetime.date
    return datetime.date(*[int(i) for i in date.split('-')])

def get_ranges(dates):
    while dates:
        end = 1
        try:
            # Advance while each date is exactly one day after the previous one
            while dates[end] - dates[end - 1] == datetime.timedelta(days=1):
                end += 1
        except IndexError:
            # Ran off the end of the list: the current range is the last one
            pass

        yield {
            'start-date': dates[0],
            'end-date': dates[end-1]
        }
        dates = dates[end:]

dates = [
    '2011-02-27', '2011-02-28', '2011-03-01',
    '2011-04-12', '2011-04-13',
    '2011-06-08'
]

# Parse each date and convert it to a date object. Also ensure the dates
# are sorted, you can remove 'sorted' if you don't need it
dates = sorted([parse(d) for d in dates])

pprint.pprint(list(get_ranges(dates)))
And the corresponding output:
[{'end-date': datetime.date(2011, 3, 1),
'start-date': datetime.date(2011, 2, 27)},
{'end-date': datetime.date(2011, 4, 13),
'start-date': datetime.date(2011, 4, 12)},
{'end-date': datetime.date(2011, 6, 8),
'start-date': datetime.date(2011, 6, 8)}]
Answer 1 (score: 0)
Trying to ninja GaretJax's edit ;)
import datetime

def date_to_number(date):
    return datetime.date(*[int(i) for i in date.split('-')]).toordinal()

def number_to_date(number):
    return datetime.date.fromordinal(number).strftime('%Y-%m-%d')

def day_ranges(dates):
    day_numbers = set(date_to_number(d) for d in dates)
    start = None
    # We loop including one element guaranteed not to be in the set, to force the
    # closing of any range that's currently open.
    for n in range(min(day_numbers), max(day_numbers) + 2):
        if start is None:
            if n in day_numbers:
                start = n
        else:
            if n not in day_numbers:
                yield {
                    'start_date': number_to_date(start),
                    'end_date': number_to_date(n - 1)
                }
                start = None

list(
    day_ranges([
        '2011-02-27', '2011-02-28', '2011-03-01',
        '2011-04-12', '2011-04-13', '2011-06-08'
    ])
)
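The ordinal idea above can also be sketched with itertools.groupby, which avoids stepping through every day between the earliest and latest date. This is a swapped-in technique, not the answer's own method: within a run of consecutive days, the ordinal minus its position in the sorted list stays constant. The name day_ranges_grouped is hypothetical.

```python
import datetime
from itertools import groupby

def day_ranges_grouped(dates):
    # Hypothetical helper, not part of the answer above.
    # Deduplicate and sort the day numbers (proleptic Gregorian ordinals).
    ordinals = sorted(
        {datetime.date(*map(int, d.split('-'))).toordinal() for d in dates}
    )
    # Inside a run of consecutive days, (ordinal - index) is constant,
    # so groupby splits the list exactly at the gaps.
    for _, group in groupby(enumerate(ordinals), key=lambda pair: pair[1] - pair[0]):
        run = [n for _, n in group]
        yield {
            'start_date': datetime.date.fromordinal(run[0]).isoformat(),
            'end_date': datetime.date.fromordinal(run[-1]).isoformat(),
        }

ranges = list(day_ranges_grouped([
    '2011-02-27', '2011-02-28', '2011-03-01',
    '2011-04-12', '2011-04-13', '2011-06-08'
]))
```

Unlike the loop above, this sketch also accepts an empty list, because it never calls min() or max().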
Answer 2 (score: 0)
import pprint
from datetime import datetime, timedelta
dates = ['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08']
d = [datetime.strptime(date, '%Y-%m-%d') for date in dates]
test = lambda x: x[1] - x[0] != timedelta(1)
slices = [0] + [i+1 for i, x in enumerate(zip(d, d[1:])) if test(x)] + [len(dates)]
ranges = [{"start_date": dates[s], "end_date": dates[e-1]} for s, e in zip(slices, slices[1:])]
The result:
>>> pprint.pprint(ranges)
[{'end_date': '2011-03-01', 'start_date': '2011-02-27'},
{'end_date': '2011-04-13', 'start_date': '2011-04-12'},
{'end_date': '2011-06-08', 'start_date': '2011-06-08'}]
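The slices list comprehension collects every index whose date is not exactly one day after the previous date. With 0 prepended and len(dates) appended, each date range runs from dates[slices[i]] through dates[slices[i+1]-1], that is, the slice dates[slices[i]:slices[i+1]]. Note that this assumes dates is already sorted.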
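To make the index bookkeeping concrete, here is a small sketch that prints the intermediate slices list for the sample data. The names breaks and runs are introduced only for this illustration.

```python
from datetime import datetime, timedelta

dates = ['2011-02-27', '2011-02-28', '2011-03-01',
         '2011-04-12', '2011-04-13', '2011-06-08']
d = [datetime.strptime(s, '%Y-%m-%d') for s in dates]

# Indices where a new run starts: the date is not one day after its predecessor.
breaks = [i + 1 for i, (prev, cur) in enumerate(zip(d, d[1:]))
          if cur - prev != timedelta(days=1)]
slices = [0] + breaks + [len(dates)]
print(slices)  # [0, 3, 5, 6]

# Each run is the half-open slice dates[s:e]; its last element is dates[e - 1].
runs = [dates[s:e] for s, e in zip(slices, slices[1:])]
print(runs)
```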
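Answer 3 (score: 0)
My slight variation on the theme (I originally built lists of starts and ends and zipped them to return tuples, but I prefer @Karl Knechtel's generator approach):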
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def find_date_windows(dates):
    # guard against getting empty list
    if not dates:
        return

    # convert strings to sorted list of datetime.dates
    dates = sorted(date(*map(int,d.split('-'))) for d in dates)

    # build list of window starts and matching ends
    lastStart = lastEnd = dates[0]
    for d in dates[1:]:
        if d-lastEnd > ONE_DAY:
            yield {'start_date':lastStart, 'end_date':lastEnd}
            lastStart = d
        lastEnd = d
    yield {'start_date':lastStart, 'end_date':lastEnd}
Here are the test cases:
tests = [
    ['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08'],
    ['2011-06-08'],
    [],
    ['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08', '2011-06-10'],
]
for dates in tests:
    print(dates)
    for window in find_date_windows(dates):
        print(window)
    print()
Which prints:
['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08']
{'start_date': datetime.date(2011, 2, 27), 'end_date': datetime.date(2011, 3, 1)}
{'start_date': datetime.date(2011, 4, 12), 'end_date': datetime.date(2011, 4, 13)}
{'start_date': datetime.date(2011, 6, 8), 'end_date': datetime.date(2011, 6, 8)}
['2011-06-08']
{'start_date': datetime.date(2011, 6, 8), 'end_date': datetime.date(2011, 6, 8)}
[]
['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08', '2011-06-10']
{'start_date': datetime.date(2011, 2, 27), 'end_date': datetime.date(2011, 3, 1)}
{'start_date': datetime.date(2011, 4, 12), 'end_date': datetime.date(2011, 4, 13)}
{'start_date': datetime.date(2011, 6, 8), 'end_date': datetime.date(2011, 6, 8)}
{'start_date': datetime.date(2011, 6, 10), 'end_date': datetime.date(2011, 6, 10)}
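Answer 4 (score: 0)
Here's an alternative solution: it returns a list of (start, end) tuples, because that's what I needed ;).
This mutates the list, so I needed to make a copy. Obviously, that increases memory usage. I suspect list.pop(0) isn't super efficient (in CPython it is O(n), since every remaining element shifts down), but that depends on how list is implemented in Python.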
import datetime

def collapse_dates(date_list):
    if not date_list:
        return date_list
    result = []
    # We are going to alter the list, so create a (sorted) copy.
    date_list = sorted(date_list)
    while len(date_list):
        # Grab the first item: this is both the start and end of the range.
        start = current = date_list.pop(0)
        # While the first item in the list is the next day, pop that and
        # set it to the end of the range.
        while len(date_list) and date_list[0] == current + datetime.timedelta(1):
            current = date_list.pop(0)
        # That's a completed range.
        result.append((start,current))
    return result
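You can easily change the append line to append a dict, or to yield instead of appending to a list.
Oh, and mine assumes they are already dates (datetime.date objects rather than strings).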
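If the cost of popping from the front of a list is a concern, one possible variation is to hold the sorted copy in a collections.deque, whose popleft() is O(1). This is only a sketch under the same assumption as above (the inputs are datetime.date objects); collapse_dates_deque and ONE_DAY are names made up for this illustration.

```python
import datetime
from collections import deque

ONE_DAY = datetime.timedelta(days=1)

def collapse_dates_deque(date_list):
    # Hypothetical variant; expects datetime.date objects like the original.
    # The deque is built from a sorted copy, so the caller's list is untouched.
    pending = deque(sorted(date_list))
    result = []
    while pending:
        # The first pending item is both the start and (so far) the end of a range.
        start = current = pending.popleft()
        # Extend the range while the next pending item is the following day.
        while pending and pending[0] == current + ONE_DAY:
            current = pending.popleft()
        result.append((start, current))
    return result

sample = [datetime.date(2011, 2, 28), datetime.date(2011, 2, 27),
          datetime.date(2011, 3, 1), datetime.date(2011, 4, 12)]
collapsed = collapse_dates_deque(sample)
```

For short lists the difference is unlikely to matter; the deque only pays off when the input is large.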