我尝试合并重叠的日期时间范围。我将日期时间范围列表作为列表中的元组:
data = [(datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0)), (datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0)), (datetime.datetime(2016, 1, 10, 22, 0), datetime.datetime(2016, 1, 10, 22, 0)), (datetime.datetime(2016, 1, 10, 23, 0), datetime.datetime(2016, 1, 11, 0, 30)), (datetime.datetime(2016, 1, 11, 2, 30), datetime.datetime(2016, 1, 11, 3, 30)), (datetime.datetime(2016, 1, 11, 13, 0), datetime.datetime(2016, 1, 11, 16, 0)), (datetime.datetime(2016, 1, 11, 14, 0), datetime.datetime(2016, 1, 11, 14, 0)), (datetime.datetime(2016, 1, 11, 20, 30), datetime.datetime(2016, 1, 11, 21, 30)), (datetime.datetime(2016, 1, 11, 22, 0), datetime.datetime(2016, 1, 11, 22, 0)), (datetime.datetime(2016, 1, 12, 2, 30), datetime.datetime(2016, 1, 12, 3, 30)), (datetime.datetime(2016, 1, 12, 13, 0), datetime.datetime(2016, 1, 12, 16, 0)), (datetime.datetime(2016, 1, 12, 14, 0), datetime.datetime(2016, 1, 12, 14, 0)), (datetime.datetime(2016, 1, 12, 19, 30), datetime.datetime(2016, 1, 12, 20, 30)), (datetime.datetime(2016, 1, 12, 22, 0), datetime.datetime(2016, 1, 12, 22, 0)), (datetime.datetime(2016, 1, 13, 2, 30), datetime.datetime(2016, 1, 13, 3, 30)), (datetime.datetime(2016, 1, 13, 13, 0), datetime.datetime(2016, 1, 13, 16, 0)), (datetime.datetime(2016, 1, 13, 14, 0), datetime.datetime(2016, 1, 13, 14, 0)), (datetime.datetime(2016, 1, 13, 20, 0), datetime.datetime(2016, 1, 13, 21, 0)), (datetime.datetime(2016, 1, 13, 21, 30), datetime.datetime(2016, 1, 13, 22, 0)), (datetime.datetime(2016, 1, 13, 22, 0), datetime.datetime(2016, 1, 13, 22, 0)), (datetime.datetime(2016, 1, 14, 2, 30), datetime.datetime(2016, 1, 14, 3, 30)), (datetime.datetime(2016, 1, 14, 13, 0), datetime.datetime(2016, 1, 14, 16, 0)), (datetime.datetime(2016, 1, 14, 14, 0), datetime.datetime(2016, 1, 14, 14, 0)), (datetime.datetime(2016, 1, 14, 22, 0), datetime.datetime(2016, 1, 14, 22, 0)), (datetime.datetime(2016, 1, 14, 22, 0), datetime.datetime(2016, 1, 14, 23, 0)), (datetime.datetime(2016, 1, 15, 2, 30), datetime.datetime(2016, 1, 15, 3, 30)), (datetime.datetime(2016, 1, 15, 13, 0), datetime.datetime(2016, 1, 15, 16, 0)), (datetime.datetime(2016, 1, 15, 14, 0), datetime.datetime(2016, 1, 15, 14, 0)), (datetime.datetime(2016, 1, 15, 20, 30), datetime.datetime(2016, 1, 15, 22, 0)), (datetime.datetime(2016, 1, 15, 22, 0), datetime.datetime(2016, 1, 15, 22, 0)), (datetime.datetime(2016, 1, 16, 2, 30), datetime.datetime(2016, 1, 16, 3, 30)), (datetime.datetime(2016, 1, 16, 13, 0), datetime.datetime(2016, 1, 16, 16, 0)), (datetime.datetime(2016, 1, 17, 2, 30), datetime.datetime(2016, 1, 17, 3, 30))]
这是我目前的代码:
import datetime
def merge_date_ranges(data):
result = []
for t1, t2 in ((data[i], data[i+1]) for i in range(len(data)-1)):
if t1[1] >= t2[0]:
result.append((min(t1[0], t2[0]), max(t1[1], t2[1])))
else:
result.append(t1)
如果T1(第一个日期时间范围)和T2(第二个日期时间范围)不重叠,那么我只需将T1添加到新列表(结果)。如果T1和T2 DO重叠,那么我将合并的元组添加到新列表(结果)。
我的问题是合并后会发生什么。例如:
T1 = (datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0))
T2 = (datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0))
合并T1和T2,并将以下内容添加到我的新列表中:
(datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0))
所以现在我希望我的代码(在for循环的下一次迭代中)将合并的元组(新T1)与列表中的下一个日期时间范围进行比较:
T1 = (datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0))
T2 = (datetime.datetime(2016, 1, 10, 22, 0), datetime.datetime(2016, 1, 10, 22, 0))
但相反,这就是T1和T2的样子:
T1 = (datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0))
T2 = (datetime.datetime(2016, 1, 10, 22, 0), datetime.datetime(2016, 1, 10, 22, 0))
T1被添加到我的新列表中(我不想要),因为它之前已经合并过。
但我无法理解如何做到这一点。如果我能够通过用合并的元组替换T2并删除T1来更新我的原始列表会更容易。但据我所知,这是不可能的,甚至是一个好的做法。
经过一周的脱发后,我在这里发布了第一个问题,希望有人可以帮助我恢复理智。 :)
更新 基本上我想最终得到一个没有日期时间范围重叠的新列表。
答案 0 :(得分:1)
我想这就是你想要的。 给它一个测试和评论:
def merge_date_ranges(data):
result = []
t_old = data[0]
for t in data[1:]:
if t_old[1] >= t[0]: #I assume that the data is sorted already
t_old = ((min(t_old[0], t[0]), max(t_old[1], t[1])))
else:
result.append(t_old)
t_old = t
else:
result.append(t_old)
return result
我假设日期已经排序。
顺便说一句,我看到唯一奇怪的日期是单日,也许你应该修改你的输入数据。salida = merge_date_ranges(data)
for item in [t for t in data if t not in salida]:
print item
(datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0))
(datetime.datetime(2016, 1, 11, 14, 0), datetime.datetime(2016, 1, 11, 14, 0))
(datetime.datetime(2016, 1, 12, 14, 0), datetime.datetime(2016, 1, 12, 14, 0))
(datetime.datetime(2016, 1, 13, 14, 0), datetime.datetime(2016, 1, 13, 14, 0))
(datetime.datetime(2016, 1, 13, 22, 0), datetime.datetime(2016, 1, 13, 22, 0))
(datetime.datetime(2016, 1, 14, 14, 0), datetime.datetime(2016, 1, 14, 14, 0))
(datetime.datetime(2016, 1, 14, 22, 0), datetime.datetime(2016, 1, 14, 22, 0))
(datetime.datetime(2016, 1, 15, 14, 0), datetime.datetime(2016, 1, 15, 14, 0))
(datetime.datetime(2016, 1, 15, 22, 0), datetime.datetime(2016, 1, 15, 22, 0))
答案 1 :(得分:0)
编辑:给定新信息,您仍然应该进行循环,但将其基于布尔值并检查重叠。
\]\s+(?:\d[.\d]*,\d+\s+)*(\d[.\d]*,\d+)