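Let's say "sequential" datetimes are datetimes within a specific interval of each other (say, 30 minutes); non-sequential datetimes are separated by a longer period than that.
Given an input consisting of a list of datetimes (as strings), I want to get a list of lists of sequential datetimes. My solution is below, but I'm wondering whether there is a better way: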
list_of_datetime_strings = [
    '2016-02-26 10:30:00', '2016-02-26 11:00:00',
    '2016-02-25 11:30:00', '2016-02-25 12:00:00', '2016-02-25 12:30:00',
    '2016-02-26 12:30:00',
]
from datetime import datetime

def find_datetime_sequences(list_of_datetime_strings, increment_in_minutes=30):
    if not list_of_datetime_strings:
        return
    str_to_datetime = lambda cur_datetime: datetime.strptime(cur_datetime, "%Y-%m-%d %H:%M:%S")
    list__datetimes_sorted = sorted([str_to_datetime(cur_datetime) for cur_datetime in list_of_datetime_strings])
    # Start the first group with the earliest datetime.
    list_of_datetime_lists = [[list__datetimes_sorted[0]]]
    for cur_datetime in list__datetimes_sorted[1:]:
        # total_seconds() counts whole days too; .seconds alone would drop them.
        time_difference = (cur_datetime - list_of_datetime_lists[-1][-1]).total_seconds() / 60
        if time_difference == increment_in_minutes:
            list_of_datetime_lists[-1].append(cur_datetime)
        else:
            list_of_datetime_lists.append([cur_datetime])
    return list_of_datetime_lists

find_datetime_sequences(list_of_datetime_strings)
Output:
list_of_datetime_lists: [[datetime.datetime(2016, 2, 25, 11, 30),
datetime.datetime(2016, 2, 25, 12, 0), datetime.datetime(2016, 2, 25, 12, 30)],
[datetime.datetime(2016, 2, 26, 10, 30), datetime.datetime(2016, 2, 26, 11, 0)],
[datetime.datetime(2016, 2, 26, 12, 30)]]
Is there a better way to achieve the above?
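A detail to watch when measuring the gap between two datetimes: the seconds attribute of a timedelta holds only the seconds component (0 to 86399) and leaves out whole days, whereas total_seconds() covers the entire duration. The sketch below uses two made-up datetimes (not taken from the input above) that are one day and 30 minutes apart to show the difference.

```python
from datetime import datetime

# Hypothetical values chosen for illustration: one day and 30 minutes apart.
a = datetime(2016, 2, 25, 12, 30)
b = datetime(2016, 2, 26, 13, 0)

diff = b - a
minutes_from_seconds = diff.seconds / 60        # drops the whole day
minutes_from_total = diff.total_seconds() / 60  # includes the whole day

print(minutes_from_seconds)  # 30.0
print(minutes_from_total)    # 1470.0
```

With a check based on seconds alone, these two datetimes would look exactly 30 minutes apart and would land in the same group.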
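Answer 0 (score: 2)
I don't have a better way to make datetime objects from the strings or to sort them. But I think the rest can be improved (in readability, if nothing else) by using a generator instead of a regular function.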
import datetime

def sequencify(sorted_datetimes, increment_in_minutes=30):
    """Take a sorted list of datetime objects. Yield sequences as lists."""
    if not sorted_datetimes:
        return
    first, *rest = sorted_datetimes
    # python 2: first, rest = sorted_datetimes[0], sorted_datetimes[1:]
    sequence = [first]
    delta = datetime.timedelta(minutes=increment_in_minutes)
    while rest:
        first, *rest = rest
        if first - sequence[-1] > delta:
            # Gap is too large: emit the finished sequence and start a new one.
            yield sequence
            sequence = [first]
        else:
            sequence.append(first)
    yield sequence
An alternative version using an index-based approach, similar to what @SimeonVisser did:
def sequencify(sorted_datetimes, increment_in_minutes=30):
    """Take a sorted list of datetime objects. Yield sequences as lists."""
    delta = datetime.timedelta(minutes=increment_in_minutes)
    start = 0
    for i in range(start, len(sorted_datetimes) - 1):
        if sorted_datetimes[i+1] - sorted_datetimes[i] > delta:
            yield sorted_datetimes[start:i+1]
            start = i + 1
    if sorted_datetimes:
        yield sorted_datetimes[start:]
Either way, the caller needs only a minimal change: just add list():
strings = [
    '2016-02-26 10:30:00',
    '2016-02-26 11:00:00',
    '2016-02-25 11:30:00',
    '2016-02-25 12:00:00',
    '2016-02-25 12:30:00',
    '2016-02-26 12:30:00'
]
sorted_datetimes = sorted(datetime.datetime.strptime(s, '%Y-%m-%d %H:%M:%S')
                          for s in strings)
print(list(sequencify(sorted_datetimes)))  # explicit conversion to list
Output:
[[datetime.datetime(2016, 2, 25, 11, 30),
datetime.datetime(2016, 2, 25, 12, 0),
datetime.datetime(2016, 2, 25, 12, 30)],
[datetime.datetime(2016, 2, 26, 10, 30),
datetime.datetime(2016, 2, 26, 11, 0)],
[datetime.datetime(2016, 2, 26, 12, 30)]]
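Converting to a list is only needed when the caller wants every group at once. As a rough sketch of the other option, a generator can also be consumed lazily, for example with next() or by passing it straight to max(). The block repeats the index-based generator so that it runs on its own; the three sample datetimes are made up for illustration.

```python
import datetime

def sequencify(sorted_datetimes, increment_in_minutes=30):
    """Take a sorted list of datetime objects. Yield sequences as lists."""
    delta = datetime.timedelta(minutes=increment_in_minutes)
    start = 0
    for i in range(start, len(sorted_datetimes) - 1):
        if sorted_datetimes[i+1] - sorted_datetimes[i] > delta:
            yield sorted_datetimes[start:i+1]
            start = i + 1
    if sorted_datetimes:
        yield sorted_datetimes[start:]

# Hypothetical sample data: two datetimes 30 minutes apart, then a 2.5 hour gap.
base = datetime.datetime(2016, 2, 25, 11, 30)
sorted_datetimes = [
    base,
    base + datetime.timedelta(minutes=30),
    base + datetime.timedelta(hours=3),
]

# Pull only the first group; the rest of the input is not scanned yet.
first_group = next(sequencify(sorted_datetimes))

# Feed the generator directly to a consumer without building a list first.
longest_group = max(sequencify(sorted_datetimes), key=len)

print(len(first_group))    # 2
print(len(longest_group))  # 2
```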
Answer 1 (score: 1)
The following is basically the same approach, but it may be easier to maintain. In the code below, the elif branch detects when the difference between two consecutive dates is not 30 minutes, which is where we need to cut:
import datetime

strings = [
    '2016-02-26 10:30:00',
    '2016-02-26 11:00:00',
    '2016-02-25 11:30:00',
    '2016-02-25 12:00:00',
    '2016-02-25 12:30:00',
    '2016-02-26 12:30:00',
]

def find_datetime_sequences(strings, increment_in_minutes=30):
    if not strings:
        return
    dates = sorted([
        datetime.datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
        for s in strings
    ])
    delta = datetime.timedelta(minutes=increment_in_minutes)
    start = 0
    n_items = len(dates)
    cuts = []
    for index in range(n_items):
        next_index = index + 1
        if next_index == n_items and start != next_index:
            cuts.append((start, next_index))
        elif dates[next_index] - dates[index] != delta:
            cuts.append((start, next_index))
            start = next_index
    return [dates[i:j] for i, j in cuts]
The if branch makes sure that a datetime at the end that needs to go into a group of its own still gets one:
if next_index == n_items and start != next_index:
    cuts.append((start, next_index))
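One thing that may be worth checking against your data: this version cuts whenever the gap is not exactly 30 minutes (!= delta), while a version that cuts only when the gap exceeds 30 minutes (> delta) keeps closer datetimes together. The sketch below shows the difference with a small helper, group_runs, which is a hypothetical name introduced only for this illustration and is not part of either answer; the sample datetimes are made up as well.

```python
import datetime

def group_runs(sorted_datetimes, is_break):
    """Split a sorted list wherever is_break(gap) is true for two neighbours."""
    groups = []
    for current in sorted_datetimes:
        if groups and not is_break(current - groups[-1][-1]):
            groups[-1].append(current)
        else:
            groups.append([current])
    return groups

delta = datetime.timedelta(minutes=30)
start = datetime.datetime(2016, 2, 26, 10, 0)
# Hypothetical input with gaps of 10 minutes and then 30 minutes.
dates = [
    start,
    start + datetime.timedelta(minutes=10),
    start + datetime.timedelta(minutes=40),
]

within = group_runs(dates, lambda gap: gap > delta)   # "at most 30 minutes apart"
exact = group_runs(dates, lambda gap: gap != delta)   # "exactly 30 minutes apart"

print(len(within))  # 1
print(len(exact))   # 2
```

Which rule is right depends on whether "sequential" means exactly one increment apart or merely no further apart than one increment.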