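I have a series of timestamps, each with a start time and an end time. I want to compute the number of minutes that fall within each hour between the two timestamps: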
import pandas as pd
start_time = pd.to_datetime('2013-03-26 21:49:08',infer_datetime_format=True)
end_time = pd.to_datetime('2013-03-27 05:21:00', infer_datetime_format=True)
pd.date_range(start_time, end_time, freq='h')
This gives:
DatetimeIndex(['2013-03-26 21:49:08', '2013-03-26 22:49:08',
'2013-03-26 23:49:08', '2013-03-27 00:49:08',
'2013-03-27 01:49:08', '2013-03-27 02:49:08',
'2013-03-27 03:49:08', '2013-03-27 04:49:08'],
dtype='datetime64[ns]', freq='H')
Sample result: I want to compute how many minutes of each hour lie between the start time and the end time, like this:
2013-03-26 21:00:00 - 10m 52secs
2013-03-26 22:00:00 - 60m
2013-03-26 23:00:00 - 60m
2013-03-27 05:00:00 - 21m
I have looked at pandas resample, but I am not sure how to achieve this with it. Any pointers are appreciated.
Answer 0 (score: 2)
Construct two Series holding the start and the end of each hour. Use clip_lower and clip_upper to restrict them to the desired time range, then subtract:
# hourly range, floored to the nearest hour
rng = pd.date_range(start_time.floor('h'), end_time.floor('h'), freq='h')
# get the left and right endpoints for each hour
# clipped to be inclusive of [start_time, end_time]
left = pd.Series(rng, index=rng).clip_lower(start_time)
right = pd.Series(rng + 1, index=rng).clip_upper(end_time)
# construct a series of the lengths
s = right - left
Resulting output:
2013-03-26 21:00:00 00:10:52
2013-03-26 22:00:00 01:00:00
2013-03-26 23:00:00 01:00:00
2013-03-27 00:00:00 01:00:00
2013-03-27 01:00:00 01:00:00
2013-03-27 02:00:00 01:00:00
2013-03-27 03:00:00 01:00:00
2013-03-27 04:00:00 01:00:00
2013-03-27 05:00:00 00:21:00
Freq: H, dtype: timedelta64[ns]
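On recent pandas releases the snippet above may not run as written: clip_lower, clip_upper and adding a plain integer to a DatetimeIndex were removed in pandas 1.0. The following is a minimal sketch of the same idea, assuming pandas 1.0 or later; using clip(lower=...), clip(upper=...) and an explicit pd.Timedelta is a substitution made here, not part of the original answer, and the final conversion to minutes is an added convenience.

```python
import pandas as pd

start_time = pd.to_datetime('2013-03-26 21:49:08')
end_time = pd.to_datetime('2013-03-27 05:21:00')

# hourly bins, each floored to the start of its hour
rng = pd.date_range(start_time.floor('h'), end_time.floor('h'), freq='h')

# left edge of each bin, but never earlier than start_time
left = pd.Series(rng, index=rng).clip(lower=start_time)

# right edge of each bin (one hour later), but never later than end_time
right = pd.Series(rng + pd.Timedelta(hours=1), index=rng).clip(upper=end_time)

# length of the overlap between each hour and [start_time, end_time]
s = right - left

# the same lengths expressed as fractional minutes
minutes = s.dt.total_seconds() / 60

print(s)
print(minutes)
```

Here s is a timedelta Series indexed by the start of each hour, and minutes holds the same values as floats, which is often easier to aggregate or plot.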
Answer 1 (score: 0)
Using datetime.timedelta() inside some kind of for loop seems to be what you are looking for.
https://docs.python.org/2/library/datetime.html#datetime.timedelta
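The answer does not spell out the loop, so here is one way it might look, using only the standard library. This is a sketch under assumptions: the helper name minutes_per_hour is hypothetical, and the choice to stop before an hour that starts exactly at the end time (so no zero-length entry is produced) is a decision made here, not something the answer states.

```python
from datetime import datetime, timedelta

def minutes_per_hour(start, end):
    """Return a list of (hour_start, overlap) pairs, one per clock hour
    that overlaps the interval [start, end]. Hypothetical helper."""
    one_hour = timedelta(hours=1)
    # begin at the top of the hour that contains `start`
    hour = start.replace(minute=0, second=0, microsecond=0)
    result = []
    while hour < end:
        # clamp the hour's bounds to the requested interval, then subtract
        overlap = min(hour + one_hour, end) - max(hour, start)
        result.append((hour, overlap))
        hour += one_hour
    return result

result = minutes_per_hour(datetime(2013, 3, 26, 21, 49, 8),
                          datetime(2013, 3, 27, 5, 21, 0))
for hour, overlap in result:
    print(hour, overlap)
```

Each overlap is a timedelta; calling overlap.total_seconds() / 60 converts it to minutes if a numeric value is preferred.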
Answer 2 (score: 0)
This looks like it could be a workable solution:
import pandas as pd
import datetime as dt

def bounded_min(t, range_time):
    """For a given timestamp t and considered time interval range_time,
    return the desired bounded value in minutes and seconds."""
    # min() takes care of the end of the time interval,
    # max() takes care of the beginning of the interval
    s = (min(t + dt.timedelta(hours=1), range_time.max()) -
         max(t, range_time.min())).total_seconds()
    if s % 60:
        return "%dm %dsecs" % (s / 60, s % 60)
    else:
        return "%dm" % (s / 60)

start_time = pd.to_datetime('2013-03-26 21:49:08', infer_datetime_format=True)
end_time = pd.to_datetime('2013-03-27 05:21:00', infer_datetime_format=True)
range_time = pd.date_range(start_time, end_time, freq='h')
# Include the end of the time range using the union() trick, as described at:
# https://stackoverflow.com/questions/37890391/how-to-include-end-date-in-pandas-date-range-method
range_time = range_time.union([end_time])
# These are essentially the timestamps for the beginnings of the hours
index_time = pd.Series(range_time).apply(lambda x: dt.datetime(year=x.year,
                                                               month=x.month,
                                                               day=x.day,
                                                               hour=x.hour,
                                                               minute=0,
                                                               second=0))
bounded_mins = index_time.apply(lambda x: bounded_min(x, range_time))
# Put timestamps and values together
bounded_df = pd.DataFrame(bounded_mins, columns=["Bounded Mins"]).set_index(index_time)
print(bounded_df)
Gotta love the mighty lambdas :). There may well be a simpler way to do this.
Output:
Bounded Mins
2013-03-26 21:00:00 10m 52secs
2013-03-26 22:00:00 60m
2013-03-26 23:00:00 60m
2013-03-27 00:00:00 60m
2013-03-27 01:00:00 60m
2013-03-27 02:00:00 60m
2013-03-27 03:00:00 60m
2013-03-27 04:00:00 60m
2013-03-27 05:00:00 21m