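I have two datetime objects in Python, d1 and d2, and I want the time difference between them. I need something slightly more complicated than (d1 - d2): night hours should count for less than daytime hours by a constant fraction c, e.g. one hour at night counts as only half an hour of daytime.
Is there a simple way to do this in Python (pandas and/or numpy)?
Thanks!
Edit: night time runs from 9 p.m. to 7 a.m. Ideally, though, I am looking for a solution that lets you assign arbitrary weights to arbitrary periods of the day.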
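Answer 0 (score: 3)
Here is a solution.
It does two things. First, it counts the number of full days between the two dates; since we know (well, we can approximate) that every day has 24 hours, weighting the "day time" and the "night time" of those days is trivial (the calculation is done in hours). That leaves only the remaining interval of less than 24 hours. The trick here is to "fold" the time so that "dawn" falls at 0 instead of in the middle of the day. Then there is only one separator, "dusk", and only three cases: both times are in the daytime, both are at night, or the later time is at night and the earlier one is in the daytime.
Updated according to the comments.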
On my laptop, 1 million calls of the function take 4.588s.
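The code of this answer is not reproduced here, so the following is only a minimal sketch of the "folding" idea it describes, not the author's implementation. It assumes night runs from 21:00 to 07:00 with weight 0.5, and the names `folded_diff`, `weighted_since_dawn` and `seconds_of_day` are made up for illustration. Instead of spelling out the three cases, it uses a cumulative "weighted seconds since dawn" function, which covers them implicitly.

```python
from datetime import datetime, timedelta

DAWN_H, DUSK_H = 7, 21                    # assumed: night is 21:00 -> 07:00
NIGHT_WEIGHT = 0.5                        # assumed: a night hour counts half
DAY_LEN = (DUSK_H - DAWN_H) * 3600        # seconds of daytime per day
NIGHT_LEN = 24 * 3600 - DAY_LEN           # seconds of night per day
FULL_DAY = DAY_LEN + NIGHT_WEIGHT * NIGHT_LEN  # weighted seconds in 24 hours


def seconds_of_day(t):
    """Seconds elapsed since midnight for a datetime."""
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6


def weighted_since_dawn(folded):
    """Weighted seconds between dawn (0) and `folded` seconds after dawn."""
    day = min(folded, DAY_LEN)            # part before dusk, weight 1
    night = max(folded - DAY_LEN, 0)      # part after dusk, weight NIGHT_WEIGHT
    return day + NIGHT_WEIGHT * night


def folded_diff(d1, d2):
    """Weighted seconds from d1 to d2 (negative if d2 is earlier than d1)."""
    if d2 < d1:
        return -folded_diff(d2, d1)
    # "Fold" the clock: shift both times so that dawn sits at 00:00.
    shift = timedelta(hours=DAWN_H)
    a, b = d1 - shift, d2 - shift
    full_days = (b.date() - a.date()).days
    return (full_days * FULL_DAY
            + weighted_since_dawn(seconds_of_day(b))
            - weighted_since_dawn(seconds_of_day(a)))


# 12:00 -> 23:00 is 9 day hours plus 2 night hours at half weight
print(folded_diff(datetime(2017, 1, 1, 12), datetime(2017, 1, 1, 23)) / 3600)  # 10.0
```

Like the other answers, this sketch ignores daylight saving time.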
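Answer 1 (score: 3)
This solution computes the weighted duration of the full days between the two dates, then adds or subtracts the residuals of the first and last dates. It does not account for any daylight saving time effects.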
import pandas as pd

def timediff(t1, t2):
    DAY_SECS = 24 * 60 * 60
    DUSK = pd.Timedelta("21h")
    # Dawn is chosen as 7 a.m.
    FRAC_NIGHT = 10 / 24
    FRAC_DAY = 14 / 24
    DAY_WEIGHT = 1
    NIGHT_WEIGHT = 0.5
    # weighted seconds of the whole days between the two calendar dates
    full_days = ((t2.date() - t1.date()).days * DAY_SECS *
                 (FRAC_NIGHT * NIGHT_WEIGHT + FRAC_DAY * DAY_WEIGHT))

    def time2dusk(t):
        # signed seconds from t to dusk on the same calendar day
        time = (pd.Timestamp(t.date()) + DUSK) - t
        time = time.total_seconds()
        wtime = (min(time * NIGHT_WEIGHT, 0) +
                 min(max(time, 0), FRAC_DAY * DAY_SECS) * DAY_WEIGHT +
                 max(time - DAY_SECS * FRAC_DAY, 0) * NIGHT_WEIGHT)
        return wtime

    t1time2dusk = time2dusk(t1)
    t2time2dusk = time2dusk(t2)
    return full_days + t1time2dusk - t2time2dusk
This gives the result in weighted seconds, but you can convert it to whatever unit is convenient afterwards.
times = [(pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170101T15:00:00")),
         (pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170101T23:00:00")),
         (pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170102T12:00:00")),
         (pd.Timestamp("20170101T22:00:00"), pd.Timestamp("20170101T23:00:00")),
         (pd.Timestamp("20170101T22:00:00"), pd.Timestamp("20170102T05:00:00")),
         (pd.Timestamp("20170101T06:00:00"), pd.Timestamp("20170101T08:00:00"))]
exp_diff_hours = [3, 9 + 2*0.5, 9 + 10*0.5 + 5, 1*0.5, 7*0.5, 1 + 1*0.5]
# forward direction: t2 is later than t1
for i, ts in enumerate(times):
    t1, t2 = ts
    print("\n")
    print("Time1: %s" % t1)
    print("Time2: %s" % t2)
    print("Weighted Time2 - Time1: %s" % (timediff(t1, t2) / 3600))
    print("Weighted Time2 - Time1 Expected: %s" % exp_diff_hours[i])
# reversed direction: the result should be the negative of the above
for i, ts in enumerate(times):
    t2, t1 = ts
    print("\n")
    print("Time1: %s" % t1)
    print("Time2: %s" % t2)
    print("Weighted Time2 - Time1: %s" % (timediff(t1, t2) / 3600))
    print("Weighted Time2 - Time1 Expected: %s" % -exp_diff_hours[i])
Time1: 2017-01-01 12:00:00
Time2: 2017-01-01 15:00:00
Weighted Time2 - Time1: 3.000000000000001
Weighted Time2 - Time1 Expected: 3
Time1: 2017-01-01 12:00:00
Time2: 2017-01-01 23:00:00
Weighted Time2 - Time1: 10.0
Weighted Time2 - Time1 Expected: 10.0
Time1: 2017-01-01 12:00:00
Time2: 2017-01-02 12:00:00
Weighted Time2 - Time1: 19.0
Weighted Time2 - Time1 Expected: 19.0
Time1: 2017-01-01 22:00:00
Time2: 2017-01-01 23:00:00
Weighted Time2 - Time1: 0.5
Weighted Time2 - Time1 Expected: 0.5
Time1: 2017-01-01 22:00:00
Time2: 2017-01-02 05:00:00
Weighted Time2 - Time1: 3.5
Weighted Time2 - Time1 Expected: 3.5
Time1: 2017-01-01 06:00:00
Time2: 2017-01-01 08:00:00
Weighted Time2 - Time1: 1.5
Weighted Time2 - Time1 Expected: 1.5
Time1: 2017-01-01 15:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -3.000000000000001
Weighted Time2 - Time1 Expected: -3
Time1: 2017-01-01 23:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -10.0
Weighted Time2 - Time1 Expected: -10.0
Time1: 2017-01-02 12:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -19.0
Weighted Time2 - Time1 Expected: -19.0
Time1: 2017-01-01 23:00:00
Time2: 2017-01-01 22:00:00
Weighted Time2 - Time1: -0.5
Weighted Time2 - Time1 Expected: -0.5
Time1: 2017-01-02 05:00:00
Time2: 2017-01-01 22:00:00
Weighted Time2 - Time1: -3.5
Weighted Time2 - Time1 Expected: -3.5
Time1: 2017-01-01 08:00:00
Time2: 2017-01-01 06:00:00
Weighted Time2 - Time1: -1.5
Weighted Time2 - Time1 Expected: -1.5
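Since the result is a plain number of weighted seconds, converting it is a one-liner. The snippet below is only an illustration using the standard library; the value `36000.0` is an assumed example (10 weighted hours, as in the 12:00 to 23:00 case), and the variable names are made up.

```python
from datetime import timedelta

weighted_seconds = 36000.0  # assumed example value: 10 weighted hours

weighted_hours = weighted_seconds / 3600           # plain float, in hours
weighted_td = timedelta(seconds=weighted_seconds)  # timedelta object

print(weighted_hours)  # 10.0
print(weighted_td)     # 10:00:00
```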
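Answer 2 (score: 2)
Here are two approaches. I assumed the second would be faster over large date ranges (e.g. 5 years apart), but it turns out the first one is:
Method 1: loop over the minutes and update the weighted timedelta
4.2 seconds
(laptop runtime over a 5-year datetime range)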
import datetime

def weighted_timedelta(start_dt, end_dt,
                       nights_start=datetime.time(21, 0),
                       nights_end=datetime.time(7, 0),
                       night_weight=0.5):
    # initialize counters
    weighted_timedelta = 0
    i = start_dt
    # loop through the minutes in [start_dt, end_dt), classifying each
    # minute by its start time and updating weighted_timedelta
    while i < end_dt:
        if i.time() >= nights_start or i.time() < nights_end:
            weighted_timedelta += night_weight
        else:
            weighted_timedelta += 1
        i += datetime.timedelta(minutes=1)
    # return value as weighted minutes
    return weighted_timedelta
Method 2: use date_range and np.where() to build a pandas array of weights
15 seconds
(laptop runtime over a 5-year datetime range)
import datetime
import numpy as np
import pandas as pd

def weighted_timedelta(start_dt, end_dt,
                       nights_start=datetime.time(21, 0),
                       nights_end=datetime.time(7, 0),
                       night_weight=0.5):
    # convert dts to a pandas date range at minute resolution, dropping
    # end_dt itself so that each element marks the start of one minute
    dt_range = pd.date_range(start=start_dt, end=end_dt, freq='min')
    dt_range = dt_range[dt_range < end_dt]
    # assign a weight of night_weight or 1 to each minute, depending on day/night
    dt_weights = np.where((dt_range.time >= nights_start) |  # | is element-wise 'or' for boolean arrays
                          (dt_range.time < nights_end),
                          night_weight, 1)
    # return value as weighted minutes
    return dt_weights.sum()
Both were checked for accuracy with:
d1 = datetime.datetime(2016,1,22,20,30)
d2 = datetime.datetime(2016,1,22,21,30)
weighted_timedelta(d1, d2)
45.0
Answer 3 (score: 1)
Try this code:
from pandas import date_range
from pandas import Series
from pandas import Timedelta
from datetime import datetime
from datetime import time
from dateutil.relativedelta import relativedelta

# initial dates
d1 = datetime(2017, 1, 1, 8, 0, 0)
d2 = d1 + relativedelta(days=10)
print(d1, d2)

# brute force: one weight per second between d1 and d2
ts = Series(1.0, date_range(d1, d2, freq='s'))
c1 = ts.index.time >= time(21, 0, 0)
c2 = ts.index.time < time(7, 0, 0)
ts[c1 | c2] = .5
ts.iloc[-1] = 0  # the last timestamp does not start a new interval
print(ts.sum())  # result in seconds

def get_seconds(ti):
    # weighted seconds covered by the DatetimeIndex ti
    ts = Series(1.0, ti)
    c1 = ts.index.time >= time(21, 0, 0)
    c2 = ts.index.time < time(7, 0, 0)
    ts[c1 | c2] = .5
    ts.iloc[-1] = 0
    return ts.sum() * Timedelta(ti.freq).total_seconds()

# faster: hourly steps between the two midnights, corrected by the
# second-resolution pieces from midnight to d1 and from midnight to d2
ti0 = date_range(d1, d2, freq='h', normalize=True)
ti1 = date_range(ti0[0], d1, freq='s')
ti2 = date_range(ti0[-1], d2, freq='s')
print(get_seconds(ti0) - get_seconds(ti1) + get_seconds(ti2))  # result in seconds
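Answer 4 (score: 1)
A solution that lets you define as many periods as you need, each with its own weight.
First, a helper function that slices our datetime interval: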
from datetime import date, time, datetime, timedelta

def slice_datetimes_interval(start, end):
    """
    Slices the interval between the datetimes start and end.

    If start and end are on different days:
        start time -> midnight | number of full days | midnight -> end time
        ----------------------   -------------------   --------------------
                  ^                       ^                     ^
              day_part_1              full_days             day_part_2

    If start and end are on the same day:
        start time -> end time
        ----------------------
                  ^
              day_part_1                                 full_days = 0

    Returns full_days and the list of day_parts (as tuples of time objects).
    """
    if start > end:
        raise ValueError("Start time must be before end time")
    # Number of full days between the end of start day and the beginning of end day
    # If start and end are on the same day, it will be -1
    full_days = (datetime.combine(end, time.min) -
                 datetime.combine(start, time.max)).days
    if full_days >= 0:
        day_parts = [(start.time(), time.max),
                     (time.min, end.time())]
    else:
        full_days = 0
        day_parts = [(start.time(), end.time())]
    return full_days, day_parts
A class that computes weighted durations for a given list of periods and weights:
class WeightedDuration:
    def __init__(self, periods):
        """
        periods is a list of tuples (start_time, end_time, weight)
        where start_time and end_time are datetime.time objects.

        For a period including midnight, like 22:00 -> 6:30,
        we create two periods:
          - midnight (start of day) -> 6:30,
          - 22:00 -> midnight (end of day)
        so periods will be:
            [(time.min, time(6, 30), 0.5),
             (time(22, 0), time.max, 0.5)]
        """
        self.periods = periods
        # We store the weighted duration of a whole day for later reuse
        self.day_duration = self.time_interval_duration(time.min, time.max)

    def time_interval_duration(self, start_time, end_time):
        """
        Returns the weighted duration, in seconds, between the datetime.time objects
        start_time and end_time - so, two times on the *same* day.
        """
        dummy_date = date(2000, 1, 1)
        # First, we calculate the total duration, *without weight*.
        # time objects can't be subtracted, so
        # we turn them into datetimes on dummy_date
        duration = (datetime.combine(dummy_date, end_time) -
                    datetime.combine(dummy_date, start_time)).total_seconds()
        # Then, we calculate the reductions during all periods
        # intersecting our interval
        reductions = 0
        for period in self.periods:
            period_start, period_end, weight = period
            if period_end < start_time or period_start > end_time:
                # the period and our interval don't intersect
                continue
            # Intersection of the period and our interval
            start = max(start_time, period_start)
            end = min(end_time, period_end)
            reductions += ((datetime.combine(dummy_date, end) -
                            datetime.combine(dummy_date, start)).total_seconds()
                           * (1 - weight))
        # as time.max is midnight minus a µs, we round the result
        return round(duration - reductions)

    def duration(self, start, end):
        """
        Returns the weighted duration, in seconds, between the datetime.datetime
        objects start and end.
        """
        full_days, day_parts = slice_datetimes_interval(start, end)
        dur = full_days * self.day_duration
        for day_part in day_parts:
            dur += self.time_interval_duration(*day_part)
        return dur
We create a WeightedDuration instance, defining our periods and their weights. We can have as many periods as we want, with weights smaller or larger than 1.
wd = WeightedDuration([(time.min, time(7, 0), 0.5),       # from midnight to 7, 50%
                       (time(12, 0), time(13, 0), 0.75),  # from 12 to 13, 75%
                       (time(21, 0), time.max, 0.5)])     # from 21 to midnight, 50%
Let's compute the weighted duration between some datetimes:
# 1 hour at 50%, 1 at 100%: that should be 3600 + 1800 = 5400 s
print(wd.duration(datetime(2017, 1, 3, 6, 0), datetime(2017, 1, 3, 8)))
# 5400
# a few tests
intervals = [
    (datetime(2017, 1, 3, 9, 0), datetime(2017, 1, 3, 10)),   # 1 hour with weight 1
    (datetime(2017, 1, 3, 23, 0), datetime(2017, 1, 4, 1)),   # 2 hours, weight 0.5
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 4, 5)),    # 1 full day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 3, 23)),   # same day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 4, 23)),   # next day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 5, 23)),   # 1 full day in between
]
for interval in intervals:
    print(interval)
    print(wd.duration(*interval))
# (datetime.datetime(2017, 1, 3, 9, 0), datetime.datetime(2017, 1, 3, 10, 0))
# 3600
# (datetime.datetime(2017, 1, 3, 23, 0), datetime.datetime(2017, 1, 4, 1, 0))
# 3600
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 4, 5, 0))
# 67500
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 3, 23, 0))
# 56700
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 4, 23, 0))
# 124200
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 5, 23, 0))
# 191700
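Results like these can be cross-checked with a slow, minute-by-minute brute force. The sketch below is not part of the answer: `PERIODS`, `weight_at` and `brute_force_duration` are hypothetical names, `None` is used here to mean "until the end of the day", and the datetimes are assumed to fall on whole minutes.

```python
from datetime import datetime, time, timedelta

# Same periods as above: (start_time, end_time, weight).
# Assumption of this sketch: None as end_time means "until the end of the day".
PERIODS = [(time(0, 0), time(7, 0), 0.5),
           (time(12, 0), time(13, 0), 0.75),
           (time(21, 0), None, 0.5)]


def weight_at(t):
    """Weight that applies to the minute starting at time-of-day t."""
    for p_start, p_end, weight in PERIODS:
        if p_start <= t and (p_end is None or t < p_end):
            return weight
    return 1.0  # outside every period: full weight


def brute_force_duration(start, end):
    """Weighted seconds between start and end, summed one minute at a time."""
    total = 0.0
    current = start
    step = timedelta(minutes=1)
    while current < end:
        total += 60 * weight_at(current.time())
        current += step
    return total


# 1 hour at 50% plus 1 hour at 100%
print(brute_force_duration(datetime(2017, 1, 3, 6, 0), datetime(2017, 1, 3, 8)))  # 5400.0
```

The loop is linear in the length of the interval, so it is only suitable as a test oracle, not as a replacement for the class above.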
Answer 5 (score: 0)
High-level concept
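import pandas as pd
import numpy as np

def weighted_delta(start, end, night_start=21, night_end=7):
    start, end = end_points = pd.to_datetime([start, end])
    # midnights of every day touched by the interval, plus one extra day
    rng = pd.date_range(start.date(), end.date() + pd.offsets.Day())
    # every day/night boundary in that range
    evening = rng + pd.Timedelta(night_start, 'h')
    morning = rng + pd.Timedelta(night_end, 'h')
    # sorted, de-duplicated boundaries together with start and end
    rng = evening.union(morning).union(end_points)
    # keep only the points inside [start, end]
    rng = rng[(rng >= start) & (rng <= end)]
    # consecutive differences alternate between day and night intervals
    diffs = np.diff(rng.values)
    if night_end <= start.hour < night_start:
        # start is in the daytime: even intervals are day, odd ones are night
        diff_sum = pd.Timedelta(diffs[::2].sum() + diffs[1::2].sum() / 2)
    else:
        # start is at night: even intervals are night, odd ones are day
        diff_sum = pd.Timedelta(diffs[::2].sum() / 2 + diffs[1::2].sum())
    return diff_sum.total_seconds()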
weighted_delta('2017-01-01', '2017-01-03')
136800.0