Python datetime difference adjusted for nighttime hours

Time: 2017-04-11 05:09:57

Tags: python pandas datetime numpy

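I have two datetime objects in Python, d1 and d2, and I want the time difference between them. I want something slightly more complex than (d1 - d2): nighttime should count for less than daytime by a constant fraction c, e.g. one hour at night counts as only half an hour of daytime.

Is there a simple way to do this in Python (pandas and/or numpy)?

Thanks!

Edit: nighttime is from 9 p.m. to 7 a.m. Ideally, though, I am looking for a solution where you can choose arbitrary weights for arbitrary periods of the day.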

6 answers:

Answer 0: (score: 3)

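Here is a solution.

It does two things. First, it counts the full days between the two dates. Since we know (well, we can approximate) that every day has 24 hours, weighting the "daytime" and "nighttime" portions of those days is trivial (the calculation is done in hours). That leaves only the remaining interval of less than 24 hours. The trick here is to "fold" the time so that "dawn" is not in the middle of the day but at 0, which leaves "dusk" as the only separator. Then there are only three cases: both times are in the daytime, both are in the nighttime, or the later time is in the nighttime and the earlier one in the daytime.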
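The folding idea described above can be sketched as follows. This is not the answer's original code: the names `weighted_since_reference`, `weighted_diff` and `REFERENCE` are hypothetical, and instead of enumerating the three cases it takes the difference of two cumulative totals, assuming night runs from 21:00 to 07:00 and ignoring daylight saving time.

```python
from datetime import datetime, timedelta

DAWN = timedelta(hours=7)           # night ends at 07:00
DAY_SECS = 14 * 3600                # 07:00 -> 21:00, weight 1
NIGHT_SECS = 10 * 3600              # 21:00 -> 07:00, weight night_weight
REFERENCE = datetime(2000, 1, 1)    # arbitrary origin (assumption)


def weighted_since_reference(t, night_weight=0.5):
    """Weighted seconds elapsed between dawn on REFERENCE's date and t."""
    # Fold: shift by DAWN so every 24 h cycle starts at dawn (0)
    # and dusk is the only boundary inside the cycle.
    folded = (t - DAWN - REFERENCE).total_seconds()
    full_days, secs = divmod(folded, 86400)
    per_day = DAY_SECS + NIGHT_SECS * night_weight
    day_part = min(secs, DAY_SECS)          # seconds spent before dusk
    night_part = max(secs - DAY_SECS, 0)    # seconds spent after dusk
    return full_days * per_day + day_part + night_part * night_weight


def weighted_diff(d1, d2, night_weight=0.5):
    """Weighted seconds from d1 to d2 (negative if d2 is earlier than d1)."""
    return (weighted_since_reference(d2, night_weight)
            - weighted_since_reference(d1, night_weight))


# 12:00 -> 23:00 is 9 daytime hours plus 2 night hours at half weight
print(weighted_diff(datetime(2017, 1, 1, 12), datetime(2017, 1, 1, 23)) / 3600)  # 10.0
```

Because both totals are measured from the same origin, the full days in between cancel into a multiple of the weighted day length and only the partial cycles at each end matter.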

Updated based on comments.

On my laptop, 1 million calls to the function run in 4.588s.

Answer 1: (score: 3)

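This solution computes the weighted number of full days, then subtracts or adds any residuals from the first and last dates. It does not account for any daylight saving time effects.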
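As a quick check of the full-day term, the weighted length of one complete day can be worked out on its own. The snippet below is only an illustration using the question's numbers (night 21:00 to 07:00 at weight 0.5); the variable names are chosen for this sketch.

```python
DAY_SECS = 24 * 60 * 60
NIGHT_HOURS, DAY_HOURS = 10, 14        # 21:00 -> 07:00 and 07:00 -> 21:00
NIGHT_WEIGHT, DAY_WEIGHT = 0.5, 1

# One full day contributes its night share at half weight and its day share in full.
weighted_day_secs = DAY_SECS * (NIGHT_HOURS / 24 * NIGHT_WEIGHT +
                                DAY_HOURS / 24 * DAY_WEIGHT)

print(weighted_day_secs / 3600)        # about 19 weighted hours per full day
```

So two timestamps exactly 24 hours apart differ by roughly 19 weighted hours, whatever the time of day, because the two residuals cancel.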

import pandas as pd


def timediff(t1, t2):

    DAY_SECS = 24 * 60 * 60
    DUSK = pd.Timedelta("21h")
    # Dawn is chosen as 7 a.m.
    FRAC_NIGHT = 10 / 24
    FRAC_DAY = 14 / 24
    DAY_WEIGHT = 1
    NIGHT_WEIGHT = 0.5

    full_days = ((t2.date() - t1.date()).days * DAY_SECS *
                 (FRAC_NIGHT * NIGHT_WEIGHT + FRAC_DAY * DAY_WEIGHT))

    def time2dusk(t):
        time = (pd.Timestamp(t.date()) + DUSK) - t
        time = time.total_seconds()
        wtime = (min(time * NIGHT_WEIGHT, 0) +
                 min(max(time, 0), FRAC_DAY * DAY_SECS) * DAY_WEIGHT +
                 max(time - DAY_SECS * FRAC_DAY, 0) * NIGHT_WEIGHT)
        return wtime

    t1time2dusk = time2dusk(t1)
    t2time2dusk = time2dusk(t2)
    return full_days + t1time2dusk - t2time2dusk

This gives the result in weighted seconds, but you can convert it to whatever unit is convenient afterwards.

times = [(pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170101T15:00:00")),
         (pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170101T23:00:00")),
         (pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170102T12:00:00")),
         (pd.Timestamp("20170101T22:00:00"), pd.Timestamp("20170101T23:00:00")),
         (pd.Timestamp("20170101T22:00:00"), pd.Timestamp("20170102T05:00:00")),
         (pd.Timestamp("20170101T06:00:00"), pd.Timestamp("20170101T08:00:00"))]

exp_diff_hours = [3, 9 + 2*0.5, 9 + 10*0.5 + 5, 1*0.5, 7*0.5, 1 + 1*0.5]

for i, ts in enumerate(times):
    t1, t2 = ts
    print("\n")
    print("Time1: %s" % t1)
    print("Time2: %s" % t2)
    print("Weighted Time2 - Time1: %s" % (timediff(t1, t2) / 3600))
    print("Weighted Time2 - Time1 Expected: %s" % exp_diff_hours[i])

for i, ts in enumerate(times):
    t2, t1 = ts
    print("\n")
    print("Time1: %s" % t1)
    print("Time2: %s" % t2)
    print("Weighted Time2 - Time1: %s" % (timediff(t1, t2) / 3600))
    print("Weighted Time2 - Time1 Expected: %s" % -exp_diff_hours[i])

Time1: 2017-01-01 12:00:00
Time2: 2017-01-01 15:00:00
Weighted Time2 - Time1: 3.000000000000001
Weighted Time2 - Time1 Expected: 3


Time1: 2017-01-01 12:00:00
Time2: 2017-01-01 23:00:00
Weighted Time2 - Time1: 10.0
Weighted Time2 - Time1 Expected: 10.0


Time1: 2017-01-01 12:00:00
Time2: 2017-01-02 12:00:00
Weighted Time2 - Time1: 19.0
Weighted Time2 - Time1 Expected: 19.0


Time1: 2017-01-01 22:00:00
Time2: 2017-01-01 23:00:00
Weighted Time2 - Time1: 0.5
Weighted Time2 - Time1 Expected: 0.5


Time1: 2017-01-01 22:00:00
Time2: 2017-01-02 05:00:00
Weighted Time2 - Time1: 3.5
Weighted Time2 - Time1 Expected: 3.5


Time1: 2017-01-01 06:00:00
Time2: 2017-01-01 08:00:00
Weighted Time2 - Time1: 1.5
Weighted Time2 - Time1 Expected: 1.5


Time1: 2017-01-01 15:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -3.000000000000001
Weighted Time2 - Time1 Expected: -3


Time1: 2017-01-01 23:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -10.0
Weighted Time2 - Time1 Expected: -10.0


Time1: 2017-01-02 12:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -19.0
Weighted Time2 - Time1 Expected: -19.0


Time1: 2017-01-01 23:00:00
Time2: 2017-01-01 22:00:00
Weighted Time2 - Time1: -0.5
Weighted Time2 - Time1 Expected: -0.5


Time1: 2017-01-02 05:00:00
Time2: 2017-01-01 22:00:00
Weighted Time2 - Time1: -3.5
Weighted Time2 - Time1 Expected: -3.5


Time1: 2017-01-01 08:00:00
Time2: 2017-01-01 06:00:00
Weighted Time2 - Time1: -1.5
Weighted Time2 - Time1 Expected: -1.5

Answer 2: (score: 2)

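Here are two approaches. I assumed the second would be faster over larger date ranges (e.g. 5 years apart), but it turns out the first one is:

  1. Loop through all the minutes between the datetimes
  2. Create a date-range series, then a series of weights (using np.where() conditional logic), and sum them

    Method 1: loop through the minutes and update the weighted timedelta
    4.2 seconds (laptop run time over a 5-year dt range)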

    import datetime
    from datetime import timedelta

    def weighted_timedelta(start_dt, end_dt,
                           nights_start = datetime.time(21,0),
                           nights_end   = datetime.time(7,0),
                           night_weight = 0.5):
    
        # initialize counters
        weighted_timedelta = 0
        i = start_dt
    
        # loop through minutes in datetime-range, updating weighted_timedelta;
        # each iteration weighs the minute that starts at i
        while i < end_dt:
            if i.time() >= nights_start or i.time() < nights_end:
                weighted_timedelta += night_weight
            else:
                weighted_timedelta += 1
    
            i += timedelta(minutes=1)
    
        return weighted_timedelta
    

    Method 2: use date_range and np.where() to create a pandas series of weights
    15 seconds (laptop run time over a 5-year dt range)

    import numpy as np
    import pandas as pd

    def weighted_timedelta(start_dt, end_dt,
                           nights_start = datetime.time(21,0),
                           nights_end   = datetime.time(7,0),
                           night_weight = 0.5):
    
        # convert dts to pandas date-range, minute-resolution; drop the last
        # timestamp so that each entry stands for the minute that starts at it
        dt_range = pd.date_range(start=start_dt, end=end_dt, freq='min')[:-1]
    
        # Assign 'weight' as -night_weight- or 1, for each minute, depending on day/night
        dt_weights = np.where((dt_range.time >= nights_start) |  # | is element-wise 'or' for arrays of booleans
                              (dt_range.time < nights_end), 
                              night_weight, 1)
    
        # return value as weighted minutes
        return dt_weights.sum()
    

    Accuracy test, giving the same result for each method:

    d1 = datetime.datetime(2016,1,22,20,30)
    d2 = datetime.datetime(2016,1,22,21,30)
    
    weighted_timedelta(d1, d2)
    45.0
    

Answer 3: (score: 1)

Try this code:

from pandas import date_range
from pandas import Series
from datetime import datetime
from datetime import time
from dateutil.relativedelta import relativedelta

# initial date
d1 = datetime(2017, 1, 1, 8, 0, 0)
d2 = d1 + relativedelta(days=10)
print(d1, d2)

Method 1: slow, but easy to understand.

ts = Series(1.0, date_range(d1, d2, freq='s'))  # one entry per second, weight 1
c1 = ts.index.time >= time(21, 0, 0)
c2 = ts.index.time < time(7, 0, 0)
ts[c1 | c2] = .5                                # night seconds count half
ts.iloc[-1] = 0                                 # the end point starts no interval
print(ts.sum())   # result in seconds

Method 2: faster, but a bit more complicated.

from pandas import Timedelta

def get_seconds(ti):
    ts = Series(1.0, ti)
    c1 = ts.index.time >= time(21, 0, 0)
    c2 = ts.index.time < time(7, 0, 0)
    ts[c1 | c2] = .5
    ts.iloc[-1] = 0
    # scale the weighted count by the length of one step of the index
    return ts.sum() * Timedelta(ti.freq).total_seconds()

# whole hours between the two midnights, then second-level corrections at both ends
ti0 = date_range(d1, d2, freq='h', normalize=True)
ti1 = date_range(ti0[0], d1, freq='s')
ti2 = date_range(ti0[-1], d2, freq='s')
print(get_seconds(ti0) - get_seconds(ti1) + get_seconds(ti2)) # result in seconds

Answer 4: (score: 1)

A solution that lets you define your own periods and weights as needed.

First, a helper function that slices our datetime interval:

from datetime import date, time, datetime, timedelta

def slice_datetimes_interval(start, end):
    """
    Slices the interval between the datetimes start and end.

    If start and end are on different days:
    start time -> midnight | number of full days | midnight -> end time
    ----------------------   -------------------   --------------------
               ^                     ^                      ^
          day_part_1             full_days              day_part_2

    If start and end are on the same day:
    start time -> end time
    ----------------------
              ^
         day_part_1              full_days = 0

    Returns full_days and the list of day_parts (as tuples of time objects).
    """

    if start > end:
        raise ValueError("Start time must be before end time")

    # Number of full days between the end of start day and the beginning of end day
    # If start and end are on the same day, it will be -1
    full_days = (datetime.combine(end, time.min) - 
                 datetime.combine(start, time.max)).days
    if full_days >= 0:
        day_parts = [(start.time(), time.max),
                     (time.min, end.time())]
    else:
        full_days = 0
        day_parts = [(start.time(), end.time())]

    return full_days, day_parts

A class that computes the weighted duration for a given list of periods and weights:

class WeightedDuration:
    def __init__(self, periods):
        """
        periods is a list of tuples (start_time, end_time, weight)
        where start_time and end_time are datetime.time objects.

        For a period including midnight, like 22:00 -> 6:30,
        we create two periods:
          - midnight (start of day) -> 6:30,
          - 22:00 -> midnight(end of day)

        so periods will be:
          [(time.min, time(6, 30), 0.5),
           (time(22, 0), time.max, 0.5)]

        """
        self.periods = periods
        # We store the weighted duration of a whole day for later reuse
        self.day_duration = self.time_interval_duration(time.min, time.max)

    def time_interval_duration(self, start_time, end_time):
        """ 
        Returns the weighted duration, in seconds, between the datetime.time objects
        start_time and end_time - so, two times on the *same* day.
        """
        dummy_date = date(2000, 1, 1)

        # First, we calculate the total duration, *without weight*.
        # time objects can't be subtracted, so
        # we turn them into datetimes on dummy_date
        duration = (datetime.combine(dummy_date, end_time) -
                    datetime.combine(dummy_date, start_time)).total_seconds()

        # Then, we calculate the reductions during all periods
        # intersecting our interval
        reductions = 0
        for period in self.periods:
            period_start, period_end, weight = period
            if period_end < start_time or period_start > end_time:
                # the period and our interval don't intersect
                continue

            # Intersection of the period and our interval
            start = max(start_time, period_start)
            end = min (end_time, period_end)

            reductions += ((datetime.combine(dummy_date, end) -
                           datetime.combine(dummy_date, start)).total_seconds()
                           * (1 - weight))
        # as time.max is midnight minus a µs, we round the result
        return round(duration - reductions)

    def duration(self, start, end):
        """
        Returns the weighted duration, in seconds, between the datetime.datetime
        objects start and end.
        """
        full_days, day_parts = slice_datetimes_interval(start, end)
        dur = full_days * self.day_duration
        for day_part in day_parts:
            dur += self.time_interval_duration(*day_part)
        return dur

We create a WeightedDuration instance, defining our periods and their weights. We can have any number of periods, with weights smaller or larger than 1.

wd = WeightedDuration([(time.min, time(7, 0), 0.5),      # from midnight to 7, 50%
                       (time(12, 0), time(13, 0), 0.75), # from 12 to 13, 75%
                       (time(21, 0), time.max, 0.5)])    # from 21 to midnight, 50%

Let's compute the weighted duration between some datetimes:

# 1 hour at 50%, 1 at 100%: that should be 3600 + 1800 = 5400 s
print(wd.duration(datetime(2017, 1, 3, 6, 0), datetime(2017, 1, 3, 8)))
# 5400

# a few tests
intervals = [
    (datetime(2017, 1, 3, 9, 0), datetime(2017, 1, 3, 10)),  # 1 hour with weight 1
    (datetime(2017, 1, 3, 23, 0), datetime(2017, 1, 4, 1)),  # 2 hours, weight 0.5
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 4, 5)),   # 1 full day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 3, 23)),  # same day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 4, 23)),  # next day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 5, 23)),  # 1 full day in between
            ]
for interval in intervals:
    print(interval)
    print(wd.duration(*interval))  

# (datetime.datetime(2017, 1, 3, 9, 0), datetime.datetime(2017, 1, 3, 10, 0))
# 3600
# (datetime.datetime(2017, 1, 3, 23, 0), datetime.datetime(2017, 1, 4, 1, 0))
# 3600
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 4, 5, 0))
# 67500
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 3, 23, 0))
# 56700
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 4, 23, 0))
# 124200
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 5, 23, 0))
# 191700
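
The docstring above notes that a period crossing midnight has to be passed as two same-day periods. A small helper can do that split; `split_period` is a hypothetical name used for this sketch and is not part of the answer.

```python
from datetime import time


def split_period(start, end, weight):
    """Return same-day (start_time, end_time, weight) tuples covering start -> end.

    A period that wraps past midnight (start later than end) is cut into
    a morning piece and an evening piece, as the WeightedDuration docstring describes.
    """
    if start <= end:
        return [(start, end, weight)]
    return [(time.min, end, weight),
            (start, time.max, weight)]


# Night from 21:00 to 07:00 at half weight becomes two periods
night = split_period(time(21, 0), time(7, 0), 0.5)
print(night)
```

The resulting list has the shape the periods argument expects, so it could be concatenated with other periods before being handed to the class.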

Answer 5: (score: 0)

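High-level concept

  • Take the start and end timestamps.
  • Find all the instances of 7 a.m. and 9 p.m. between them.
  • Create a sorted array of timestamps containing the start, the end, all the 7 a.m.s and all the 9 p.m.s.
  • Compute the differences along this array.
  • Determine whether the start falls in the daytime or the nighttime.
  • Sum the differences, dividing the appropriate half of them by 2.
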
import pandas as pd
import numpy as np

def weighted_delta(start, end, night_start=21, night_end=7):
    start, end = end_points = pd.to_datetime([start, end])
    rng = pd.date_range(start.date(), end.date() + pd.offsets.Day())
    evening = rng + pd.Timedelta(night_start, 'h')
    morning = rng + pd.Timedelta(night_end, 'h')
    rng = evening.union(morning).union(end_points)
    rng = np.clip(rng.values, start.to_datetime64(), end.to_datetime64())
    rng = np.unique(rng)
    rng = pd.to_datetime(rng).sort_values()
    diffs = np.diff(rng)
    if night_end <= start.hour < night_start:
        diff_sum = pd.Timedelta(diffs[::2].sum() + diffs[1::2].sum() / 2)
    else:
        diff_sum = pd.Timedelta(diffs[::2].sum() / 2 + diffs[1::2].sum())
    return diff_sum.total_seconds() 

weighted_delta('2017-01-01', '2017-01-03')

136800.0