Question

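Given a column of datetimes, a column of values, and a limit, I need to calculate the total time spent above the limit. Sample data is below; assume the limit is 30.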

Time                    Value
2018-01-03 12:54:23     23
2018-01-03 12:58:46     31
2018-01-03 13:02:12     32
2018-01-03 13:04:13     24
2018-01-03 13:07:01     28

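My idea is to first use the shift function to compute the time difference between consecutive timestamps, then iterate over the values with a for loop. If both the previous value and the current value exceed the limit, the time difference is added to the total.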

temp["TimeDifference"] = (temp.Time - temp.Time.shift(1)).fillna(pd.Timedelta(seconds=0))

total_time = pd.Timedelta(seconds=0)

for i in range(1, temp.shape[0]):
    if (temp.loc[i - 1].Value > upper_limit) and (temp.loc[i].Value > upper_limit):
        total_time = total_time + temp.loc[i].TimeDifference

It works, but the run time is very long, and I know this algorithm is inefficient. Can anyone give me some advice? Thanks.

Answer 1

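You use the shift function for the time differences, but not for comparing the values. There you hard-code the i - 1 offset and write your own loop to walk through the rows.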

Instead, use the built-in vectorized functionality, along these lines:

(temp.Time - temp.Time.shift(1))[(temp.Value > 30) & (temp.Value.shift(1) > 30)].sum()

...but I'd appreciate someone checking my syntax; I wrote this off the top of my head between urgent tasks.
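
A minimal sketch of the vectorized approach on the question's sample data. It assumes that an interval counts only when both of its endpoints exceed the limit, as in the question's loop. The names `time_diff`, `over`, and `both_over` are illustrative and do not appear in the original posts.

```python
import pandas as pd

# Sample data from the question; `temp` and `upper_limit` follow the question's names.
temp = pd.DataFrame({
    "Time": pd.to_datetime([
        "2018-01-03 12:54:23",
        "2018-01-03 12:58:46",
        "2018-01-03 13:02:12",
        "2018-01-03 13:04:13",
        "2018-01-03 13:07:01",
    ]),
    "Value": [23, 31, 32, 24, 28],
})
upper_limit = 30

# Time elapsed since the previous row (NaT for the first row).
time_diff = temp["Time"].diff()

# True where both this row and the previous row exceed the limit.
over = temp["Value"] > upper_limit
both_over = over & over.shift(1, fill_value=False)

# Sum only the intervals whose two endpoints are both above the limit.
total_time = time_diff[both_over].sum()
print(total_time)
```

On the sample data this prints 0 days 00:03:26, the single interval between 12:58:46 and 13:02:12. If an interval should instead count whenever only the current value is above the limit, `time_diff[over].sum()` would be the variant to use, and it gives a different total.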

Calculating the total time a value exceeds a limit in pandas
