Question

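I have a pandas DataFrame in Python with several columns and a datetime stamp. One of the columns holds a True/False value. I want to calculate the time remaining until that column becomes False.

Ideally, it would look like this: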

datetime             delivered    secondsuntilfailure
2014-05-01 01:00:00    True       3
2014-05-01 01:00:01    True       2
2014-05-01 01:00:02    True       1
2014-05-01 01:00:03    False      0
2014-05-01 01:00:04    True       ?

Thanks in advance!!

Answer 1

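You can first reverse the row order with [::-1], then take the diff of the datetime column and compute a cumsum over the rows where the value is True: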

df = df[::-1]
# .dt.total_seconds() returns float seconds on any pandas version
print (df.datetime.diff().dt.total_seconds())
4    NaN
3   -1.0
2   -1.0
1   -1.0
0   -1.0
Name: datetime, dtype: float64

# rows where delivered is False contribute 0, all other rows their time step
df['new'] = (df.datetime.diff().dt.total_seconds()
               .where(df.delivered, 0)
               .cumsum().fillna(0).astype(int).mul(-1))
df = df[::-1]
print (df)
             datetime delivered secondsuntilfailure  new
0 2014-05-01 01:00:00      True                   3    3
1 2014-05-01 01:00:01      True                   2    2
2 2014-05-01 01:00:02      True                   1    1
3 2014-05-01 01:00:03     False                   0    0
4 2014-05-01 01:00:04      True                   ?    0
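
For anyone who wants to try this, here is a minimal self-contained sketch of the same reverse, diff, cumsum idea. The DataFrame is rebuilt from the sample in the question, and the names rev and step are only illustrative. It uses .dt.total_seconds() to get float seconds, which is an assumption about what works on your pandas version rather than part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2014-05-01 01:00:00",
        "2014-05-01 01:00:01",
        "2014-05-01 01:00:02",
        "2014-05-01 01:00:03",
        "2014-05-01 01:00:04",
    ]),
    "delivered": [True, True, True, False, True],
})

# Walk the rows from last to first.
rev = df[::-1]

# Time step to the previous row in reversed order (negative seconds).
step = rev["datetime"].diff().dt.total_seconds()

# The failure row contributes 0, every other row contributes its step.
# The running total is then flipped to a positive number of seconds.
running = step.where(rev["delivered"], 0).cumsum().fillna(0).astype(int).mul(-1)

# Assignment aligns on the index, so df itself does not need to be reversed.
df["new"] = running
print(df)
```

Note that the running sum is never reset, so this sketch only matches the expected output when the data contains a single False row. Rows after the last failure get 0 rather than a missing value.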

Answer 2

Compute the cumulative seconds:

cumsecs = df.datetime.diff().dt.total_seconds().cumsum().fillna(value=0.0)

Keep the cumulative value only on the rows where delivery failed, then back-fill it onto the preceding rows:

undeliv_secs = cumsecs.where(~df['delivered']).bfill()

The time until failure is then the difference between the two:

df['secondsuntilfailure'] = undeliv_secs - cumsecs
print(df)
             datetime delivered  secondsuntilfailure
0 2014-05-01 01:00:00      True                  3.0
1 2014-05-01 01:00:01      True                  2.0
2 2014-05-01 01:00:02      True                  1.0
3 2014-05-01 01:00:03     False                  0.0
4 2014-05-01 01:00:04      True                  NaN
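
A variation on the same back-fill idea, sketched here under a few assumptions: it works on the timestamps directly instead of on cumulative seconds, and the helper name seconds_until_failure and the seven-row test frame are made up for illustration. It is meant to show how the approach behaves when there is more than one failure.

```python
import pandas as pd

def seconds_until_failure(frame, time_col="datetime", flag_col="delivered"):
    """Seconds from each row to the next row where flag_col is False.

    Hypothetical helper, not part of pandas.
    """
    times = frame[time_col]
    # Keep timestamps only on failure rows (NaT elsewhere), then
    # back-fill so every row sees the timestamp of the next failure.
    next_failure = times.where(~frame[flag_col]).bfill()
    # Rows after the last failure stay NaT and come out as NaN.
    return (next_failure - times).dt.total_seconds()

# Illustrative data: one row per second, with two failures.
start = pd.Timestamp("2014-05-01 01:00:00")
df = pd.DataFrame({
    "datetime": start + pd.to_timedelta(list(range(7)), unit="s"),
    "delivered": [True, True, False, True, True, False, True],
})

df["secondsuntilfailure"] = seconds_until_failure(df)
print(df)
```

Because the back-fill always picks up the nearest following failure, the count restarts after each False row, and the trailing row with no later failure is left as NaN.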

Python Pandas: calculating the time until the output reaches 0