How to calculate a time difference in pandas

Time: 2019-10-07 18:51:43

Tags: python pandas

I have the following dataframe in pandas (shown in the table below).

I want to calculate the difference between end_time and fina_datetime, in seconds. This is what I am currently doing in pandas:

df['end_time'] = df['srt_date'].map(str) + " " + df['end_time'].map(str)
df['end_time'] = pd.to_datetime(df['end_time'], format="%Y-%m-%d %H:%M:%S")
df['latency_in_secs'] = [x - y for x, y in zip(df['fina_datetime'], df['end_time'])]
df['latency_in_secs'] = df.latency_in_secs.dt.total_seconds()

 code     srt_date       srt_time      end_time    fina_datetime          latency_in_secs
 123      2019-01-01     23:23:00      00:12:00    2019-01-02 00:13:00    60
 123      2019-01-02     00:13:00      00:14:00    2019-01-02 00:15:00    60
 123      2019-01-02     23:00:00      00:15:00    2019-01-03 00:16:00    60

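The above code has a problem when the end time rolls over to the next date, for example in the first and third rows. How can I handle this in pandas?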

The dataframe I want is the one shown above, with a latency_in_secs of 60 for every row.

1 answer:

Answer 0: (score: 1)

IIUC, you can mask the rows where end_time < srt_time and add one day to the date:

import pandas as pd

# convert to timedelta
df['srt_time'] = pd.to_timedelta(df['srt_time'])
df['end_time'] = pd.to_timedelta(df['end_time'])

# convert to datetime
df['srt_date'] = pd.to_datetime(df['srt_date'])
df['fina_datetime'] = pd.to_datetime(df['fina_datetime'])

# the normal end
end_dates = df['srt_date'] + df['end_time']

# where end_time <= srt_time the end falls on the next day, so add one day
end_dates.loc[df['end_time'].le(df['srt_time'])] += pd.to_timedelta(1, unit='D')

# subtract:
df['latency_in_secs'] = (df['fina_datetime'].sub(end_dates)
                             .dt.total_seconds()
                        )

Output:

   code   srt_date srt_time end_time       fina_datetime  latency_in_secs
0   123 2019-01-01 23:23:00 00:12:00 2019-01-02 00:13:00             60.0
1   123 2019-01-02 00:13:00 00:14:00 2019-01-02 00:15:00             60.0
2   123 2019-01-02 23:00:00 00:15:00 2019-01-03 00:16:00             60.0
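
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample rows from the question as strings and wraps the steps in a helper. The name add_latency is hypothetical and used only for illustration, and the sketch assumes an end time never falls more than one day after the start date.

```python
import pandas as pd

# Sample rows copied from the question, all stored as strings
df = pd.DataFrame({
    "code": [123, 123, 123],
    "srt_date": ["2019-01-01", "2019-01-02", "2019-01-02"],
    "srt_time": ["23:23:00", "00:13:00", "23:00:00"],
    "end_time": ["00:12:00", "00:14:00", "00:15:00"],
    "fina_datetime": [
        "2019-01-02 00:13:00",
        "2019-01-02 00:15:00",
        "2019-01-03 00:16:00",
    ],
})


def add_latency(frame):
    """Hypothetical helper: return a copy with a latency_in_secs column."""
    out = frame.copy()

    # Times of day as offsets from midnight
    srt = pd.to_timedelta(out["srt_time"])
    end = pd.to_timedelta(out["end_time"])

    # Naive end timestamp: start date plus end time of day
    end_dates = pd.to_datetime(out["srt_date"]) + end

    # If the end time of day is not after the start time of day,
    # assume the end rolled over past midnight and shift it by one day
    rolled = end.le(srt)
    end_dates = end_dates.where(~rolled, end_dates + pd.Timedelta(days=1))

    # Latency in seconds between the final timestamp and the corrected end
    final = pd.to_datetime(out["fina_datetime"])
    out["latency_in_secs"] = (final - end_dates).dt.total_seconds()
    return out


result = add_latency(df)
print(result["latency_in_secs"].tolist())  # [60.0, 60.0, 60.0]
```

Using where with the mask instead of an in-place += keeps the original columns untouched, which makes it easier to rerun on the same dataframe.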