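I have columns in a DataFrame that hold a start time and an end time in 24-hour format. After converting them with to_datetime, I want the difference between the start time and the end time. However, if the start time is 23:00 and the end time is 00:00, the result is -1 days, so I want 00:00 to be treated as 24:00 instead of producing a negative delta.
I have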
s1 = pd.Series(['02/18/2016', '23:00:00', '00:00:00'])
df = pd.DataFrame([list(s1)], columns = ["Date", "Start_Time", "End_Time"])
>>> df
      Date Start_Time  End_Time
02/18/2016   23:00:00  00:00:00
Required output
      Date Start_Time  End_Time      Diff
02/18/2016   23:00:00  00:00:00  01:00:00
Answer 0 (score: 1)
Find all rows where End_Time equals '00:00:00' and convert them to a Timedelta of +1 days:
df['Diff'] = pd.to_timedelta((df['End_Time'] == '00:00:00').astype(int), unit='d')
# 0 1 days
# Name: End_Time, dtype: timedelta64[ns]
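To see what this mask does on more than one row, here is a small sketch on a hypothetical three-row Series. The names `end_times` and `offset` and the sample values are illustrative and not from the question.

```python
import pandas as pd

# Hypothetical sample data: only the middle entry is exactly midnight.
end_times = pd.Series(['22:30:00', '00:00:00', '08:15:00'])

# Boolean mask -> 0/1 integers -> Timedelta measured in days.
offset = pd.to_timedelta((end_times == '00:00:00').astype(int), unit='D')
print(offset)
```

Only the row whose end time is midnight gets a one-day offset; every other row gets a zero Timedelta, so adding the offset later leaves those rows unchanged.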
Then convert Start_Time and End_Time to datetimes:
for col in ['Start_Time', 'End_Time']:
    df[col] = pd.to_datetime(df['Date'] + ' ' + df[col])
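Since the date strings are in month/day/year order, it may be worth passing an explicit `format` so that pandas does not have to infer the layout. A minimal sketch, assuming every row follows the `MM/DD/YYYY HH:MM:SS` layout (the variable names `date`, `start` and `stamp` are illustrative):

```python
import pandas as pd

# One-row stand-ins for the Date and Start_Time columns from the question.
date = pd.Series(['02/18/2016'])
start = pd.Series(['23:00:00'])

# Join date and time into one string, then parse with an explicit format.
stamp = pd.to_datetime(date + ' ' + start, format='%m/%d/%Y %H:%M:%S')
print(stamp)
```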
If End_Time equals '00:00:00', add 1 day to it using Diff:
df['End_Time'] += df['Diff']
Now you can compute Diff as usual:
df['Diff'] = df['End_Time'] - df['Start_Time']
Putting it together:
import numpy as np
import pandas as pd
df = pd.DataFrame([['02/18/2016', '23:00:00', '00:00:00']],
                  columns = ["Date", "Start_Time", "End_Time"])
df['Diff'] = pd.to_timedelta((df['End_Time'] == '00:00:00').astype(int), unit='d')
for col in ['Start_Time', 'End_Time']:
    df[col] = pd.to_datetime(df['Date'] + ' ' + df[col])
df['End_Time'] += df['Diff']
df['Diff'] = df['End_Time'] - df['Start_Time']
print(df)
yields
         Date          Start_Time   End_Time     Diff
0  02/18/2016 2016-02-18 23:00:00 2016-02-19 01:00:00
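The approach above only corrects an End_Time of exactly midnight. If other end times, such as 01:30:00, can also fall on the next day, one possible alternative (not part of the answer above) is to add 24 hours whenever the raw difference is negative. This is only a sketch: it assumes every span is shorter than 24 hours and that an end time earlier than the start time always means the next day. `df2` and its extra rows are hypothetical.

```python
import pandas as pd

# Hypothetical data: two overnight spans and one same-day span.
df2 = pd.DataFrame(
    [['02/18/2016', '23:00:00', '00:00:00'],
     ['02/18/2016', '22:00:00', '01:30:00'],
     ['02/18/2016', '09:00:00', '17:00:00']],
    columns=['Date', 'Start_Time', 'End_Time'])

# Treat the HH:MM:SS strings as offsets from midnight.
start = pd.to_timedelta(df2['Start_Time'])
end = pd.to_timedelta(df2['End_Time'])
diff = end - start

# Assumption: a negative difference means End_Time is on the next day,
# so shift those rows forward by 24 hours and keep the rest as they are.
df2['Diff'] = diff.where(diff >= pd.Timedelta(0), diff + pd.Timedelta(days=1))
print(df2)
```

This variant leaves Start_Time and End_Time as strings and only adds the Diff column, which may or may not be what you want depending on later processing.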