I have a DataFrame df with two columns of datetime.time objects, as follows:
TimeA TimeB
00:50:13 00:50:00
00:51:46 00:50:00
00:52:58 00:50:00
00:54:05 00:51:00
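I want to create a third column holding the difference between these two columns. The elements of both columns are datetime.time objects. First, I tried the following on a single value:
from datetime import datetime, date, time
TimeA = datetime.combine(datetime.min, df.iloc[0,0]) - datetime.min
TimeB = datetime.combine(datetime.min, df.iloc[0,1]) - datetime.min
diff = TimeA - TimeB
It gives the following result: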
datetime.timedelta(0, 13)
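However, when I try to convert the whole column:
df['TimeA_2'] = df['TimeA'].apply(lambda x : datetime.combine(date.min, x) - datetime.min)
the following error occurs:
combine() argument 2 must be datetime.time, not float
This makes no sense to me, because when I check the type of the elements in both columns, they are datetime.time. I can't see where the error comes from. Any help is much appreciated.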
Answer 0 (score: 1)
You can convert the columns to datetime objects, then take the difference and convert it to minutes:
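import numpy as np
import pandas as pd
# pd.to_datetime does not accept datetime.time objects directly, so cast them to strings first
df[['TimeA', 'TimeB']] = df[['TimeA', 'TimeB']].astype(str).apply(pd.to_datetime)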
# TimeA TimeB
# 0 2018-03-05 00:50:13 2018-03-05 00:50:00
# 1 2018-03-05 00:51:46 2018-03-05 00:50:00
# 2 2018-03-05 00:52:58 2018-03-05 00:50:00
# 3 2018-03-05 00:54:05 2018-03-05 00:51:00
df['Diff'] = (df['TimeA'] - df['TimeB']) / np.timedelta64(1, 'm')
# TimeA TimeB Diff
# 0 2018-03-05 00:50:13 2018-03-05 00:50:00 0.216667
# 1 2018-03-05 00:51:46 2018-03-05 00:50:00 1.766667
# 2 2018-03-05 00:52:58 2018-03-05 00:50:00 2.966667
# 3 2018-03-05 00:54:05 2018-03-05 00:51:00 3.083333
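For anyone who wants to try this approach end to end, here is a minimal self-contained sketch. It assumes the columns really hold datetime.time objects without microseconds (so that str() yields 'HH:MM:SS'); the sample frame and the names as_dt and diff_minutes are made up for illustration and are not from the original post.

```python
from datetime import time

import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's data as datetime.time objects.
df = pd.DataFrame({
    "TimeA": [time(0, 50, 13), time(0, 51, 46), time(0, 52, 58), time(0, 54, 5)],
    "TimeB": [time(0, 50, 0), time(0, 50, 0), time(0, 50, 0), time(0, 51, 0)],
})

# str(time) gives 'HH:MM:SS' (assuming no microseconds), which is then
# parsed with an explicit format so no format guessing is involved.
as_dt = df[["TimeA", "TimeB"]].astype(str).apply(pd.to_datetime, format="%H:%M:%S")

# Subtracting two datetime columns gives timedeltas; dividing by one minute
# turns them into a float number of minutes.
diff_minutes = (as_dt["TimeA"] - as_dt["TimeB"]) / np.timedelta64(1, "m")
print(diff_minutes)
```

The date part attached by the parser is the same for both columns, so it cancels out in the subtraction.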
Answer 1 (score: 1)
IIUC, use pd.to_timedelta:
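# pd.to_timedelta parses 'HH:MM:SS' strings, so cast datetime.time objects to str first
df[['TimeA','TimeB']] = df[['TimeA','TimeB']].astype(str).apply(pd.to_timedelta)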
df['Diff'] = (df['TimeA'] - df['TimeB'])
Output:
TimeA TimeB Diff
0 00:50:13 00:50:00 00:00:13
1 00:51:46 00:50:00 00:01:46
2 00:52:58 00:50:00 00:02:58
3 00:54:05 00:51:00 00:03:05
Or, to get the difference in minutes:
df['Diff'] = (df['TimeA'] - df['TimeB']).dt.total_seconds() / 60
Output:
TimeA TimeB Diff
0 00:50:13 00:50:00 0.216667
1 00:51:46 00:50:00 1.766667
2 00:52:58 00:50:00 2.966667
3 00:54:05 00:51:00 3.083333
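As for the error in the question: the float that combine() complains about is most likely a missing value, since NaN is a float and an object column can hold it alongside datetime.time values. That is a guess, not something the post confirms. The sketch below uses a made-up frame with one missing entry to show how such rows could be located and how the difference could be computed while letting missing values become NaT; the helper to_timedelta_or_nat is hypothetical, not a pandas function.

```python
from datetime import time

import numpy as np
import pandas as pd

# Hypothetical data: the last TimeA entry is missing (NaN is a float).
df = pd.DataFrame({
    "TimeA": [time(0, 50, 13), time(0, 51, 46), np.nan],
    "TimeB": [time(0, 50, 0), time(0, 50, 0), time(0, 50, 0)],
})

# Flag the entries that are not datetime.time objects.
not_time = df["TimeA"].map(lambda x: not isinstance(x, time))
print(df[not_time])


def to_timedelta_or_nat(value):
    """Hypothetical helper: Timedelta for a datetime.time, NaT for anything else."""
    if isinstance(value, time):
        return pd.Timedelta(hours=value.hour, minutes=value.minute,
                            seconds=value.second, microseconds=value.microsecond)
    return pd.NaT


# pd.to_timedelta normalises the mapped values to a timedelta64 column.
time_a = pd.to_timedelta(df["TimeA"].map(to_timedelta_or_nat))
time_b = pd.to_timedelta(df["TimeB"].map(to_timedelta_or_nat))
diff = time_a - time_b
print(diff)
```

Rows with a missing time end up as NaT in the result instead of raising, so they can be dropped or filled afterwards as needed.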