pandas: calculating the time difference between df columns

Time: 2015-10-21 14:00:33

Tags: pandas dataframe difference timedelta

I have two df columns containing string values:

df['starttime']                           df['endtime']

0            2015-10-06 18:35:33            0            2015-10-06 18:35:58
1     2015-10-08 17:51:21.999000            1            2015-10-08 17:52:10
2     2015-10-08 20:51:55.999000            2            2015-10-08 20:52:21
3     2015-10-05 15:16:49.999000            3            2015-10-05 15:17:00
4     2015-10-05 15:16:53.999000            4            2015-10-05 15:17:22
5     2015-10-05 15:17:11.999000            5     2015-10-05 15:17:23.999000

I want to calculate the difference between these two columns.

Here is what I tried, and the error it raised:

(df['starttime']-df['endtime']).astype('timedelta64[h]')

unsupported operand type(s) for -: 'str' and 'str'

I thought astype would convert the str values to timedelta?

1 Answer:

Answer 0: (score: 3)

Convert the date strings to pandas.Timestamps:

df['starttime'] = pd.to_datetime(df['starttime'])
df['endtime'] = pd.to_datetime(df['endtime'])
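To confirm that the conversion worked, it can help to inspect the column dtypes before and after. The following is a minimal sketch, assuming a two-row sample built from rows 1 and 2 of the question's data. Those rows were chosen so that each column uses a single string format; recent pandas releases (2.0 and later) may reject a column that mixes values with and without fractional seconds unless a `format` argument such as `"ISO8601"` or `"mixed"` is passed to `pd.to_datetime`.

```python
import pandas as pd

# Sample rows taken from the question (rows 1 and 2), stored as plain strings.
df = pd.DataFrame({
    "starttime": ["2015-10-08 17:51:21.999000", "2015-10-08 20:51:55.999000"],
    "endtime": ["2015-10-08 17:52:10", "2015-10-08 20:52:21"],
})

# Before conversion the columns hold strings, not timestamps.
print(df.dtypes)

df["starttime"] = pd.to_datetime(df["starttime"])
df["endtime"] = pd.to_datetime(df["endtime"])

# After conversion both columns should report a datetime64 dtype.
print(df.dtypes)
```

Once both columns are datetime64, the `-` operator is defined between them.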

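Then take the difference:

df['starttime']-df['endtime']

The error

unsupported operand type(s) for -: 'str' and 'str'

occurs when you try to subtract two Series that contain strings,

df['starttime']-df['endtime']

without first converting the strings to Timestamps.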
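Putting the two steps together, here is a small end-to-end sketch using the same assumed two-row sample (rows 1 and 2 of the question's data). Two details are choices made for this illustration rather than part of the answer. First, it subtracts `starttime` from `endtime` so the durations come out positive; the order shown in the answer gives the same magnitudes with a negative sign. Second, it uses `.dt.total_seconds()` to get a numeric value instead of `astype('timedelta64[h]')`, since newer pandas versions may not accept an hour unit in `astype`.

```python
import pandas as pd

# Sample rows taken from the question (rows 1 and 2), stored as plain strings.
df = pd.DataFrame({
    "starttime": ["2015-10-08 17:51:21.999000", "2015-10-08 20:51:55.999000"],
    "endtime": ["2015-10-08 17:52:10", "2015-10-08 20:52:21"],
})

# Strings must become timestamps before they can be subtracted.
df["starttime"] = pd.to_datetime(df["starttime"])
df["endtime"] = pd.to_datetime(df["endtime"])

# endtime - starttime yields a timedelta64 Series with positive durations.
elapsed = df["endtime"] - df["starttime"]

# Express the durations as plain floats, in seconds and in hours.
seconds = elapsed.dt.total_seconds()
hours = seconds / 3600

print(elapsed)
print(seconds)
print(hours)
```

For this sample the elapsed times are 48.001 and 25.001 seconds.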