I have a Python pandas DataFrame containing two columns, time1 and time2:
time1 time2
13:00:07.294234 13:00:07.294234
14:00:07.294234 14:00:07.394234
15:00:07.294234 15:00:07.494234
16:00:07.294234 16:00:07.694234
How can I generate a third column containing the difference between time1 and time2 in microseconds, as an integer if possible?
Answer 0 (score: 3)
If you prepend an actual date to these, you can convert them to datetime64 columns:
In [11]: '2014-03-19 ' + df
Out[11]:
time1 time2
0 2014-03-19 13:00:07.294234 2014-03-19 13:00:07.294234
1 2014-03-19 14:00:07.294234 2014-03-19 14:00:07.394234
2 2014-03-19 15:00:07.294234 2014-03-19 15:00:07.494234
3 2014-03-19 16:00:07.294234 2014-03-19 16:00:07.694234
[4 rows x 2 columns]
In [12]: df = ('2014-03-19 ' + df).astype('datetime64[ns]')
Out[12]:
time1 time2
0 2014-03-19 20:00:07.294234 2014-03-19 20:00:07.294234
1 2014-03-19 21:00:07.294234 2014-03-19 21:00:07.394234
2 2014-03-19 22:00:07.294234 2014-03-19 22:00:07.494234
3 2014-03-19 23:00:07.294234 2014-03-19 23:00:07.694234
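Now you can subtract the columns. (The hours in Out[12] are shifted relative to the input, but by the same amount in both columns, so the difference is unaffected.)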
In [13]: delta = df['time2'] - df['time1']
In [14]: delta
Out[14]:
0 00:00:00
1 00:00:00.100000
2 00:00:00.200000
3 00:00:00.400000
dtype: timedelta64[ns]
To get the number of microseconds, just divide the underlying nanoseconds by 1000:
In [15]: delta.astype(np.int64) // 10**3
Out[15]:
0 0
1 100000
2 200000
3 400000
dtype: int64
As Jeff points out, in recent versions of numpy you can divide by 1 microsecond:
In [16]: delta / np.timedelta64(1, 'us')
Out[16]:
0 0
1 100000
2 200000
3 400000
dtype: float64
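For reference, the steps in this answer can be put together as one self-contained sketch. It assumes both columns hold strings in the format shown in the question; the date 2014-03-19 is arbitrary, and the column name delta_us is made up for illustration. It uses pd.to_datetime rather than astype, and floor division by a one-microsecond Timedelta so that the result is an integer column rather than float64.

```python
import pandas as pd

# Sample data copied from the question (time-of-day strings).
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Prepend an arbitrary date (assumption: both times fall on the same day)
# and parse each column into datetime64 values.
stamped = df.apply(lambda col: pd.to_datetime("2014-03-19 " + col))

# Subtracting two datetime64 columns yields a timedelta64 column.
delta = stamped["time2"] - stamped["time1"]

# Floor-divide by one microsecond to get whole microseconds as integers.
# "delta_us" is a hypothetical column name.
df["delta_us"] = delta // pd.Timedelta(microseconds=1)

print(df)
```

Floor division avoids the float64 result of true division, which matters if the column is later used as a key or compared for exact equality.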
Answer 1 (score: 0)
The simplest way is to just do this:
(pd.to_datetime(df['time2']) - pd.to_datetime(df['time1'])) / np.timedelta64(1, 'us')
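A slightly more defensive variant of the same idea is sketched below. Passing an explicit format string is an addition not in the original answer: it avoids relying on format inference for time-only strings. The format "%H:%M:%S.%f" is an assumption based on the sample data, and delta_us is a hypothetical column name. The division returns float64, so the sketch rounds and casts to get integers.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Assumed format of the time strings; both columns get the same default date.
fmt = "%H:%M:%S.%f"
t1 = pd.to_datetime(df["time1"], format=fmt)
t2 = pd.to_datetime(df["time2"], format=fmt)

# Dividing a timedelta column by one microsecond gives float64 microseconds.
micros = (t2 - t1) / np.timedelta64(1, "us")

# Round before casting so float noise cannot truncate a value downward.
df["delta_us"] = micros.round().astype("int64")

print(df["delta_us"])
```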
Answer 2 (score: -1)
Using dateutil, you can convert the timestamp columns into 'real' timestamps:
df.time1 = df.time1.apply(dateutil.parser.parse)
df.time2 = df.time2.apply(dateutil.parser.parse)
After that, you can define a new column like this:
df['delta'] = df.time2 - df.time1
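The delta column above is a timedelta, not yet a microsecond count. A minimal sketch of the full path follows, under a few assumptions: the data matches the question, the fixed default date passed to the parser is arbitrary (without it, dateutil fills in the current date at parse time), and the names default_day, parse_time and delta_us are made up for illustration.

```python
from datetime import datetime

import dateutil.parser
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Arbitrary fixed date so both columns are parsed onto the same day.
default_day = datetime(2014, 3, 19)

def parse_time(text):
    """Parse a time-of-day string, filling missing date fields from default_day."""
    return dateutil.parser.parse(text, default=default_day)

# pd.to_datetime makes sure the parsed values end up as datetime64 columns.
t1 = pd.to_datetime(df["time1"].apply(parse_time))
t2 = pd.to_datetime(df["time2"].apply(parse_time))

df["delta"] = t2 - t1

# Whole microseconds as integers, via floor division by one microsecond.
df["delta_us"] = df["delta"] // pd.Timedelta(microseconds=1)

print(df[["delta", "delta_us"]])
```

Parsing row by row with apply is slower than the vectorized pd.to_datetime call in the other answers, so this route mainly makes sense when the strings are in mixed formats.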