Question

I have a DataFrame with two columns of datetime.time items.

Something like this:

   col1                 col2
02:10:00.008209    02:08:38.053145
02:10:00.567054    02:08:38.053145
02:10:00.609842    02:08:38.053145
02:10:00.728153    02:08:38.053145
02:10:02.394408    02:08:38.053145

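How can I generate col3, the difference between col1 and col2 (preferably in microseconds)?

I've searched around but couldn't find a solution. Does anyone know?

Thanks!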

Answer 1

Don't use datetime.time; use timedelta instead:

import pandas as pd
import io
data = """col1                 col2
02:10:00.008209    02:08:38.053145
02:10:00.567054    02:08:38.053145
02:10:00.609842    02:08:38.053145
02:10:00.728153    02:08:38.053145
02:10:02.394408    02:08:38.053145"""
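# data is a str, so wrap it in io.StringIO (io.BytesIO only accepts bytes on Python 3)
df = pd.read_table(io.StringIO(data), sep=r"\s+")
# parse each column of "HH:MM:SS.ffffff" strings into timedeltas
df2 = df.apply(pd.to_timedelta)
diff = df2.col1 - df2.col2

# difference as float seconds
diff.dt.total_seconds()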

The output is the difference in seconds:

0    81.955064
1    82.513909
2    82.556697
3    82.675008
4    84.341263
dtype: float64

To convert a DataFrame of time objects into a DataFrame of timedeltas:

from datetime import time
# on pandas >= 2.1, DataFrame.map replaces the deprecated applymap
df.applymap(time.isoformat).apply(pd.to_timedelta)
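
A self-contained sketch of that conversion, assuming the columns really hold datetime.time objects. The sample frame is built by hand from the first two rows of the question, and `to_td` is a hypothetical helper name, not part of pandas. Each column goes through its ISO string to a timedelta, and the difference is then expressed in whole microseconds by floor division with a one-microsecond Timedelta.

```python
from datetime import time

import pandas as pd

# Sample frame holding datetime.time objects (first two rows of the question).
df = pd.DataFrame({
    "col1": [time(2, 10, 0, 8209), time(2, 10, 0, 567054)],
    "col2": [time(2, 8, 38, 53145), time(2, 8, 38, 53145)],
})


def to_td(column):
    # Hypothetical helper: time -> "HH:MM:SS.ffffff" string -> Timedelta.
    return pd.to_timedelta(column.map(time.isoformat))


td = df.apply(to_td)
# Floor division by one microsecond yields integer microseconds.
df["col3"] = (td["col1"] - td["col2"]) // pd.Timedelta(microseconds=1)
print(df["col3"].tolist())
```

Floor division keeps the result as integers, which sidesteps any float rounding that dividing a nanosecond count by 1e3 could introduce.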

Answer 2

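Are you sure you need a DataFrame of datetime.time objects? Hardly any operation works conveniently on these objects, especially when they are wrapped in a DataFrame.

It may be better to have each column store an int representing the total number of microseconds.

You can convert df into a DataFrame that stores microseconds like this: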

In [71]: df2 = df.applymap(lambda x: ((x.hour*60+x.minute)*60+x.second)*10**6+x.microsecond)

In [72]: df2
Out[72]: 
         col1        col2
0  7800008209  7718053145
1  7800567054  7718053145

From there, it's easy to get the result you want:

In [73]: df2['col1']-df2['col2']
Out[73]: 
0    81955064
1    82513909
dtype: int64
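
The same idea as a runnable sketch, assuming a small hand-built frame with the first two rows of the question. The name `time_to_us` is a hypothetical helper introduced here for readability, and the column-wise `Series.map` is a swapped-in alternative to `applymap`, not what the answer itself uses.

```python
from datetime import time

import pandas as pd


def time_to_us(t):
    # Microseconds elapsed since midnight for a datetime.time value.
    return ((t.hour * 60 + t.minute) * 60 + t.second) * 10**6 + t.microsecond


df = pd.DataFrame({
    "col1": [time(2, 10, 0, 8209), time(2, 10, 0, 567054)],
    "col2": [time(2, 8, 38, 53145), time(2, 8, 38, 53145)],
})

# Apply the helper element by element, one column at a time.
df2 = df.apply(lambda col: col.map(time_to_us))
df2["col3"] = df2["col1"] - df2["col2"]
print(df2)
```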

Answer 3

pandas converts datetime objects to np.datetime64 objects, and their differences are np.timedelta64 objects.

Consider this:

In [30]: df
Out[30]: 
                       0                          1
0 2014-02-28 13:30:19.926778 2014-02-28 13:30:47.178474
1 2014-02-28 13:30:29.814575 2014-02-28 13:30:51.183349

I can take the column-wise difference with

 df[0] - df[1]


 Out[31]: 
 0   -00:00:27.251696
 1   -00:00:21.368774
 dtype: timedelta64[ns]

So I can apply timedelta64 conversions. To microseconds:

(df[0] - df[1]).apply(lambda x : x.astype('timedelta64[us]')) #no actual difference when displayed

or to microseconds as integers:

(df[0] - df[1]).apply(lambda x : x.astype('timedelta64[us]').astype('int'))

 0   -27251696000
 1   -21368774000
 dtype: int64

Edit: as @Jeff suggested, the last expressions can be shortened to

(df[0] - df[1]).astype('timedelta64[us]')

and

(df[0] - df[1]).astype('timedelta64[us]').astype('int')

for pandas >= 0.13.
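
The result of `astype('timedelta64[us]')` on a Series may differ between pandas versions, and the integers printed earlier in this answer are 1000 times the microsecond count (a difference of -27.251696 s is -27251696 µs), so they look like nanoseconds. A minimal sketch that does not depend on `astype`, assuming the two rows shown above rebuilt by hand, is to floor-divide by a one-microsecond Timedelta:

```python
import pandas as pd

# The two rows shown in this answer, rebuilt as datetime64 columns.
df = pd.DataFrame({
    0: pd.to_datetime(["2014-02-28 13:30:19.926778", "2014-02-28 13:30:29.814575"]),
    1: pd.to_datetime(["2014-02-28 13:30:47.178474", "2014-02-28 13:30:51.183349"]),
})

diff = df[0] - df[1]                            # timedelta64 Series
micros = diff // pd.Timedelta(microseconds=1)   # integer microseconds
seconds = diff.dt.total_seconds()               # float seconds
print(micros.tolist())
```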

python pandas datetime.time - datetime.time
