Keep only hh:mm:ss from a timedelta

Time: 2015-12-30 12:18:21

Tags: python pandas timedelta

I have a column of timedeltas with the attributes listed here. I would like the output in my pandas table to go from:

1 day, 13:54:03.0456

to:

13:54:03

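How can I remove the days from this output?

2 answers:

Answer 0 (score: 1)

You can use dt.seconds to get the number of seconds within the day, and then pass that to pd.Timedelta: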

import pandas as pd  # needed for pd.Timedelta below
from pandas import Series, date_range
from datetime import timedelta
td = Series(date_range('20130101',periods=4)) - Series(date_range('20121201',periods=4))
td[2] += timedelta(minutes=5,seconds=3)

In [321]: td
Out[321]: 
0   31 days 00:00:00
1   31 days 00:00:00
2   31 days 00:05:03
3   31 days 00:00:00
dtype: timedelta64[ns]

In [322]: td.dt.seconds.apply(lambda x: pd.Timedelta(seconds=x))
Out[322]: 
0   00:00:00
1   00:00:00
2   00:05:03
3   00:00:00
dtype: timedelta64[ns]
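
As a minimal sketch of the same idea without the per-element lambda, the integer seconds from dt.seconds can be handed to pd.to_timedelta in a single call. The sample values below are made up for illustration and are not from the question's data. Note that dt.seconds only holds whole seconds within the day, so the fractional part is dropped as well.

```python
import pandas as pd

# Illustrative sample data (an assumption, not the asker's actual column)
td = pd.Series(pd.to_timedelta(["1 days 13:54:03.0456", "31 days 00:05:03"]))

# dt.seconds is the whole-second part within the day (0..86399);
# the days and the sub-second components are not included
within_day = pd.to_timedelta(td.dt.seconds, unit="s")
print(within_day)
```

The result is still a timedelta64 Series, so it can be used in further arithmetic.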

Answer 1 (score: 1)

You can subtract the whole days from each Timedelta:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 10), unit='s')})
df.iloc[::3, 0] = pd.NaT
df['B'] = df['A'] - df['A'].values.astype('timedelta64[D]')
# truncate fractional seconds
df['truncated'] = df['B'].values.astype('timedelta64[s]')
# round to nearest second
df['rounded'] = np.asarray(np.round(df['B'].values / np.timedelta64(1, 's')), dtype='timedelta64[s]')
print(df)

yields

                        A               B  truncated  rounded
0                     NaT             NaT        NaT      NaT
1  1 days 06:51:51.111111 06:51:51.111111   06:51:51 06:51:51
2  2 days 13:43:42.222222 13:43:42.222222   13:43:42 13:43:42
3                     NaT             NaT        NaT      NaT
4  5 days 03:27:24.444444 03:27:24.444444   03:27:24 03:27:24
5  6 days 10:19:15.555556 10:19:15.555556   10:19:15 10:19:16
6                     NaT             NaT        NaT      NaT
7  9 days 00:02:57.777778 00:02:57.777778   00:02:57 00:02:58
8 10 days 06:54:48.888889 06:54:48.888889   06:54:48 06:54:49
9                     NaT             NaT        NaT      NaT

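Column A shows the original Timedeltas. Column B shows the result after subtracting the whole days. The truncated and rounded columns show the result after dropping or rounding the fractional seconds.

Calling astype('timedelta64[D]') truncates NumPy timedelta64s to whole days. Similarly, calling astype('timedelta64[s]') truncates NumPy timedelta64s to whole seconds. For more on datetime64 / timedelta64 arithmetic, see the NumPy docs.

Another way to subtract the days is to use: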
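
The truncation described above can be seen on a single value in plain NumPy. This is a small sketch using a made-up, positive duration (1 day, 13:54:03.0456 expressed in microseconds); negative durations are not covered here.

```python
import numpy as np

# 1 day + 13:54:03.0456, stored as microseconds (illustrative value)
td = np.array([136443045600], dtype="timedelta64[us]")

# Casting to a coarser unit drops everything below that unit
whole_days = td.astype("timedelta64[D]")

# Subtracting the whole days leaves the within-day remainder (still in microseconds)
remainder = td - whole_days

# Casting the remainder to seconds drops the fractional seconds
whole_seconds = remainder.astype("timedelta64[s]")

print(whole_days, remainder, whole_seconds)
```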

df['B'] = df['A'] - pd.to_timedelta(df['A'].dt.days, unit='d')

but it turns out to be slower:

In [72]: df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 1000), unit='s')})

In [73]: %timeit df['A'] - df['A'].values.astype('timedelta64[D]')
1000 loops, best of 3: 729 µs per loop

In [74]: %timeit df['A'] - pd.to_timedelta(df['A'].dt.days, unit='d')
100 loops, best of 3: 12.6 ms per loop

Another way to round to the nearest second is:

df['rounded'] = pd.to_timedelta(df['B'].dt.total_seconds().round(), unit='s')

but again this is slower:

In [104]: df = pd.DataFrame({'A':pd.to_timedelta(np.linspace(0, 10**6, 1000), unit='s')})

In [105]: df['B'] = df['A'] - df['A'].values.astype('timedelta64[D]')

In [106]: %timeit np.asarray(np.round(df['B'].values / np.timedelta64(1, 's')), dtype='timedelta64[s]')
10000 loops, best of 3: 27.7 µs per loop

In [107]: %timeit pd.to_timedelta(df['B'].dt.total_seconds().round(), unit='s')
100 loops, best of 3: 3.94 ms per loop
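
If what is ultimately needed is a display string rather than a timedelta, one possible approach is to format the hours, minutes and seconds from each Timedelta's components attribute. This is only a sketch: format_hms is a hypothetical helper name, the sample values are made up, and it assumes non-negative durations and that missing values should render as an empty string.

```python
import pandas as pd

def format_hms(td):
    """Hypothetical helper: render the within-day part of a Timedelta as hh:mm:ss."""
    if pd.isna(td):
        return ""  # assumption: show missing values as an empty string
    c = td.components  # named tuple with days, hours, minutes, seconds, ...
    return f"{c.hours:02d}:{c.minutes:02d}:{c.seconds:02d}"

# Illustrative sample data, including one missing value
s = pd.Series(pd.to_timedelta(["1 days 13:54:03.0456", None]))
formatted = pd.Series([format_hms(x) for x in s], index=s.index)
print(formatted)
```

The result is a column of strings, so it is suitable for presentation only and no longer supports timedelta arithmetic.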