pandas timedelta64[ns] calculation

Time: 2019-08-29 10:27:41

Tags: python pandas

I have the following DataFrame:

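I want to compute statistics over different weeks or days of the year (for example, the mean duration over the first 7 days of the year). What I have tried is running: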

>>> import pandas as pd
>>> sampleDate = pd.date_range('2018-01-01', periods=10, freq='D 16H')
>>> duration = pd.TimedeltaIndex(data =['123 days 5 hours', '1 day 6 min',  '2 days', '23 hours 7 min','5 days 17 hours 3 min','18 min', '1 day 17 hours',  '22 day 2 min', '22 hours','15 min'])
>>> df = pd.DataFrame(data={'time': sampleDate, 'duration': duration})
>>> df = df.set_index('time').sort_index()
>>> df
                             duration
time                                 
2018-01-01 00:00:00 123 days 05:00:00
2018-01-02 16:00:00   1 days 00:06:00
2018-01-04 08:00:00   2 days 00:00:00
2018-01-06 00:00:00   0 days 23:07:00
2018-01-07 16:00:00   5 days 17:03:00
2018-01-09 08:00:00   0 days 00:18:00
2018-01-11 00:00:00   1 days 17:00:00
2018-01-12 16:00:00  22 days 00:02:00
2018-01-14 08:00:00   0 days 22:00:00
2018-01-16 00:00:00   0 days 00:15:00

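What would it take to get the correct column?

1 answer:

Answer 0 (score: 2)

You can convert the timedeltas to their native integer format (nanoseconds) with astype, compute the mean, and convert the result back: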

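import numpy as np

# timedelta64[ns] -> int64 nanoseconds, so that rolling() can aggregate it
df['duration'] = df['duration'].astype(np.int64)
# note: df is rebound to a Series of rolling means converted back to timedelta
df = pd.to_timedelta(df.rolling('7d')['duration'].mean())
print(df)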
time
2018-01-01 00:00:00   123 days 05:00:00
2018-01-02 16:00:00    62 days 02:33:00
2018-01-04 08:00:00    42 days 01:42:00
2018-01-06 00:00:00    31 days 19:03:15
2018-01-07 16:00:00    26 days 13:51:12
2018-01-09 08:00:00     1 days 22:30:48
2018-01-11 00:00:00     2 days 01:53:36
2018-01-12 16:00:00     6 days 01:54:00
2018-01-14 08:00:00     6 days 01:40:36
2018-01-16 00:00:00     4 days 22:19:00
Name: duration, dtype: timedelta64[ns]
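
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same round trip. It rebuilds the sample data from the question; the names `frame`, `nanos` and `rolling_mean` are mine, not from the answer, and it keeps the original DataFrame intact instead of overwriting it. It assumes the column should be treated at nanosecond resolution, so it pins that explicitly before the integer cast.

```python
import numpy as np
import pandas as pd

# Rebuild the question's sample data: 10 rows, 1 day 16 hours apart.
times = pd.date_range("2018-01-01", periods=10,
                      freq=pd.Timedelta(days=1, hours=16))
durations = pd.to_timedelta([
    "123 days 05:00:00", "1 days 00:06:00", "2 days 00:00:00",
    "0 days 23:07:00", "5 days 17:03:00", "0 days 00:18:00",
    "1 days 17:00:00", "22 days 00:02:00", "0 days 22:00:00",
    "0 days 00:15:00",
])
frame = pd.DataFrame({"duration": durations}, index=times)

# Pin nanosecond resolution, then cast to int64 so the values are plain
# nanosecond counts that rolling() can aggregate.
nanos = frame["duration"].astype("timedelta64[ns]").astype(np.int64)

# Trailing 7-day window ending at each row. The mean comes back as float64
# nanoseconds, which to_timedelta interprets as nanoseconds by default.
rolling_mean = pd.to_timedelta(nanos.rolling("7D").mean())
print(rolling_mean)
```

Because the mean is computed in floating point, the result can differ from the exact value by a few nanoseconds, which is usually irrelevant for durations of this size.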

Or convert the timedeltas to seconds:

# starting again from the original df (duration still timedelta64[ns])
df['duration'] = df['duration'].dt.total_seconds()
df1 = df.rolling('7d')['duration'].mean()
print(df1)
time
2018-01-01 00:00:00    10645200.0
2018-01-02 16:00:00     5365980.0
2018-01-04 08:00:00     3634920.0
2018-01-06 00:00:00     2746995.0
2018-01-07 16:00:00     2296272.0
2018-01-09 08:00:00      167448.0
2018-01-11 00:00:00      179616.0
2018-01-12 16:00:00      525240.0
2018-01-14 08:00:00      524436.0
2018-01-16 00:00:00      425940.0
Name: duration, dtype: float64
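
The question also mentions "the mean duration over the first 7 days of the year". A rolling window gives a trailing 7-day mean at every row, which is not quite the same thing. If fixed, non-overlapping 7-day blocks are what is wanted, one possible sketch is to combine the seconds conversion above with `resample` and then turn the float seconds back into timedeltas with `pd.to_timedelta(..., unit="s")`. This is a different technique from the answer's rolling window, and the names `frame`, `seconds` and `weekly_mean` are only illustrative.

```python
import pandas as pd

# Rebuild the question's sample data: 10 rows, 1 day 16 hours apart.
times = pd.date_range("2018-01-01", periods=10,
                      freq=pd.Timedelta(days=1, hours=16))
durations = pd.to_timedelta([
    "123 days 05:00:00", "1 days 00:06:00", "2 days 00:00:00",
    "0 days 23:07:00", "5 days 17:03:00", "0 days 00:18:00",
    "1 days 17:00:00", "22 days 00:02:00", "0 days 22:00:00",
    "0 days 00:15:00",
])
frame = pd.DataFrame({"duration": durations}, index=times)

# Durations as float seconds, which every aggregation can handle.
seconds = frame["duration"].dt.total_seconds()

# Non-overlapping 7-day bins; with the default origin the first bin starts
# at midnight of the first timestamp's day (2018-01-01 here).
weekly_mean = pd.to_timedelta(seconds.resample("7D").mean(), unit="s")
print(weekly_mean)
```

The first row of `weekly_mean` then covers 2018-01-01 up to, but not including, 2018-01-08. Other aggregations such as `median()` or `max()` can be swapped in for `mean()` in the same way.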