Given this DataFrame:
import pandas as pd
df = pd.DataFrame(pd.to_timedelta(['00:00:02','00:00:05','00:00:10','00:00:15','00:00:05']))
# the timestamps are day-first (20/02/2017), so state that explicitly
df.index = pd.to_datetime(['20/02/2017 12:42:10','20/02/2017 12:43:10','20/02/2017 12:45:10','20/02/2017 12:45:10','20/02/2017 12:45:10'], dayfirst=True)
df.columns = ['time']
df
Out[232]:
time
2017-02-20 12:42:10 00:00:02
2017-02-20 12:43:10 00:00:05
2017-02-20 12:45:10 00:00:10
2017-02-20 12:45:10 00:00:15
2017-02-20 12:45:10 00:00:05
I am trying to resample it to get the mean time for each minute. For example, it works when summing the values:
df.resample('min').sum()
Out[245]:
time
2017-02-20 12:42:00 00:00:02
2017-02-20 12:43:00 00:00:05
2017-02-20 12:44:00 00:00:00
2017-02-20 12:45:00 00:00:30
Is there any way to make this work for the mean?
Something like:
df.resample('min').mean()
Answer 0: (score: 1)
You can first convert the timedeltas to total_seconds (floats), then resample, take the mean, and use fillna for the empty bins. Finally, convert back with to_timedelta:
# keep the result under a new name so the original df stays available
out = pd.to_timedelta(df.time.dt.total_seconds().resample('min').mean().fillna(0), unit='s')
print (out)
2017-02-20 12:42:00 00:00:02
2017-02-20 12:43:00 00:00:05
2017-02-20 12:44:00 00:00:00
2017-02-20 12:45:00 00:00:10
Freq: T, Name: time, dtype: timedelta64[ns]
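The fillna step is there because 12:44 has no rows, so its mean is NaN. A minimal sketch of the difference, assuming the same data as in the question (rebuilt here with ISO-format timestamps so the block runs on its own; the names s, with_gap and filled are only illustrative):

```python
import pandas as pd

# Same values as the question's DataFrame, as a Series named "time".
times = pd.to_timedelta(["00:00:02", "00:00:05", "00:00:10", "00:00:15", "00:00:05"])
stamps = pd.to_datetime([
    "2017-02-20 12:42:10", "2017-02-20 12:43:10", "2017-02-20 12:45:10",
    "2017-02-20 12:45:10", "2017-02-20 12:45:10",
])
s = pd.Series(times.to_numpy(), index=stamps, name="time")

# Per-minute mean in float seconds; the empty 12:44 bin comes out as NaN.
mean_secs = s.dt.total_seconds().resample("min").mean()

# Without fillna the empty bin is converted to NaT (missing).
with_gap = pd.to_timedelta(mean_secs, unit="s")

# With fillna(0) the empty bin is reported as a zero-length duration.
filled = pd.to_timedelta(mean_secs.fillna(0), unit="s")

print(with_gap)
print(filled)
```

Whether an empty minute should read as NaT or as 00:00:00 depends on what the result is used for; fillna(0) matches the shape of the sum() output in the question.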
Another option is to convert to nanoseconds:
import numpy as np
print (pd.Series(df.time.values.astype(np.int64), index=df.index))
2017-02-20 12:42:10 2000000000
2017-02-20 12:43:10 5000000000
2017-02-20 12:45:10 10000000000
2017-02-20 12:45:10 15000000000
2017-02-20 12:45:10 5000000000
dtype: int64
# again keep the result under a new name rather than overwriting df
out = pd.to_timedelta(pd.Series(df.time.values.astype(np.int64), index=df.index)
                        .resample('min').mean().fillna(0))
print (out)
2017-02-20 12:42:00 00:00:02
2017-02-20 12:43:00 00:00:05
2017-02-20 12:44:00 00:00:00
2017-02-20 12:45:00 00:00:10
Freq: T, dtype: timedelta64[ns]
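The nanosecond variant can be wrapped in a small function and checked against the total_seconds variant. This is only a sketch: mean_timedelta_per_minute and its fill_empty flag are hypothetical names, not part of pandas, and the data is the question's data rebuilt inline with ISO-format timestamps.

```python
import numpy as np
import pandas as pd


def mean_timedelta_per_minute(s, fill_empty=True):
    """Hypothetical helper: per-minute mean of a timedelta Series
    that is indexed by timestamps."""
    # View the timedeltas as integer nanoseconds so that mean() is defined.
    ns = pd.Series(
        s.to_numpy().astype("timedelta64[ns]").astype(np.int64),
        index=s.index,
        name=s.name,
    )
    out = ns.resample("min").mean()
    if fill_empty:
        # Minutes without any rows would otherwise stay NaN (NaT after conversion).
        out = out.fillna(0)
    return pd.to_timedelta(out, unit="ns")


times = pd.to_timedelta(["00:00:02", "00:00:05", "00:00:10", "00:00:15", "00:00:05"])
stamps = pd.to_datetime([
    "2017-02-20 12:42:10", "2017-02-20 12:43:10", "2017-02-20 12:45:10",
    "2017-02-20 12:45:10", "2017-02-20 12:45:10",
])
s = pd.Series(times.to_numpy(), index=stamps, name="time")

via_ns = mean_timedelta_per_minute(s)
via_seconds = pd.to_timedelta(
    s.dt.total_seconds().resample("min").mean().fillna(0), unit="s"
)

# Both routes should give the same per-minute means for this data.
print((via_ns.to_numpy() == via_seconds.to_numpy()).all())
print(via_ns)
```

Working in integer nanoseconds avoids the float seconds of the first variant on the way in, although the mean itself is still computed as a float.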