Grouping a time series by hour and counting in Python Pandas

Date: 2019-05-27 10:30:19

Tags: python data-analysis data-manipulation

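import pandas as pd

# Drop rows with a missing value; assigning dropna() back to the column realigns on the index and keeps them
df = df.dropna(subset=['data'])
# Strip the wrapping quotes, parentheses, commas and spaces
df['data'] = df['data'].str.strip("'(), ")
df['data'] = pd.to_datetime(df['data'], format='%Y-%m-%d %H:%M:%S.%f')
df = df.dropna(subset=['data'])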
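
The cleaning steps above can be sketched end to end on made-up input. The `raw` frame and its string values are assumptions for illustration, not the asker's actual data, and `utc=True` with `errors="coerce"` is swapped in for the explicit `format` so that the `+00:00` offset and the missing entry are handled without raising:

```python
import pandas as pd

# Hypothetical raw values: timestamps wrapped in tuple-like text, plus one missing entry.
raw = pd.DataFrame({
    "data": [
        "('2019-05-26 00:00:00.326000+00:00',)",
        "('2019-05-26 00:00:00.690000+00:00',)",
        None,
    ]
})

# Strip the wrapping characters from both ends of each string.
cleaned = raw["data"].str.strip("'(), ")

# Parse to timezone-aware datetimes; missing or unparseable values become NaT.
raw["data"] = pd.to_datetime(cleaned, utc=True, errors="coerce")

# Drop rows on the frame itself so the missing entries are actually removed.
raw = raw.dropna(subset=["data"]).reset_index(drop=True)
print(raw["data"].dt.hour.tolist())
```

Dropping on the frame rather than on the column is the important part: a column-level `dropna()` assigned back to the same column is realigned to the original index, so the missing rows reappear.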

This is the head of my dataset:

0   2019-05-26 00:00:00.326000+00:00
1   2019-05-26 00:00:00.690000+00:00
2   2019-05-26 00:00:02.850000+00:00
3   2019-05-26 00:00:02.971000+00:00
4   2019-05-26 00:00:03.432000+00:00
Name: data, dtype: datetime64[ns, UTC]

I need a count of rows per hour. Desired output:

Time interval: total

0-1: 5 times
1-2: 10 times
...
23-24: 4 times


df['data'].head()

1 answer:

Answer 0 (score: 1)

Use pandas.Series.dt.hour

Given df:

                     data
0 2019-05-26 01:00:00.326
1 2019-05-26 02:00:00.690
2 2019-05-26 02:00:02.850
3 2019-05-26 03:00:02.971
4 2019-05-26 05:00:03.432

Use df['data'].dt.hour together with pd.DataFrame.groupby and count:

import pandas as pd

df.groupby(df['data'].dt.hour).count()

Output:

      data
data      
1        1
2        2
3        1
5        1
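
The groupby result lists only the hours that actually occur, while the desired output in the question covers every interval from 0-1 to 23-24. One possible way to close that gap, shown here as a sketch rather than as part of the original answer, is to count with `value_counts` and then `reindex` over all 24 hours. The sample frame repeats the answer's example values, and the "h-(h+1)" labels are an assumed reading of the requested format:

```python
import pandas as pd

# Sample frame mirroring the example data in the answer.
df = pd.DataFrame({
    "data": pd.to_datetime([
        "2019-05-26 01:00:00.326",
        "2019-05-26 02:00:00.690",
        "2019-05-26 02:00:02.850",
        "2019-05-26 03:00:02.971",
        "2019-05-26 05:00:03.432",
    ])
})

# Count rows per hour of day, then reindex so hours with no rows show 0.
counts = (
    df["data"].dt.hour
    .value_counts()
    .reindex(range(24), fill_value=0)
)

# Label each hour h as the interval "h-(h+1)".
counts.index = [f"{h}-{h + 1}" for h in counts.index]
print(counts)
```

Note that this counts by hour of day across all dates. If the data spans several days and each day should be counted separately, grouping on both the date and the hour would be needed instead.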