df['data'] = df['data'].dropna()
df['data'] = df['data'].str.strip("'(), ")
df['data'] = pd.to_datetime(df['data'], format='%Y-%m-%d %H:%M:%S.%f')
df['data'] = df['data'].dropna()
This is the head of my dataset:
0 2019-05-26 00:00:00.326000+00:00
1 2019-05-26 00:00:00.690000+00:00
2 2019-05-26 00:00:02.850000+00:00
3 2019-05-26 00:00:02.971000+00:00
4 2019-05-26 00:00:03.432000+00:00
Name: data, dtype: datetime64[ns, UTC]
I need the number of rows per hour. Desired output:
0-1: 5 times, 1-2: 10 times, ..., 23-24: 4 times
(The sample above is the output of df['data'].head().)
Answer 0 (score: 1)
Use pandas.Series.dt.hour.
Given df:
data
0 2019-05-26 01:00:00.326
1 2019-05-26 02:00:00.690
2 2019-05-26 02:00:02.850
3 2019-05-26 03:00:02.971
4 2019-05-26 05:00:03.432
Use df['data'].dt.hour together with pd.DataFrame.groupby.count:
import pandas as pd
df.groupby(df['data'].dt.hour).count()
Output:
      data
data
1        1
2        2
3        1
5        1
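The result above lists only hours that actually occur, while the desired output in the question covers every interval from 0-1 to 23-24. A minimal sketch of one way to get there, assuming the same five sample rows as above (made UTC-aware like the question's column) and a hypothetical variable name hourly_counts: reindex the counts over all 24 hours so that empty hours show 0, then relabel each hour as an interval.

```python
import pandas as pd

# Sample timestamps copied from the answer's example frame, made UTC-aware
# to match the dtype shown in the question (datetime64[ns, UTC]).
df = pd.DataFrame({
    "data": pd.to_datetime([
        "2019-05-26 01:00:00.326",
        "2019-05-26 02:00:00.690",
        "2019-05-26 02:00:02.850",
        "2019-05-26 03:00:02.971",
        "2019-05-26 05:00:03.432",
    ], utc=True)
})

# Count rows per hour of day, then reindex over 0..23 so that hours
# with no rows appear with a count of 0 instead of being omitted.
hourly_counts = (
    df.groupby(df["data"].dt.hour)["data"]
      .count()
      .reindex(range(24), fill_value=0)
)

# Relabel each hour h as the interval "h-(h+1)", as in the desired output.
hourly_counts.index = [f"{h}-{h + 1}" for h in hourly_counts.index]

# Show only the non-empty intervals to keep the printout short.
print(hourly_counts[hourly_counts > 0])
```

Note that dt.hour reports the hour in the column's own timezone (UTC here). If the counts should follow local time, converting first with dt.tz_convert would be needed.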