Pandas: number of occurrences in custom time ranges and most active time for specific days

Time: 2018-06-04 22:55:47

Tags: python pandas time

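I'm new to Stack Overflow and pandas and couldn't find anything like this. It seems more complicated than the typical operations, and I'd love to learn how to do it: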

The data set:

Day       Messages   Time 
Friday    spam       8:05 AM
Tuesday   eggs       8:45 AM
Friday    smapeggs   9:03 AM
Monday    eggseggs   1:05 PM
Tuesday   baz        8:33 AM
Monday    eggsspam   2:25 PM

Expected output 1:

Time ranges  Number of Messages
8:00-9:00 AM       3
9:00-10:00 AM      1
1:00-2:30 PM       2

Expected output 2:

Day         Most Active Time
Friday         8:00-10:00 AM 
Tuesday        8:00-9:00 AM 
Monday         1:00-2:30 PM

The idea is to see which hours are the most active in general, and which hours are the most active on each particular day. Thanks in advance!

1 answer:

Answer 0: (score: 0)

Try the following:

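import pandas as pd

df_start = pd.DataFrame({'Day': ['Friday', 'Tuesday', 'Friday', 'Friday'],
                         'Message': ['spam', 'msg', 'eggs', 'another'],
                         'Time': ['8:05 AM', '9:45 AM', '10:34 AM', '8:45 AM']})
# Use the parsed times as a DatetimeIndex so the frame can be resampled
df_start = df_start.set_index('Time')
df_start.index = pd.to_datetime(df_start.index, format='%I:%M %p')
# Count the messages that fall into each consecutive 2-hour bucket
df_result = df_start.resample("2h").count().loc[:, 'Message']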
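Note that resample("2h") yields evenly spaced 2-hour buckets, not the uneven ranges shown in expected output 1. One possible way to get custom ranges is pd.cut on minutes since midnight. This is only a sketch: the bin edges, the labels, and the "Time range" column name are my own assumptions, read off the ranges in the question.

```python
import pandas as pd

# Data set from the question
df = pd.DataFrame({
    "Day": ["Friday", "Tuesday", "Friday", "Monday", "Tuesday", "Monday"],
    "Messages": ["spam", "eggs", "smapeggs", "eggseggs", "baz", "eggsspam"],
    "Time": ["8:05 AM", "8:45 AM", "9:03 AM", "1:05 PM", "8:33 AM", "2:25 PM"],
})

# Parse the clock strings and convert them to minutes since midnight
parsed = pd.to_datetime(df["Time"], format="%I:%M %p")
minutes = parsed.dt.hour * 60 + parsed.dt.minute

# Assumed bin edges (in minutes): 8:00, 9:00, 10:00, 13:00, 14:30
edges = [8 * 60, 9 * 60, 10 * 60, 13 * 60, 14 * 60 + 30]
labels = ["8:00-9:00 AM", "9:00-10:00 AM", "10:00 AM-1:00 PM", "1:00-2:30 PM"]

# right=False makes each bin include its start and exclude its end
df["Time range"] = pd.cut(minutes, bins=edges, labels=labels, right=False)

# observed=True drops ranges that contain no messages
counts = df.groupby("Time range", observed=True).size()
print(counts)
```

Times outside the listed edges become NaN and are left out of the count, so widen the edges if your data covers more of the day.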

For the second output, try something like the following:

# Count messages per 2-hour bucket separately for each day;
# the result is indexed by (Day, Time)
df_result = df_start.groupby("Day").resample("2h")["Message"].count()
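The grouped counts still need one more step to pick out the busiest slot per day. A minimal sketch on the question's data, assuming 1-hour slots and that tied hours should all be kept (which seems to be what "8:00-10:00 AM" for Friday means); the "Hour" column is a helper name I made up:

```python
import pandas as pd

# Data set from the question
df = pd.DataFrame({
    "Day": ["Friday", "Tuesday", "Friday", "Monday", "Tuesday", "Monday"],
    "Messages": ["spam", "eggs", "smapeggs", "eggseggs", "baz", "eggsspam"],
    "Time": ["8:05 AM", "8:45 AM", "9:03 AM", "1:05 PM", "8:33 AM", "2:25 PM"],
})

# Hour of day (0-23) in which each message was sent
df["Hour"] = pd.to_datetime(df["Time"], format="%I:%M %p").dt.hour

# Number of messages per (Day, Hour) pair
per_day = df.groupby(["Day", "Hour"]).size()

# Keep every hour whose count equals that day's maximum (ties included)
day_max = per_day.groupby(level="Day").transform("max")
busiest = per_day[per_day == day_max]
print(busiest)
```

With this sample, Friday and Monday each have two tied hours, while Tuesday has a single busiest hour (8) with two messages. If you only want one row per day, per_day.groupby(level="Day").idxmax() returns the first maximum instead.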