Pandas: count some values in a column

Time: 2016-09-27 10:22:48

Tags: python datetime pandas aggregate days

I have a dataframe; this is part of it:
    ID,"url","app_name","used_at","active_seconds","device_connection","device_os","device_type","device_usage"     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13,3g,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:33:07,1,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:34:30,5,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-06-01 09:36:22,133,3g,android,smartphone,home        
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5,3g,android,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",Yandex.Navigator,2015-05-01 11:04:48,70,3g,ios,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-6-01 12:02:27,248,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",Viber,2015-07-01 12:06:35,7,3g,ios,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-08-01 12:23:26,86,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-02 12:24:52,0,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",My Talking Angela,2015-08-03 12:24:52,167,3g,ios,smartphone,home        
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-04 12:27:39,34,3g,ios,smartphone,home        

I need to count the number of days in each month for each ID.

If I try df.groupby('ID')['used_at'].count() I get the number of visits. How can I count the days per month instead?

1 Answer:

Answer 0: (score: 2)

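I think you need groupby by ID, month and day, and then aggregate with size (used_at must already have a datetime dtype for the .dt accessor to work):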

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.month,df.used_at.dt.day ]).size()

print (df1)
ID                                used_at  used_at
574c4969b017ae6481db9a7c77328bc3  5        1          1
                                  6        1          1
                                  7        1          1
                                  8        1          1
                                           2          1
                                           3          1
                                           4          1
e990fae0f48b7daf52619b5ccbec61bc  5        1          2
                                           2          1
                                  6        1          3
dtype: int64
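
The result above has one row per (ID, month, day) holding the number of records on that day. If the goal is instead the number of distinct days per month for each ID, one possible sketch is to count unique days within each (ID, month) group. The small frame below is made-up sample data, and the IDs `user_a` and `user_b` are placeholders, not values from the question:

```python
import pandas as pd

# Hypothetical sample data shaped like the question's frame (placeholder IDs).
df = pd.DataFrame({
    "ID": ["user_a", "user_a", "user_a", "user_a", "user_b", "user_b", "user_b"],
    "used_at": [
        "2015-05-01 09:29:11",
        "2015-05-01 09:33:00",
        "2015-05-02 09:38:40",
        "2015-06-01 09:33:07",
        "2015-08-01 12:23:26",
        "2015-08-02 12:24:52",
        "2015-08-03 12:24:52",
    ],
})

# The .dt accessor needs a datetime dtype, so convert the strings first.
df["used_at"] = pd.to_datetime(df["used_at"])

# Count the distinct day-of-month values inside each (ID, month) group.
days_per_month = (
    df.assign(month=df["used_at"].dt.month, day=df["used_at"].dt.day)
      .groupby(["ID", "month"])["day"]
      .nunique()
)

print(days_per_month)
```

Here nunique is used instead of size, so two records on the same day count as a single day.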

Or group by date, which is the same as grouping by year, month and day:

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.date]).size()

print (df1)
ID                                used_at   
574c4969b017ae6481db9a7c77328bc3  2015-05-01    1
                                  2015-06-01    1
                                  2015-07-01    1
                                  2015-08-01    1
                                  2015-08-02    1
                                  2015-08-03    1
                                  2015-08-04    1
e990fae0f48b7daf52619b5ccbec61bc  2015-05-01    2
                                  2015-05-02    1
                                  2015-06-01    3
dtype: int64
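
Grouping by dt.month alone merges the same month from different years, which matters if the data spans more than one year. A minimal sketch of a year-aware variant, assuming a monthly period is an acceptable group key; the sample rows and the `user_a` ID are invented for illustration:

```python
import pandas as pd

# Hypothetical data covering May of two different years (placeholder ID).
df = pd.DataFrame({
    "ID": ["user_a"] * 4,
    "used_at": pd.to_datetime([
        "2015-05-01 09:29:11",
        "2015-05-02 09:38:40",
        "2016-05-01 10:00:00",
        "2016-05-01 11:00:00",
    ]),
})

# to_period("M") keeps the year, so 2015-05 and 2016-05 stay separate groups.
days_per_period = (
    df.assign(month=df["used_at"].dt.to_period("M"), date=df["used_at"].dt.date)
      .groupby(["ID", "month"])["date"]
      .nunique()
)

print(days_per_period)
```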

Difference between count and size:

size counts NaN values, count does not.
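
A small sketch of how size and count differ on a group that contains a missing timestamp. The frame is made-up data, and the IDs `a` and `b` are placeholders:

```python
import pandas as pd

# Hypothetical data: group "a" has one missing timestamp (NaT).
df = pd.DataFrame({
    "ID": ["a", "a", "b", "b"],
    "used_at": pd.to_datetime(["2015-05-01", None, "2015-05-01", "2015-05-02"]),
})

grouped = df.groupby("ID")["used_at"]

sizes = grouped.size()    # number of rows per group, missing values included
counts = grouped.count()  # number of non-missing values per group

print(sizes)
print(counts)
```

For group `a`, size reports 2 rows while count reports 1, because the NaT entry is skipped by count.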