How do I split a dataframe and group it?

Time: 2019-02-14 19:45:10

Tags: python pandas dataframe

There are 6121 rows of data for each IP address, and the datetime values repeat across the different IP addresses. I want to group the datetimes by month.

What I have tried is:

df.groupby(['ip_addr'],[pd.TimeGrouper('D')]).sum()

But the result is:

datetime and no_of_queriers aggregated over all ip_addr combined.

The columns I want to get are:

datetime (by month), no_of_queriers, ip_addr.

Please help me!

                  datetime  no_of_queriers       ip_addr
0     2014-02-16 00:00:00               0  1.204.33.193
1     2014-02-16 01:00:00               0  1.204.33.193
2     2014-02-16 02:00:00               0  1.204.33.193
3     2014-02-16 03:00:00               0  1.204.33.193
4     2014-02-16 04:00:00               0  1.204.33.193
5     2014-02-16 05:00:00               0  1.204.33.193
6     2014-02-16 06:00:00               0  1.204.33.193
7     2014-02-16 07:00:00               0  1.204.33.193
8     2014-02-16 08:00:00               0  1.204.33.193
9     2014-02-16 09:00:00               0  1.204.33.193
10    2014-02-16 10:00:00               0  1.204.33.193
11    2014-02-16 11:00:00               0  1.204.33.193
12    2014-02-16 12:00:00               0  1.204.33.193
13    2014-02-16 13:00:00               0  1.204.33.193
14    2014-02-16 14:00:00               0  1.204.33.193
15    2014-02-16 15:00:00               0  1.204.33.193
16    2014-02-16 16:00:00               0  1.204.33.193
17    2014-02-16 17:00:00               0  1.204.33.193
18    2014-02-16 18:00:00               0  1.204.33.193
19    2014-02-16 19:00:00               0  1.204.33.193
20    2014-02-16 20:00:00               0  1.204.33.193
21    2014-02-16 21:00:00               0  1.204.33.193
22    2014-02-16 22:00:00               0  1.204.33.193
23    2014-02-16 23:00:00               0  1.204.33.193
24    2014-02-17 00:00:00               0  1.204.33.193
25    2014-02-17 01:00:00               0  1.204.33.193
26    2014-02-17 02:00:00               0  1.204.33.193
27    2014-02-17 03:00:00               0  1.204.33.193
28    2014-02-17 04:00:00               0  1.204.33.193
29    2014-02-17 05:00:00               0  1.204.33.193
...                   ...             ...           ...
30575 2014-10-27 19:00:00               0   1.204.33.85
30576 2014-10-27 20:00:00               0   1.204.33.85
30577 2014-10-27 21:00:00               0   1.204.33.85
30578 2014-10-27 22:00:00               0   1.204.33.85
30579 2014-10-27 23:00:00               0   1.204.33.85
30580 2014-10-28 00:00:00               0   1.204.33.85
30581 2014-10-28 01:00:00               0   1.204.33.85
30582 2014-10-28 02:00:00               0   1.204.33.85
30583 2014-10-28 03:00:00               0   1.204.33.85
30584 2014-10-28 04:00:00               0   1.204.33.85
30585 2014-10-28 05:00:00               0   1.204.33.85
30586 2014-10-28 06:00:00               0   1.204.33.85
30587 2014-10-28 07:00:00               0   1.204.33.85
30588 2014-10-28 08:00:00               0   1.204.33.85
30589 2014-10-28 09:00:00               0   1.204.33.85
30590 2014-10-28 10:00:00               0   1.204.33.85
30591 2014-10-28 11:00:00               0   1.204.33.85
30592 2014-10-28 12:00:00               0   1.204.33.85
30593 2014-10-28 13:00:00               0   1.204.33.85
30594 2014-10-28 14:00:00               0   1.204.33.85
30595 2014-10-28 15:00:00               0   1.204.33.85
30596 2014-10-28 16:00:00               0   1.204.33.85
30597 2014-10-28 17:00:00               0   1.204.33.85
30598 2014-10-28 18:00:00               0   1.204.33.85
30599 2014-10-28 19:00:00               0   1.204.33.85
30600 2014-10-28 20:00:00               0   1.204.33.85
30601 2014-10-28 21:00:00               0   1.204.33.85
30602 2014-10-28 22:00:00               0   1.204.33.85
30603 2014-10-28 23:00:00               0   1.204.33.85
30604 2014-10-29 00:00:00               0   1.204.33.85
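
For a concrete reproduction, here is a minimal sketch with the same three columns (the values are invented). The attempt above most likely does not split per IP because the time grouper is passed as a separate argument rather than inside the same `by` list as 'ip_addr'; `pd.TimeGrouper` is also deprecated in recent pandas in favour of `pd.Grouper`:

    import pandas as pd

    # A small frame with the same three columns as the sample above
    # (hourly rows per IP address; the values here are invented).
    df = pd.DataFrame({
        'datetime': list(pd.date_range('2014-02-16', periods=6, freq='H')) * 2,
        'no_of_queriers': [0, 1, 0, 2, 0, 3, 1, 0, 4, 0, 2, 0],
        'ip_addr': ['1.204.33.193'] * 6 + ['1.204.33.85'] * 6,
    })

    # The time grouper has to be inside the same `by` list as 'ip_addr';
    # otherwise pandas does not use it as a grouping key and the
    # per-IP split is lost.
    daily = df.groupby(['ip_addr', pd.Grouper(key='datetime', freq='D')]).sum()
    print(daily)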

1 answer:

Answer 0 (score: 1)

This is what you are looking for:

df.groupby(['ip_addr', pd.Grouper(key='datetime', freq='M')]).count()
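
For completeness, a sketch of the same grouping, assuming `df` is the frame from the question and the `datetime` column may still need parsing; `.sum()` is shown in case the goal is the monthly total of `no_of_queriers` rather than the number of rows per month:

    import pandas as pd

    # Parse the column first if it is still plain strings; pd.Grouper
    # needs a real datetime dtype to group by month.
    df['datetime'] = pd.to_datetime(df['datetime'])

    monthly = (
        df.groupby(['ip_addr', pd.Grouper(key='datetime', freq='M')])['no_of_queriers']
          .sum()               # monthly total per IP; use .count() for rows per month
          .reset_index()       # back to flat columns: ip_addr, datetime, no_of_queriers
    )
    print(monthly.head())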