Pandas cumulative sum over two fields

Time: 2017-06-18 15:58:50

Tags: python-3.x pandas

Cumulative sums seem to be both a common question and one that is hard to grasp even after reading other posts.

Here is my situation. I have this data:

User  | Timestamp   | Period  | Count
User1 | 2006-08-13  | Morning | 1
User1 | 2006-08-14  | Evening | 1
User1 | 2006-08-17  | Morning | 1
User1 | 2006-09-15  | Evening | 1
User2 | 2006-09-16  | Morning | 1
User2 | 2006-09-17  | Morning | 1

I want the same table, but with a cumulative count for each User + Period combination. Like this:

User  | Timestamp   | Period  | Count | CCount
User1 | 2006-08-13  | Morning | 1     | 1
User1 | 2006-08-14  | Evening | 1     | 1
User1 | 2006-08-17  | Morning | 1     | 2
User1 | 2006-09-15  | Evening | 1     | 2
User2 | 2006-09-16  | Morning | 1     | 1
User2 | 2006-09-17  | Morning | 1     | 2

1 answer:

Answer 0 (score: 1)

You can use cumcount on the groupby object. It numbers the rows within each group starting from 0, hence the + 1:

df['CCount'] = df.groupby(['User', 'Period']).cumcount() + 1

df
Out: 
    User   Timestamp   Period  Count  CCount
0  User1  2006-08-13  Morning      1       1
1  User1  2006-08-14  Evening      1       1
2  User1  2006-08-17  Morning      1       2
3  User1  2006-09-15  Evening      1       2
4  User2  2006-09-16  Morning      1       1
5  User2  2006-09-17  Morning      1       2
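
Note that cumcount counts rows and ignores the values in Count. It matches a cumulative sum here only because every Count is 1. If Count could be larger than 1, a grouped cumsum on that column would be closer to what the title asks for. Below is a minimal sketch that rebuilds the sample data and compares the two; the names by_position, by_sum, varied and sum_varied, and the altered Count value, are made up for illustration and do not come from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "User": ["User1", "User1", "User1", "User1", "User2", "User2"],
    "Timestamp": pd.to_datetime([
        "2006-08-13", "2006-08-14", "2006-08-17",
        "2006-09-15", "2006-09-16", "2006-09-17",
    ]),
    "Period": ["Morning", "Evening", "Morning", "Evening", "Morning", "Morning"],
    "Count": [1, 1, 1, 1, 1, 1],
})

keys = ["User", "Period"]

# Row position within each (User, Period) group, shifted to start at 1.
by_position = df.groupby(keys).cumcount() + 1

# Running total of the Count column within each (User, Period) group.
by_sum = df.groupby(keys)["Count"].cumsum()

# With every Count equal to 1, both approaches give the same column.
print(by_position.tolist())  # [1, 1, 2, 2, 1, 2]
print(by_sum.tolist())       # [1, 1, 2, 2, 1, 2]

# Hypothetical variation (not in the question): one row with Count = 3.
varied = df.copy()
varied.loc[2, "Count"] = 3
sum_varied = varied.groupby(keys)["Count"].cumsum()
print(sum_varied.tolist())   # [1, 1, 4, 2, 1, 2]
```

Both versions follow the current row order, so if the frame is not already sorted by Timestamp, sorting it first (for example with df.sort_values('Timestamp')) is probably what you want.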