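I am trying to count the consecutive occurrences of each value in the Products column. The result should look like the "Total counts" column below. I tried using groupby together with cumsum, but my logic does not work.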
+----------+--------------+
| Products | Total counts |
+----------+--------------+
| 1 | 3 |
+----------+--------------+
| 1 | 3 |
+----------+--------------+
| 1 | 3 |
+----------+--------------+
| 2 | 1 |
+----------+--------------+
| 3 | 3 |
+----------+--------------+
| 3 | 3 |
+----------+--------------+
| 3 | 3 |
+----------+--------------+
| 4 | 2 |
+----------+--------------+
| 4 | 2 |
+----------+--------------+
Answer 0 (score: 1)
Use groupby with transform and count:
df['Total counts'] = df.groupby('Products')['Products'].transform('count')
Output:
Products Total counts
0 1 3
1 1 3
2 1 3
3 2 1
4 3 3
5 3 3
6 3 3
7 4 2
8 4 2
If a product can reappear later in the DataFrame and each consecutive run should be counted separately:
grp = (df['Products'] != df['Products'].shift()).cumsum()
df['Total counts'] = df.groupby(grp)['Products'].transform('count')
Output:
Products Total counts
0 1 3
1 1 3
2 1 3
3 2 1
4 3 3
5 3 3
6 3 3
7 4 2
8 4 2
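Both approaches give the same output on the sample data because no product reappears after its run ends. The following is a minimal sketch of where they diverge. The data and the column names 'Global counts' and 'Run counts' are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data: product 1 appears in two separate runs.
df = pd.DataFrame({'Products': [1, 1, 2, 1, 3, 3]})

# Global count: all rows with the same value are counted together,
# no matter where they sit in the frame.
df['Global counts'] = df.groupby('Products')['Products'].transform('count')

# Run-length count: compare each value with the previous one; a change
# starts a new run, and cumsum turns those change flags into run IDs.
run_id = (df['Products'] != df['Products'].shift()).cumsum()
df['Run counts'] = df.groupby(run_id)['Products'].transform('count')

print(df)
```

Here product 1 gets a global count of 3 in every row, while the run-based count gives 2 for the first run and 1 for the isolated occurrence. If the requirement really is consecutive occurrences, the run-based grouping is probably the one you want.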