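I want to compute a running total for promos and reset the running total whenever the promo changes. How can I do this with Python and Pandas? Many thanks!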
Id Date Promo Running_Total
0 19 2015-07-09 0 0
1 18 2015-07-10 0 0
2 17 2015-07-11 0 0
3 16 2015-07-13 1 1
4 15 2015-07-14 1 2
5 14 2015-07-15 1 3
6 13 2015-07-16 1 4
7 12 2015-07-17 1 5
8 11 2015-07-18 0 0
9 10 2015-07-20 0 0
10 9 2015-07-21 0 0
11 8 2015-07-22 0 0
12 7 2015-07-23 0 0
13 6 2015-07-24 0 0
14 5 2015-07-25 0 0
15 4 2015-07-27 1 1
16 3 2015-07-28 1 2
17 2 2015-07-29 1 3
18 1 2015-07-30 1 4
19 0 2015-07-31 1 5
Answer 0 (score: 1)
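Completely revised solution:
Some values in column Promo were changed to 2 and 3.
To count consecutive values, compare every value with the shifted column using ne (!=), then apply cumsum to the result to get a group id for each run.
Then groupby these group ids, count within each group with cumcount, and add 1 so that counting starts at 1: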
df['Running_Total'] = (df.groupby(df['Promo'].ne(df['Promo'].shift()).cumsum())
.cumcount()
.add(1))
print (df)
Id Date Promo Running_Total
0 19 2015-07-09 0 1
1 18 2015-07-10 0 2
2 17 2015-07-11 0 3
3 16 2015-07-13 2 1
4 15 2015-07-14 2 2
5 14 2015-07-15 2 3
6 13 2015-07-16 1 1
7 12 2015-07-17 1 2
8 11 2015-07-18 0 1
9 10 2015-07-20 0 2
10 9 2015-07-21 0 3
11 8 2015-07-22 0 4
12 7 2015-07-23 3 1
13 6 2015-07-24 3 2
14 5 2015-07-25 3 3
15 4 2015-07-27 1 1
16 3 2015-07-28 1 2
17 2 2015-07-29 1 3
18 1 2015-07-30 1 4
19 0 2015-07-31 1 5
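The snippet above assumes df already exists. A minimal self-contained sketch follows; it rebuilds only the Promo column from the printed output (Id and Date are left out for brevity), and the variable name run_id is an illustrative choice, not part of the original answer:

```python
import pandas as pd

# Promo values copied from the output above (with the 2 and 3 values included).
promo = [0, 0, 0, 2, 2, 2, 1, 1, 0, 0, 0, 0, 3, 3, 3, 1, 1, 1, 1, 1]
df = pd.DataFrame({"Promo": promo})

# A new run starts wherever the value differs from the previous row;
# the cumulative sum of those change flags gives one id per run.
run_id = df["Promo"].ne(df["Promo"].shift()).cumsum()

# cumcount numbers rows within each run from 0, so add 1 to start at 1.
df["Running_Total"] = df.groupby(run_id).cumcount().add(1)

print(df)
```

The counter restarts on every change of Promo, including runs of 0, which is why the rows with Promo 0 are also numbered here.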
But if rows where Promo is 0 should get 0 instead, multiply by a boolean mask, df['Promo'].ne(0): rows where Promo is 0 are multiplied by 0 and all other rows by 1:
df['Running_Total'] = (df.groupby(df['Promo'].ne(df['Promo'].shift()).cumsum())
                         .cumcount()
                         .add(1)
                         .mul(df['Promo'].ne(0)))
print (df)
Id Date Promo Running_Total
0 19 2015-07-09 0 0
1 18 2015-07-10 0 0
2 17 2015-07-11 0 0
3 16 2015-07-13 2 1
4 15 2015-07-14 2 2
5 14 2015-07-15 2 3
6 13 2015-07-16 1 1
7 12 2015-07-17 1 2
8 11 2015-07-18 0 0
9 10 2015-07-20 0 0
10 9 2015-07-21 0 0
11 8 2015-07-22 0 0
12 7 2015-07-23 3 1
13 6 2015-07-24 3 2
14 5 2015-07-25 3 3
15 4 2015-07-27 1 1
16 3 2015-07-28 1 2
17 2 2015-07-29 1 3
18 1 2015-07-30 1 4
19 0 2015-07-31 1 5
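As a check against the original question, here is a hedged sketch that applies the masked version to the question's own Promo column (only 0 and 1). The helper name running_total_nonzero is hypothetical, and the astype(int) on the mask is an added precaution to keep the result an integer column, not something the answer requires:

```python
import pandas as pd

def running_total_nonzero(promo: pd.Series) -> pd.Series:
    """Count consecutive equal values, starting at 1, but give 0 where promo is 0."""
    # One id per run of identical consecutive values.
    run_id = promo.ne(promo.shift()).cumsum()
    # Position inside each run, starting at 1.
    counter = promo.groupby(run_id).cumcount().add(1)
    # Mask is 0 where promo == 0 and 1 elsewhere, so zero-promo rows become 0.
    return counter.mul(promo.ne(0).astype(int))

# Promo column from the question.
promo = pd.Series([0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
result = running_total_nonzero(promo)
print(result.tolist())
```

With this input the result matches the Running_Total column requested in the question.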
Details:
print (df['Promo'].ne(df['Promo'].shift()).cumsum())
0 1
1 1
2 1
3 2
4 2
5 2
6 3
7 3
8 4
9 4
10 4
11 4
12 5
13 5
14 5
15 6
16 6
17 6
18 6
19 6
Name: Promo, dtype: int32
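To see how those group ids arise, the intermediate steps can be printed one at a time. This is a small sketch on a made-up five-element Series (the values are illustrative, not taken from the question):

```python
import pandas as pd

s = pd.Series([0, 0, 2, 2, 1])

# shift moves every value down one row; the first row becomes NaN.
shifted = s.shift()

# ne is element-wise !=; the first row compares against NaN, so it is True.
changed = s.ne(shifted)

# Summing the True flags cumulatively yields a new id at every change.
run_id = changed.cumsum()

print(pd.DataFrame({"s": s, "shifted": shifted, "changed": changed, "run_id": run_id}))
```

The integer dtype of the cumsum result can differ by platform, which may be why the output above shows int32.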