Question

I want to compute a running total of promotions and reset it whenever the Promo value changes. How can I do this with Python and Pandas? Thanks a lot!

    Id        Date  Promo  Running_Total
0   19  2015-07-09      0              0
1   18  2015-07-10      0              0
2   17  2015-07-11      0              0
3   16  2015-07-13      1              1
4   15  2015-07-14      1              2
5   14  2015-07-15      1              3
6   13  2015-07-16      1              4
7   12  2015-07-17      1              5
8   11  2015-07-18      0              0
9   10  2015-07-20      0              0
10   9  2015-07-21      0              0
11   8  2015-07-22      0              0
12   7  2015-07-23      0              0
13   6  2015-07-24      0              0
14   5  2015-07-25      0              0
15   4  2015-07-27      1              1
16   3  2015-07-28      1              2
17   2  2015-07-29      1              3
18   1  2015-07-30      1              4
19   0  2015-07-31      1              5

Answer 1

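Completely changed solution:

Some values in the Promo column were changed to 2 and 3 for this example.

To count consecutive values, each value is compared with the shifted column using ne (!=), and cumsum of that result labels each run of equal values as a group.

Then group by these labels with groupby, number the rows with cumcount, and add 1 so the count starts at 1: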

df['Running_Total'] = (df.groupby(df['Promo'].ne(df['Promo'].shift()).cumsum())
                          .cumcount()
                          .add(1))
print (df)
    Id        Date  Promo  Running_Total
0   19  2015-07-09      0              1
1   18  2015-07-10      0              2
2   17  2015-07-11      0              3
3   16  2015-07-13      2              1
4   15  2015-07-14      2              2
5   14  2015-07-15      2              3
6   13  2015-07-16      1              1
7   12  2015-07-17      1              2
8   11  2015-07-18      0              1
9   10  2015-07-20      0              2
10   9  2015-07-21      0              3
11   8  2015-07-22      0              4
12   7  2015-07-23      3              1
13   6  2015-07-24      3              2
14   5  2015-07-25      3              3
15   4  2015-07-27      1              1
16   3  2015-07-28      1              2
17   2  2015-07-29      1              3
18   1  2015-07-30      1              4
19   0  2015-07-31      1              5

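If the rows where Promo is 0 should show 0 instead, multiply the result by the boolean mask df['Promo'].ne(0). This multiplies the rows where Promo is 0 by 0 and all other rows by 1: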

df['Running_Total'] = (df.groupby(df['Promo'].ne(df['Promo'].shift()).cumsum())
                         .cumcount()
                         .add(1)
                         .mul(df['Promo'].ne(0)))
print (df)
    Id        Date  Promo  Running_Total
0   19  2015-07-09      0              0
1   18  2015-07-10      0              0
2   17  2015-07-11      0              0
3   16  2015-07-13      2              1
4   15  2015-07-14      2              2
5   14  2015-07-15      2              3
6   13  2015-07-16      1              1
7   12  2015-07-17      1              2
8   11  2015-07-18      0              0
9   10  2015-07-20      0              0
10   9  2015-07-21      0              0
11   8  2015-07-22      0              0
12   7  2015-07-23      3              1
13   6  2015-07-24      3              2
14   5  2015-07-25      3              3
15   4  2015-07-27      1              1
16   3  2015-07-28      1              2
17   2  2015-07-29      1              3
18   1  2015-07-30      1              4
19   0  2015-07-31      1              5
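
As a check against the original question, here is a minimal self-contained sketch that applies the same masked formula to the 0/1 Promo values from the question. The names `promo`, `run_id`, and `expected` are introduced only for this illustration, and the Id and Date columns are left out for brevity.

```python
import pandas as pd

# Promo values copied from the question's table (0/1 only).
promo = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
df = pd.DataFrame({"Promo": promo})

# A new run starts wherever Promo differs from the previous row.
run_id = df["Promo"].ne(df["Promo"].shift()).cumsum()

# Number the rows inside each run starting at 1,
# then multiply by the mask so rows with Promo == 0 become 0.
df["Running_Total"] = (
    df.groupby(run_id)
      .cumcount()
      .add(1)
      .mul(df["Promo"].ne(0))
)

# Running_Total column as given in the question.
expected = [0, 0, 0, 1, 2, 3, 4, 5, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
print(df["Running_Total"].tolist() == expected)
```

Because the count restarts at every change of Promo, this also works when the column contains more than two distinct values, as in the modified data above.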

Details:

print (df['Promo'].ne(df['Promo'].shift()).cumsum())
0     1
1     1
2     1
3     2
4     2
5     2
6     3
7     3
8     4
9     4
10    4
11    4
12    5
13    5
14    5
15    6
16    6
17    6
18    6
19    6
Name: Promo, dtype: int32
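
The group labels above can be broken into their intermediate steps. The following is a sketch that rebuilds the modified Promo column by hand; the variable names `shifted`, `changed`, `group_id`, and `steps` are assumptions made for this example.

```python
import pandas as pd

# Modified Promo values used in the answer (with 2 and 3 added).
promo = pd.Series(
    [0, 0, 0, 2, 2, 2, 1, 1, 0, 0, 0, 0, 3, 3, 3, 1, 1, 1, 1, 1],
    name="Promo",
)

shifted = promo.shift()      # previous row's value; NaN in the first row
changed = promo.ne(shifted)  # True on the first row of every run
group_id = changed.cumsum()  # True counts as 1, so the label increases at each change

# Put the steps side by side for inspection.
steps = pd.DataFrame(
    {"Promo": promo, "shifted": shifted, "changed": changed, "group_id": group_id}
)
print(steps)
```

The first row is always marked as a change because comparing any value with NaN via ne returns True, which is why the labels start at 1.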
