Suppose I have the following code:
import pandas as pd

frame = pd.DataFrame(
    {
        "Date": ["11/1/2017", "11/2/2017", "11/3/2017", "11/4/2017",
                 "11/5/2017", "11/6/2017", "11/7/2017", "11/8/2017",
                 "11/9/2017", "11/10/2017", "11/11/2017", "11/12/2017"],
        "Day": ["Wed", "Thr", "Fri", "Sat",
                "Sun", "Mon", "Tue", "Wed", "Thr", "Fri", "Sat",
                "Sun"],
        "Sales": [5, 5, 10, 5,
                  5, 10, 5, 5, 15, 5, 0,
                  5],
    }
)
# reset_index returns a new frame, so the result has to be assigned back
frame = frame.reset_index(drop=True)
print(frame)
print()
# 7-row rolling sum; 'boxcar' weights every row in the window equally
frame['RollingSum'] = frame["Sales"].rolling(window=7, min_periods=1, win_type='boxcar').sum()
print(frame)
It prints the following output:
          Date  Day  Sales
0    11/1/2017  Wed      5
1    11/2/2017  Thr      5
2    11/3/2017  Fri     10
3    11/4/2017  Sat      5
4    11/5/2017  Sun      5
5    11/6/2017  Mon     10
6    11/7/2017  Tue      5
7    11/8/2017  Wed      5
8    11/9/2017  Thr     15
9   11/10/2017  Fri      5
10  11/11/2017  Sat      0
11  11/12/2017  Sun      5

          Date  Day  Sales  RollingSum
0    11/1/2017  Wed      5         5.0
1    11/2/2017  Thr      5        10.0
2    11/3/2017  Fri     10        20.0
3    11/4/2017  Sat      5        25.0
4    11/5/2017  Sun      5        30.0
5    11/6/2017  Mon     10        40.0
6    11/7/2017  Tue      5        45.0
7    11/8/2017  Wed      5        45.0
8    11/9/2017  Thr     15        55.0
9   11/10/2017  Fri      5        50.0
10  11/11/2017  Sat      0        45.0
11  11/12/2017  Sun      5        45.0
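Now suppose that, instead of a flat 7-day window, I want the left endpoint of the window to be the most recent Sunday (or the first day in the dataset if no Sunday has occurred yet). How can I achieve that? Desired output: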
          Date  Day  Sales  WTDSum
0    11/1/2017  Wed      5     5.0
1    11/2/2017  Thr      5    10.0
2    11/3/2017  Fri     10    20.0
3    11/4/2017  Sat      5    25.0
4    11/5/2017  Sun      5     5.0
5    11/6/2017  Mon     10    15.0
6    11/7/2017  Tue      5    20.0
7    11/8/2017  Wed      5    25.0
8    11/9/2017  Thr     15    40.0
9   11/10/2017  Fri      5    45.0
10  11/11/2017  Sat      0    45.0
11  11/12/2017  Sun      5     5.0
Answer 0 (score: 3)
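Use cumsum twice: once to build the group key (a running count of the Sundays seen so far) and once for the calculation within each group. Here df is the question's frame: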
df['WTDSum']=df.groupby(df.Day.eq('Sun').cumsum()).Sales.cumsum()
df
Out[520]:
          Date  Day  Sales  WTDSum
0    11/1/2017  Wed      5       5
1    11/2/2017  Thr      5      10
2    11/3/2017  Fri     10      20
3    11/4/2017  Sat      5      25
4    11/5/2017  Sun      5       5
5    11/6/2017  Mon     10      15
6    11/7/2017  Tue      5      20
7    11/8/2017  Wed      5      25
8    11/9/2017  Thr     15      40
9   11/10/2017  Fri      5      45
10  11/11/2017  Sat      0      45
11  11/12/2017  Sun      5       5
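
To see why the one-liner works, it may help to split it into steps. The sketch below rebuilds the question's data and names the intermediate results; the names is_sunday, week_id and week_id_from_dates are mine, not from the answer. The last part is an assumption-based variant for data without a Day column: it derives the same key from the parsed dates, assuming they are in month/day/year order.

```python
import pandas as pd

# Same data as in the question.
frame = pd.DataFrame({
    "Date": [f"11/{d}/2017" for d in range(1, 13)],
    "Day": ["Wed", "Thr", "Fri", "Sat", "Sun", "Mon",
            "Tue", "Wed", "Thr", "Fri", "Sat", "Sun"],
    "Sales": [5, 5, 10, 5, 5, 10, 5, 5, 15, 5, 0, 5],
})

# Step 1: flag every Sunday row with True.
is_sunday = frame["Day"].eq("Sun")

# Step 2: the running count of Sundays labels each week.
# Rows before the first Sunday get 0, so the leading partial week
# forms its own group.
week_id = is_sunday.cumsum()

# Step 3: cumulative sum of Sales, restarted at the start of each group.
frame["WTDSum"] = frame.groupby(week_id)["Sales"].cumsum()

# Variant (assumption: only the Date strings are available, in m/d/Y order).
# dayofweek counts Monday as 0, so Sunday is 6.
dates = pd.to_datetime(frame["Date"], format="%m/%d/%Y")
week_id_from_dates = dates.dt.dayofweek.eq(6).cumsum()

print(frame.assign(week_id=week_id))
```

Both keys rely on the rows already being in chronological order, since cumsum simply walks down the frame; if the data might be unsorted, sorting by the parsed date first would be needed.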