目前,我有一个带有location_key和year_month multiIndex的df。我想使用滚动窗口创建一个3个月的款项。
(pd.DataFrame(df.groupby(['LOCATION_KEY','YEAR_MONTH'])['SALES'].count()).sort_index()).groupby(level=(0)).apply(lambda x: x.rolling(window=3).sum())
窗口运行正常,问题是在没有销售的月份中,而不是计算一个空的月份,而是计算另一个月份。
例如在下面的数据中,2016-03销售额是2016-03、2016-01、2015-12的总和,而不是我想要的:2016-03、2016-02、2016-01。
LOCATION_KE YEAR_MONTH SALES
A 2015-10 NaN
2015-11 NaN
2015-12 200
2016-01 220
2016-03 180
B 2015-04 NaN
2015-05 NaN
2015-06 119
2015-07 120
答案 0 :(得分:1)
基本上,您必须按照需要设置索引,以便滚动窗口中要处理的零。
df
LOCATION_KE YEAR_MONTH SALES
0 A 2015-10-01 NaN
1 A 2015-11-01 NaN
2 A 2015-12-01 200.0
3 A 2016-01-01 220.0
4 A 2016-03-01 180.0
5 B 2015-04-01 NaN
6 B 2015-05-01 NaN
7 B 2015-06-01 119.0
8 B 2015-07-01 120.0
df['SALES'] = df['SALES'].fillna(0)
df.index = [df["LOCATION_KE"], df["YEAR_MONTH"]]
df
LOCATION_KE YEAR_MONTH SALES
LOCATION_KE YEAR_MONTH
A 2015-10-01 A 2015-10-01 0.0
2015-11-01 A 2015-11-01 0.0
2015-12-01 A 2015-12-01 200.0
2016-01-01 A 2016-01-01 220.0
2016-03-01 A 2016-03-01 180.0
B 2015-04-01 B 2015-04-01 0.0
2015-05-01 B 2015-05-01 0.0
2015-06-01 B 2015-06-01 119.0
2015-07-01 B 2015-07-01 120.0
df = df.reindex(pd.MultiIndex.from_product([df['LOCATION_KE'],
pd.date_range("20150101", periods=24, freq='MS')],
names=['location', 'month']))
df['SALES'].fillna(0).reset_index(level=0).groupby('location').rolling(3).sum().fillna(0)
location SALES
location month
A 2015-01-01 A 0.0
2015-02-01 A 0.0
2015-03-01 A 0.0
2015-04-01 A 0.0
2015-05-01 A 0.0
2015-06-01 A 0.0
2015-07-01 A 0.0
2015-08-01 A 0.0
2015-09-01 A 0.0
2015-10-01 A 0.0
2015-11-01 A 0.0
2015-12-01 A 200.0
2016-01-01 A 420.0
2016-02-01 A 420.0
2016-03-01 A 400.0
2016-04-01 A 180.0
2016-05-01 A 180.0
2016-06-01 A 0.0
2016-07-01 A 0.0
2016-08-01 A 0.0
2016-09-01 A 0.0
2016-10-01 A 0.0
2016-11-01 A 0.0
2016-12-01 A 0.0
2015-01-01 A 0.0
2015-02-01 A 0.0
2015-03-01 A 0.0
2015-04-01 A 0.0
2015-05-01 A 0.0
2015-06-01 A 0.0
... ... ...
B 2016-07-01 B 0.0
2016-08-01 B 0.0
2016-09-01 B 0.0
2016-10-01 B 0.0
2016-11-01 B 0.0
2016-12-01 B 0.0
2015-01-01 B 0.0
2015-02-01 B 0.0
2015-03-01 B 0.0
2015-04-01 B 0.0
2015-05-01 B 0.0
2015-06-01 B 119.0
2015-07-01 B 239.0
2015-08-01 B 239.0
2015-09-01 B 120.0
2015-10-01 B 0.0
2015-11-01 B 0.0
2015-12-01 B 0.0
2016-01-01 B 0.0
2016-02-01 B 0.0
2016-03-01 B 0.0
2016-04-01 B 0.0
2016-05-01 B 0.0
2016-06-01 B 0.0
2016-07-01 B 0.0
2016-08-01 B 0.0
2016-09-01 B 0.0
2016-10-01 B 0.0
2016-11-01 B 0.0
2016-12-01 B 0.0
我认为,如果您拥有最新的熊猫,则可以省去reset_index。