I have a dataset as follows:
ts
Out[227]:
Sales
Month
Jan 1808
Feb 1251
Mar 3023
Apr 4857
May 2506
Jun 2453
Jul 1180
Aug 4239
Sep 1759
Oct 2539
Nov 3923
Dec 2999
After taking a moving average with window = 2, the output is:
shifted = ts.shift(0)
window = shifted.rolling(window=2)
means = window.mean()
print(means)
Sales
Month
Jan NaN
Feb 1529.5
Mar 2137.0
Apr 3940.0
May 3681.5
Jun 2479.5
Jul 1816.5
Aug 2709.5
Sep 2999.0
Oct 2149.0
Nov 3231.0
Dec 3461.0
I would like the NaN to be replaced by its original value. Is that possible?
Answer 0 (score: 5)
Try this:
In [92]: ts.rolling(window=2, min_periods=1).mean()
Out[92]:
Sales
Jan 1808.0
Feb 1529.5
Mar 2137.0
Apr 3940.0
May 3681.5
Jun 2479.5
Jul 1816.5
Aug 2709.5
Sep 2999.0
Oct 2149.0
Nov 3231.0
Dec 3461.0
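A minimal, self-contained sketch of the same idea. The way `ts` is built here is an assumption, since the question only shows its printed form. `min_periods=1` lets each window return a mean as soon as it holds at least one observation, so the first row becomes the mean of a single value, which is the value itself.

```python
import pandas as pd

# Rebuild the question's data by hand (assumed construction of `ts`).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
sales = [1808, 1251, 3023, 4857, 2506, 2453,
         1180, 4239, 1759, 2539, 3923, 2999]
ts = pd.DataFrame({"Sales": sales}, index=pd.Index(months, name="Month"))

# Without min_periods the first row is NaN, because a window of 2 is incomplete there.
strict = ts.rolling(window=2).mean()

# With min_periods=1 an incomplete window is averaged over whatever is available.
means = ts.rolling(window=2, min_periods=1).mean()

print(strict.head(3))
print(means.head(3))
```

Only the first row changes between `strict` and `means`; every later row has a full window in both cases.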
Answer 1 (score: 4)
Use:
# store the result under a new name so that df stays a DataFrame
s = df['Sales'].rolling(window=2).mean().fillna(df['Sales'])
print (s)
Jan 1808.0
Feb 1529.5
Mar 2137.0
Apr 3940.0
May 3681.5
Jun 2479.5
Jul 1816.5
Aug 2709.5
Sep 2999.0
Oct 2149.0
Nov 3231.0
Dec 3461.0
Name: Sales, dtype: float64
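For a rolling window of n > 2 the two solutions give different results: filling the NaNs (here with combine_first) restores the original values in the incomplete leading rows, whereas min_periods=1 averages whatever values are available in those rows: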
df['Sales1'] = df['Sales'] * 2
df1 = df.rolling(window=3).mean().combine_first(df)
print (df1)
Sales Sales1
Jan 1808.000000 3616.000000
Feb 1251.000000 2502.000000 <-diff
Mar 2027.333333 4054.666667
Apr 3043.666667 6087.333333
May 3462.000000 6924.000000
Jun 3272.000000 6544.000000
Jul 2046.333333 4092.666667
Aug 2624.000000 5248.000000
Sep 2392.666667 4785.333333
Oct 2845.666667 5691.333333
Nov 2740.333333 5480.666667
Dec 3153.666667 6307.333333
df2 = df.rolling(window=3, min_periods=1).mean()
print (df2)
Sales Sales1
Jan 1808.000000 3616.000000
Feb 1529.500000 3059.000000 <-diff
Mar 2027.333333 4054.666667
Apr 3043.666667 6087.333333
May 3462.000000 6924.000000
Jun 3272.000000 6544.000000
Jul 2046.333333 4092.666667
Aug 2624.000000 5248.000000
Sep 2392.666667 4785.333333
Oct 2845.666667 5691.333333
Nov 2740.333333 5480.666667
Dec 3153.666667 6307.333333
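To locate exactly where the two approaches diverge, the results can be compared row by row. This is a sketch under the assumption that the data is rebuilt by hand as below; the names `filled`, `partial` and `differs` are illustrative and do not come from the answers.

```python
import pandas as pd

# Rebuild the question's data by hand (assumed construction of `df`).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
sales = [1808, 1251, 3023, 4857, 2506, 2453,
         1180, 4239, 1759, 2539, 3923, 2999]
df = pd.DataFrame({"Sales": sales}, index=pd.Index(months, name="Month"))

n = 3

# Approach A: full windows only, then fall back to the original values.
filled = df["Sales"].rolling(window=n).mean().fillna(df["Sales"])

# Approach B: average over whatever part of the window is available.
partial = df["Sales"].rolling(window=n, min_periods=1).mean()

# Keep only the rows where the two results disagree (with a small float tolerance).
differs = filled[(filled - partial).abs() > 1e-9]
print(differs)
```

For this data and n = 3 only the Feb row differs. The first row always agrees, because the mean of a single value is that value, and every row from the n-th onward has a full window in both approaches.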