xarray: a simple weighted rolling mean example using .construct()

Date: 2018-05-25 02:53:43

Tags: pandas python-xarray

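xarray can compute a weighted rolling mean through the object returned by .construct(), as described in an SO answer and in the docs.

The weighted rolling mean example in the docs does not look quite right, because it seems to give the same answer as the ordinary rolling mean.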

import xarray as xr
import numpy as np

arr = xr.DataArray(np.arange(0, 7.5, 0.5).reshape(3, 5),
                   dims=('x', 'y'))
arr.rolling(y=3, center=True).mean()
#<xarray.DataArray (x: 3, y: 5)>
#array([[nan, 0.5, 1. , 1.5, nan],
#       [nan, 3. , 3.5, 4. , nan],
#       [nan, 5.5, 6. , 6.5, nan]])
#Dimensions without coordinates: x, y

weight = xr.DataArray([0.25, 0.5, 0.25], dims=['window'])
arr.rolling(y=3, center=True).construct('window').dot(weight)
#<xarray.DataArray (x: 3, y: 5)>
#array([[nan, 0.5, 1. , 1.5, nan],
#       [nan, 3. , 3.5, 4. , nan],
#       [nan, 5.5, 6. , 6.5, nan]])
#Dimensions without coordinates: x, y

Here is a simpler example, where I just want to get the syntax right:

da = xr.DataArray(np.arange(1,6), dims='x')
da.rolling(x=3, center=True).mean()
#<xarray.DataArray (x: 5)>
#array([nan,  2.,  3.,  4., nan])
#Dimensions without coordinates: x

weight = xr.DataArray([0.5, 1, 0.5], dims=['window'])
da.rolling(x=3, center=True).construct('window').dot(weight)
#<xarray.DataArray (x: 5)>
#array([nan,  4.,  6.,  8., nan])
#Dimensions without coordinates: x

It returns 4, 6, 8. I thought it would do this:

((1 x 0.5) + (2 x 1) + (3 x 0.5)) / 3 = 4/3
((2 x 0.5) + (3 x 1) + (4 x 0.5)) / 3 = 2
((3 x 0.5) + (4 x 1) + (5 x 0.5)) / 3 = 8/3
1.33, 2., 2.67

1 Answer:

Answer 0: (score: 1)

In the first example, arr holds evenly spaced data. The weighted mean (with weights [0.25, 0.5, 0.25]) is therefore the same as the simple mean.
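
A minimal sketch of why this happens, using plain NumPy on a single window rather than xarray. The variable names are made up for illustration. For evenly spaced values a-d, a, a+d, any symmetric weights that sum to 1 return the middle value a, which is also the plain mean. Once the spacing is uneven, the two diverge.

```python
import numpy as np

# First window of row 0 in the question's array: evenly spaced values.
window = np.array([0.0, 0.5, 1.0])
weights = np.array([0.25, 0.5, 0.25])  # symmetric, sums to 1

plain_mean = window.mean()
weighted_mean = np.dot(window, weights)
print(plain_mean, weighted_mean)  # both equal the middle value

# Squaring the values makes the spacing uneven, so the two means differ.
curved = window ** 2
curved_plain = curved.mean()
curved_weighted = np.dot(curved, weights)
print(curved_plain, curved_weighted)
```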

If you use nonlinear data instead, the results differ:

In [50]: arr = xr.DataArray((np.arange(0, 7.5, 0.5)**2).reshape(3, 5),
    ...:                    dims=('x', 'y'))
    ...:                    

In [51]: arr.rolling(y=3, center=True).mean()
Out[51]: 
<xarray.DataArray (x: 3, y: 5)>
array([[      nan,  0.416667,  1.166667,  2.416667,       nan],
       [      nan,  9.166667, 12.416667, 16.166667,       nan],
       [      nan, 30.416667, 36.166667, 42.416667,       nan]])
Dimensions without coordinates: x, y

In [52]: weight = xr.DataArray([0.25, 0.5, 0.25], dims=['window'])
    ...: arr.rolling(y=3, center=True).construct('window').dot(weight)
    ...: 
Out[52]: 
<xarray.DataArray (x: 3, y: 5)>
array([[   nan,  0.375,  1.125,  2.375,    nan],
       [   nan,  9.125, 12.375, 16.125,    nan],
       [   nan, 30.375, 36.125, 42.375,    nan]])
Dimensions without coordinates: x, y

For the second example, the weights [0.5, 1, 0.5] sum to 2. The first non-NaN item is therefore (1 x 0.5) + (2 x 1) + (3 x 0.5) = 4

If you want a weighted mean rather than a weighted sum, use [0.25, 0.5, 0.25] instead.
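
One way to avoid normalizing by hand is to divide the weights by their own sum before the dot product. The sketch below assumes a reasonably recent xarray. The input values are invented for illustration and are deliberately unevenly spaced, so the weighted mean does not collapse to the plain rolling mean.

```python
import numpy as np
import xarray as xr

# Illustrative, unevenly spaced data (not from the question).
da = xr.DataArray(np.array([1.0, 2.0, 4.0, 8.0, 16.0]), dims="x")

raw = xr.DataArray([0.5, 1.0, 0.5], dims=["window"])
weight = raw / raw.sum()  # normalized weights: 0.25, 0.5, 0.25

# Stack each rolling window along a new 'window' dimension.
windows = da.rolling(x=3, center=True).construct("window")

weighted_sum = windows.dot(raw)      # what the unnormalized weights give
weighted_mean = windows.dot(weight)  # weighted mean; edges stay NaN

print(weighted_sum.values)
print(weighted_mean.values)
```

The edge positions stay NaN because the padded windows there contain NaN, which propagates through the dot product.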