Question

I have a DataFrame, and I can estimate various kinds of 10-year rolling averages using the old-style rolling syntax:

`pandas.rolling_mean(df['x'], 10)`, 
`pandas.rolling_median(df['x'], 10)`

and

`pandas.rolling_apply(df['x'], 10, hodgesLehmanMean)`,

where hodgesLehmanMean is a function I wrote (see below).

def hodgesLehmanMean(x):
    # Computes the Hodges-Lehmann mean = median { (x_i + x_j)/2 }.
    # Robust to 29% outliers, with high (95%) efficiency in the Gaussian case.

    N = len(x)
    return 0.5 * numpy.median(x[i] + x[j] for i in range(N) for j in range(i+1,N))

Now that the old rolling functions have been deprecated, I am trying to rewrite my code in the new Series.rolling() style, i.e.:

`df['x'].rolling(window=10).mean()`, 
`df['x'].rolling(window=10).median()`
 and 
`df['x'].rolling(window=10).hodgesLehmanMean()`.

The first two (mean and median) work like a charm. The third (hodgesLehmanMean) does not work: it raises AttributeError: 'Rolling' object has no attribute 'hodgesLehmanMean'

How can I get my function to work with the new Series.rolling syntax?

Answer 1

You can call Rolling.apply / agg:

df['x'].rolling(window=10).agg(hodgesLehmanMean)

Also note that in your function you want to pass a list to np.median, not a generator:

def hodgesLehmanMean(x): 
    return 0.5 * np.median([x[i] + x[j] 
                           for i in range(len(x)) 
                           for j in range(i+1,len(x))])
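
Putting the pieces together, here is a minimal runnable sketch. It assumes a reasonably recent pandas in which `Rolling.apply` accepts a `raw` argument, and the sample data is made up purely for illustration. Passing `raw=True` hands each window to the function as a NumPy array, so the positional indexing `x[i]` behaves the same whatever index the DataFrame has.

```python
import numpy as np
import pandas as pd


def hodgesLehmanMean(x):
    # Half the median of all pairwise sums x[i] + x[j] with i < j.
    n = len(x)
    return 0.5 * np.median([x[i] + x[j]
                            for i in range(n)
                            for j in range(i + 1, n)])


# Made-up sample data, for illustration only.
df = pd.DataFrame({'x': np.arange(20, dtype=float)})

# raw=True passes each window as a NumPy array rather than a Series.
rolled = df['x'].rolling(window=10).apply(hodgesLehmanMean, raw=True)
```

The first nine entries of `rolled` are NaN, because the default `min_periods` equals the window size and those windows are incomplete.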

For a faster implementation of hodgesLehmanMean, take a look at one of unutbu's answers to an older question here.
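
As one possible speed-up (a sketch only, and not necessarily the approach taken in the linked answer), the pairwise sums can be built with NumPy index arrays instead of a Python-level double loop. The name `hodges_lehmann_fast` is hypothetical, and the function assumes each window contains at least two values.

```python
import numpy as np


def hodges_lehmann_fast(x):
    # Hypothetical helper name; same statistic as hodgesLehmanMean.
    x = np.asarray(x, dtype=float)
    # Index arrays for every pair (i, j) with i < j.
    i, j = np.triu_indices(len(x), k=1)
    return 0.5 * np.median(x[i] + x[j])
```

It can be used the same way as the original function, for example `df['x'].rolling(window=10).apply(hodges_lehmann_fast, raw=True)`. Note that it builds arrays of length N(N-1)/2, so memory use grows quadratically with the window size.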

Pandas: using a rolling window with a user-defined function
