How is the ewm mean calculated when there are NaN values and adjust=False?

Time: 2019-03-15 07:45:56

Tags: python pandas numba

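To optimize the pandas ewm mean calculation, I am replicating it with the numba library. However, I cannot figure out how the calculation is done when there are NaN values.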

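The documentation states the following:

When ignore_na is False (default), weights are based on absolute positions. For example, the weights of x and y used in calculating the final weighted average of ... are (1-alpha)**2 and alpha (if adjust is False).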

If I set span to 2 for [1, None, 2], this would mean the third EMA value is calculated as:

alpha = 2 / (2 + 1)
((1 - alpha)**2) * 1 + alpha * 2

which I get as 1.6666. However, the actual value returned by pandas is 1.85714286.

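What is the exact formula when there are NaN values? The formula above does not make much sense to me, because the weights are not balanced: it would make more sense for the two weights to sum to 1.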
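A quick numeric check of the expression above, as a sketch only. It assumes the two weights are divided by their sum, which is an assumption here and not something the quoted documentation spells out; the variable names are illustrative.

```python
# Worked check for the series [1, None, 2] with span = 2.
alpha = 2 / (2 + 1)

w_old = (1 - alpha) ** 2   # weight of x = 1, two positions back
w_new = alpha              # weight of y = 2 when adjust is False

# The expression exactly as written, without normalization.
unnormalized = w_old * 1 + w_new * 2

# The same weights divided by their sum (assumed normalization).
normalized = unnormalized / (w_old + w_new)

print(round(unnormalized, 4))   # 1.4444
print(round(normalized, 8))     # 1.85714286
```

Under that assumption the normalized result is 13/7, which agrees with the 1.85714286 reported above. For comparison, 1.6667 is what (1 - alpha) * 1 + alpha * 2 gives, that is, the value obtained when the missing entry is skipped entirely.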

1 Answer:

Answer 0: (score: 0)

Check the following numba/NumPy version of pandas ewm.mean(). I hope this helps.

import math

import numpy as np
from numba import jit, float64, boolean


@jit((float64[:], float64, boolean, boolean), nopython=True, nogil=True)
def _numba_ema(X, alpha, adjust, ignore_na):
    """Exponentially weighted moving average specified by a decay ``alpha``

    Reference:
    https://stackoverflow.com/questions/42869495/numpy-version-of-exponential-weighted-moving-average-equivalent-to-pandas-ewm

    Example:
        >>> ignore_na = True     # or False
        >>> adjust = True     # or False
        >>> myema = _numba_ema(X, alpha, adjust, ignore_na)
        >>> pdema = pd.Series(X).ewm(alpha=alpha, adjust=adjust, ignore_na=ignore_na).mean().values
        >>> print(np.allclose(myema, pdema, equal_nan=True))
        True

    Args:
        X (array): raw data
        alpha (float): decay factor
        adjust (boolean):
            True to divide by the decaying sum of weights
            False to use the recursive form
        ignore_na (boolean): True for decaying by relative location, False for absolute location

    Returns:
        ndarray: exponentially weighted moving average, same length as X
    """
    ewma = np.empty_like(X, dtype=float64)
    offset = 1
    w = 1
    for i, x in enumerate(X):
        if i == 0:
            ewma[i] = x
            ewma_old = x
        else:
            is_ewma_nan = math.isnan(ewma[i - 1])
            is_x_nan = math.isnan(x)
            if is_ewma_nan and is_x_nan:
                ewma[i] = np.nan
            elif is_ewma_nan:
                ewma[i] = x
                ewma_old = x
            elif is_x_nan:
                offset += 1
                ewma[i] = ewma[i - 1]
            else:
                if ignore_na:
                    if adjust:
                        w = w * (1 - alpha) + 1
                        ewma_old = ewma_old * (1 - alpha) + x
                        ewma[i] = ewma_old / w
                    else:
                        ewma[i] = ewma[i - 1] * (1 - alpha) + x * alpha
                else:
                    if adjust:
                        w = w * (1 - alpha) ** offset + 1
                        ewma_old = ewma_old * (1 - alpha) ** offset + x
                        ewma[i] = ewma_old / w
                    else:
                        ewma[i] = (ewma[i - 1] * (1 - alpha) ** offset + x * alpha) / ((1 - alpha) ** offset + alpha)
                    offset = 1
    return ewma
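The adjust=False branches of the function above can be cross-checked without numba. The following is a minimal sketch under stated assumptions, not the pandas implementation: `ewm_mean_no_adjust` is a hypothetical helper, and it assumes missing values are passed as float NaN, not None.

```python
import math


def ewm_mean_no_adjust(values, alpha, ignore_na=False):
    """Hypothetical pure-Python reference for the adjust=False case."""
    out = []
    avg = math.nan   # running weighted average
    old_wt = 1.0     # weight currently carried by the running average
    for x in values:
        if math.isnan(avg):
            # No valid history yet: seed with x (which may itself be NaN).
            avg = x
        elif math.isnan(x):
            # A gap keeps the previous average; with absolute positions
            # (ignore_na=False) the old weight still decays by one step.
            if not ignore_na:
                old_wt *= 1 - alpha
        else:
            old_wt *= 1 - alpha
            # Weighted average of old value and new observation,
            # normalized by the sum of the two weights.
            avg = (old_wt * avg + alpha * x) / (old_wt + alpha)
            old_wt = 1.0
        out.append(avg)
    return out


alpha = 2 / (2 + 1)
print(ewm_mean_no_adjust([1.0, math.nan, 2.0], alpha))
print(ewm_mean_no_adjust([1.0, math.nan, 2.0], alpha, ignore_na=True))
```

With ignore_na=False the last value is 13/7 (about 1.857), and with ignore_na=True it is 5/3 (about 1.667), because the gap no longer ages the earlier observation.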