Question

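I have implemented a function in Python that computes the autocorrelation of a time series at a given lag k. It is written under the assumption that some of the time series may not be stationary. However, for some of them I get values greater than 1, especially at the last lags, so I assume something in my calculation is wrong.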

This is what I am doing:

For the terms that belong to the lagged series, I compute the mean and standard deviation starting from lag k.

I have implemented the following Python code, which computes the autocorrelation for a specific lag k:

def custom_autocorrelation(x, lag = 12):
    n = len(x)
    std = x.std()
    mu = x.mean() 
    autocov = 0
    mu_lag = x[lag:].mean() 
    std_lag = x[lag:].std() 
    for j in range(n-lag):
        autocov += (x[j] - mu)*(x[j+lag] - mu_lag)
    autocorr = autocov/(std*std_lag*(n-lag))
    return autocorr

As an example, I tried the following series; for k = 12 I get a coefficient of 1.03:

np.array([20623., 11041.,  5686.,  2167.,  2375.,  2057.,  3141.,   504.,
         152.,  6562.,  8199., 15103., 16632.,  7190.,  6987.,  2652.,
        1949.,  2223.,  1703.,  2163.,  1850.,  6932.,  5932., 13124.,
       14846.,  7850.,  4526.,  1277.,  1036.,  1500.,  1648.,  1384.,
        1446.,  3477.,  6818., 12446.,  9734.])

Any help would be greatly appreciated!

Answer 1

I think you simply transcribed the equation incorrectly. The following part

std = x.std()
mu = x.mean()

does not match the original source. It looks like you need

std = x[: n - lag].std()
mu = x[: n - lag].mean()

which fixes the problem:

In [221]: custom_autocorrelation(a, 12)
Out[221]: 0.9569497673729846
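One way to see why the corrected version stays within [-1, 1]: once each window is centred and scaled by its own mean and standard deviation, the result is just the Pearson correlation between x[:n - lag] and x[lag:]. The sketch below checks this against np.corrcoef on the series from the question. The name windowed_autocorrelation is made up for this illustration and is not part of the original code.

```python
import numpy as np

# Series from the question.
a = np.array([20623., 11041.,  5686.,  2167.,  2375.,  2057.,  3141.,   504.,
                152.,  6562.,  8199., 15103., 16632.,  7190.,  6987.,  2652.,
               1949.,  2223.,  1703.,  2163.,  1850.,  6932.,  5932., 13124.,
              14846.,  7850.,  4526.,  1277.,  1036.,  1500.,  1648.,  1384.,
               1446.,  3477.,  6818., 12446.,  9734.])


def windowed_autocorrelation(x, lag):
    """Hypothetical helper: correlate the leading and lagged windows,
    each centred and scaled with its own statistics."""
    head = x[:len(x) - lag]   # x[0], ..., x[n-lag-1]
    tail = x[lag:]            # x[lag], ..., x[n-1]
    cov = np.mean((head - head.mean()) * (tail - tail.mean()))
    return cov / (head.std() * tail.std())


r_manual = windowed_autocorrelation(a, 12)
# Pearson correlation of the same two windows; the ddof used internally
# cancels out in the ratio, so the two values should agree.
r_pearson = np.corrcoef(a[:-12], a[12:])[0, 1]
print(r_manual, r_pearson)
```

In the original function, the first factor was centred with the mean of the whole series and scaled with the standard deviation of the whole series, so the ratio was no longer a correlation coefficient and nothing forced it to stay below 1.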

I also borrowed some ideas from my previous answer to speed up the computation considerably:

def modified_acorr(ts, lag):
    """An autocorrelation estimation as per
    http://itfeature.com/time-series-analysis-and-forecasting/autocorrelation-time-series-data

    Args:
        ts (np.ndarray): series
        lag (int): the lag

    Returns:
        float: The autocorrelation
    """
    return (
        (ts[:ts.size - lag] - ts[:ts.size - lag].mean()) *
        (ts[lag:] - ts[lag:].mean())
    ).ravel().mean() / (ts[lag:].std() * ts[:ts.size - lag].std())

Compared with the regular autocorrelation function, we get a similar answer:

In [197]: modified_acorr(a, 12)
Out[197]: 0.9569497673729849

In [218]: acorr(a, a.mean(), 12) / acorr(a, a.mean(), 0)  # normalisation
Out[218]: 0.9201920561073853
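The acorr function used above comes from the previous answer and is not shown in this thread, so its exact normalisation is unknown here. For comparison, a minimal sketch of the textbook sample autocorrelation, which uses the global mean and divides by the lag-0 sum of squares, might look like the following. The name sample_acf and the toy seasonal series are assumptions made for this illustration, and the result may differ from acorr by a normalisation factor.

```python
import numpy as np


def sample_acf(x, lag):
    """Textbook sample autocorrelation at a given lag (illustrative sketch).

    Uses the global mean for both factors and normalises by the lag-0
    sum of squared deviations, which assumes a stationary series.
    """
    x = np.asarray(x, dtype=float)
    d = x - x.mean()                       # deviations from the global mean
    return np.sum(d[:x.size - lag] * d[lag:]) / np.sum(d * d)


# Hypothetical seasonal series with period 12, used only for demonstration.
t = np.arange(48)
series = 10.0 + 5.0 * np.sin(2.0 * np.pi * t / 12.0)
print(sample_acf(series, 12))
```

This estimator is bounded by 1 in absolute value, but because it sums only n - lag products and divides by a sum over all n points, it shrinks towards zero at large lags. The windowed version instead treats each window with its own mean and standard deviation, which is the choice the question makes for series that may not be stationary.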

Autocorrelation of a non-stationary time series
