Question

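I am testing some autocorrelation implementations found on SO. I came across two great answers, one by Joe Kington and one by unutbu. The two are very similar except for the normalization they use.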

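The former normalizes by the max value, while the latter normalizes by the variance. For uniformly distributed random data the results are quite different, as shown below.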

import numpy as np
from statsmodels.tsa.stattools import acf
import matplotlib.pyplot as plt

def acorr_unutbu(x):
    x = x - x.mean()
    autocorr = np.correlate(x, x, mode='full')[-x.size:]
    # Normalization
    autocorr /= (x.var() * (np.arange(x.size, 0, -1)))
    return autocorr

def acorr_joe(x):
    x = x - x.mean()
    # The original answer uses [x.size:], I changed it to [-x.size:] to match
    # the size of the other function
    autocorr = np.correlate(x, x, mode='full')[-x.size:]
    # Normalization
    autocorr /= autocorr.max()
    return autocorr

N = 1000
data = np.random.rand(N)

ac_joe = acorr_joe(data)
ac_unubtu = acorr_unutbu(data)

fig, axes = plt.subplots(nrows=2)
axes[0].plot(ac_joe, label="joe")
axes[0].legend()
axes[1].plot(ac_unubtu, c='r', label="unutbu")
axes[1].legend()
plt.show()
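
As a side check (a minimal sketch, not taken from either original answer), the two normalizations can be compared directly using NumPy alone. The helper name `raw_autocorr` and the fixed seed are my own choices for illustration. The idea is that the lag-0 value of the raw correlation equals N times the variance, so the two functions should differ only by the factor (N - k) / N at lag k.

```python
import numpy as np

def raw_autocorr(x):
    """Unnormalized autocorrelation sums of the demeaned series, lags 0..N-1."""
    x = x - x.mean()
    return np.correlate(x, x, mode='full')[-x.size:]

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
x = rng.random(1000)
n = x.size
lags_k = np.arange(n)

raw = raw_autocorr(x)
by_max = raw / raw.max()                  # normalization used in acorr_joe
by_var = raw / (x.var() * (n - lags_k))   # normalization used in acorr_unutbu

# The lag-0 sum equals N * var, and it is also the maximum of the raw sums
print(np.isclose(raw[0], n * x.var()))
print(np.isclose(raw.max(), raw[0]))
# Hence the two results differ only by the factor (N - k) / N
print(np.allclose(by_max, by_var * (n - lags_k) / n))
```

If these checks hold, dividing by the max amounts to dividing every lag by N * var, whereas the variance version divides lag k by (N - k) * var, which inflates the values at large lags where only a few terms enter the sum.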

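I can compare both functions with the statsmodels autocorrelation function acf, which suggests that Joe's answer (with the small modification shown in the code above) is probably using the correct normalization.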

# Compare with the statsmodels acf values.
# nlags=N - 1 requests lags 0..N-1, matching the length of the arrays above
lags = acf(data, nlags=N - 1)

fig, axes = plt.subplots(nrows=2)
axes[0].plot(ac_joe - lags, label="joe - SM")
axes[0].set_ylim(-.5, .5)
axes[0].legend()
axes[1].plot(ac_unubtu - lags, c='r', label="unutbu - SM")
axes[1].set_ylim(-.5, .5)
axes[1].legend()
plt.show()

What is the reason for using different normalization values in these two autocorrelation functions?
