Specifying the lag in numpy.correlate

Time: 2012-02-21 17:30:18

Tags: python numpy

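Matlab's cross-correlation function xcorr(x,y,maxlags) has a maxlags option that returns the cross-correlation sequence over the lag range [-maxlags:maxlags]. NumPy's numpy.correlate(N, M, mode) has three modes, but none of them lets me choose a specific lag range; the only output lengths available are full (N+M-1), same (max(M, N)) and valid (max(M, N) - min(M, N) + 1). With len(N) = 60000 and len(M) = 200, I want to set the maximum lag to 100.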
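For reference, the three output lengths quoted in the question can be checked with a small sketch. The arrays `a` and `v` are short stand-ins chosen for illustration, not the 60000- and 200-sample series from the question.

```python
import numpy as np

a = np.arange(10.0)   # stands in for the long series N
v = np.arange(4.0)    # stands in for the short series M

# Length of the output for each mode of numpy.correlate.
lengths = {mode: len(np.correlate(a, v, mode=mode))
           for mode in ('full', 'same', 'valid')}
print(lengths)
```

None of the three modes takes a lag argument, so a restricted lag range has to be obtained by slicing the output afterwards.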

3 answers:

Answer 0 (score: 1)

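Here is my implementation of lead-lag correlation. It is limited to 1-D input and is not guaranteed to be the most efficient approach. It uses scipy.stats.pearsonr for the core computation, so it returns the p-value of each coefficient as well. Treat it as a straw man and modify or optimize it as needed.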

def lagcorr(x,y,lag=None,verbose=True):
    '''Compute lead-lag correlations between 2 time series.

    <x>,<y>: 1-D time series.
    <lag>: lag option; it can take several forms:
          if 0 or None, compute ordinary correlation and p-value;
          if positive integer, compute lagged correlations for lags
          0 up to <lag>;
          if negative integer, compute lead correlations for leads
          0 up to <-lag>;
          if a list, tuple or array of integers, compute
          lead/lag correlations at those leads/lags.

    Note: when talking about lead/lag, uses <y> as a reference.
    Therefore positive lag means <x> lags <y> by <lag>, computation is
    done by shifting <x> to the left hand side by <lag> with respect to
    <y>.
    Similarly negative lag means <x> leads <y> by <lag>, computation is
    done by shifting <x> to the right hand side by <lag> with respect to
    <y>.

    Return <result>: an (n, 2) array; the 1st column holds the correlation
    coefficients, the 2nd column the corresponding p-values.

    Currently only works for 1-D arrays.
    '''

    import numpy
    from scipy.stats import pearsonr

    if len(x)!=len(y):
        raise ValueError('Input variables of different lengths.')

    #--------Unify types of <lag>-------------
    if numpy.isscalar(lag):
        if abs(lag)>=len(x):
            raise ValueError('Maximum lag equal or larger than array.')
        if lag<0:
            lag=-numpy.arange(abs(lag)+1)
        elif lag==0:
            lag=[0,]
        else:
            lag=numpy.arange(lag+1)    
    elif lag is None:
        lag=[0,]
    else:
        lag=numpy.asarray(lag)

    #-------Loop over lags---------------------
    result=[]
    if verbose:
        print('\n#<lagcorr>: Computing lagged-correlations at lags:', lag)

    for ii in lag:
        if ii<0:
            result.append(pearsonr(x[:ii],y[-ii:]))
        elif ii==0:
            result.append(pearsonr(x,y))
        elif ii>0:
            result.append(pearsonr(x[ii:],y[:-ii]))

    result=numpy.asarray(result)

    return result
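The sign convention in the docstring is easy to get backwards, so here is a minimal sketch that checks it on synthetic data. `lagged_pearson` is a hypothetical helper written for this illustration. It uses the same slicing as lagcorr but calls numpy.corrcoef, so it returns only the coefficients, without p-values.

```python
import numpy as np

def lagged_pearson(x, y, lags):
    """Pearson r between x and y at each integer lag (hypothetical helper).

    Same slicing convention as lagcorr: a positive lag pairs x[t + lag]
    with y[t], i.e. x lags y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = []
    for k in lags:
        if k > 0:
            xs, ys = x[k:], y[:-k]
        elif k < 0:
            xs, ys = x[:k], y[-k:]
        else:
            xs, ys = x, y
        out.append(np.corrcoef(xs, ys)[0, 1])
    return np.array(out)

rng = np.random.default_rng(0)
y = rng.standard_normal(500)
x = np.roll(y, 3)                # x is y delayed by 3 samples (with wrap-around)
lags = np.arange(-5, 6)
r = lagged_pearson(x, y, lags)
best_lag = lags[np.argmax(r)]    # lag with the highest correlation
print(best_lag)
```

Because x is a copy of y delayed by three samples, the correlation should peak at lag +3, which matches the statement that a positive lag means x lags y.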

Answer 1 (score: 0)

I suggest looking at this file to see how to implement the correlation described here.

Answer 2 (score: 0)

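matplotlib.pyplot.xcorr has a maxlags parameter. It is actually a wrapper around numpy.correlate, so there is no performance gain. It does, however, give exactly the same result as Matlab's cross-correlation function. Below I have edited the code from matplotlib so that it returns only the correlation. The reason is that matplotlib.pyplot.xcorr, used as is, also draws and returns the plot. If we pass it arguments of complex data type, we get a "casting complex to real datatype" warning when matplotlib tries to draw the plot.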

<!-- language: python -->

import numpy as np

def xcorr(x, y, maxlags=10):
    Nx = len(x)
    if Nx != len(y):
        raise ValueError('x and y must be equal length')

    c = np.correlate(x, y, mode='full')

    if maxlags is None:
        maxlags = Nx - 1

    if maxlags >= Nx or maxlags < 1:
        raise ValueError('maxlags must be None or strictly positive < %d' % Nx)

    c = c[Nx - 1 - maxlags:Nx + maxlags]

    return c
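The function above requires x and y to have equal length, whereas the question has series of 60000 and 200 samples. A possible generalization is sketched below. `xcorr_maxlags` is a hypothetical helper, not part of NumPy or matplotlib. It relies on the fact that in np.correlate(a, v, mode='full') the zero-lag term sits at index len(v) - 1. Like the function above, it still computes the full correlation before slicing, so it saves no computation. The random data are stand-ins for the question's series.

```python
import numpy as np

def xcorr_maxlags(a, v, maxlags):
    """Return (lags, c) with c[i] = sum_n a[n + lags[i]] * conj(v[n]).

    Hypothetical helper: slices np.correlate(a, v, 'full') around zero lag,
    which sits at index len(v) - 1 of the full output.
    """
    a = np.asarray(a)
    v = np.asarray(v)
    if not 0 <= maxlags <= min(len(a), len(v)) - 1:
        raise ValueError('maxlags must be between 0 and min(len(a), len(v)) - 1')
    full = np.correlate(a, v, mode='full')
    zero = len(v) - 1                      # index of the zero-lag term
    lags = np.arange(-maxlags, maxlags + 1)
    return lags, full[zero - maxlags:zero + maxlags + 1]

rng = np.random.default_rng(0)
N = rng.standard_normal(60000)             # stand-in for the long series
M = rng.standard_normal(200)               # stand-in for the short series
lags, c = xcorr_maxlags(N, M, 100)
print(len(c), lags[0], lags[-1])
```

For equal-length inputs the zero-lag index reduces to Nx - 1, so the slice is the same one used in the function above.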