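Window overlap in pandas

Time: 2013-08-15 05:47:12

Tags: python numpy pandas

In pandas, there are several methods for processing data in a given window (e.g. pd.rolling_mean, pd.rolling_std, etc.). However, I would like to set a window overlap, which I think is a pretty standard requirement. For example, in the image below you can see a window that spans 256 samples and overlaps by 128 samples.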

http://health.tau.ac.il/Communication%20Disorders/noam/speech/mistorin/images/hamming_overlap1.JPG

How can I do this using the optimized methods included in Pandas or Numpy?

2 answers:

Answer 0: (score: 6)

Using as_strided, you could do something like this:

import numpy as np
from numpy.lib.stride_tricks import as_strided

def windowed_view(arr, window, overlap):
    arr = np.asarray(arr)
    window_step = window - overlap
    new_shape = arr.shape[:-1] + ((arr.shape[-1] - overlap) // window_step,
                                  window)
    new_strides = (arr.strides[:-1] + (window_step * arr.strides[-1],) +
                   arr.strides[-1:])
    return as_strided(arr, shape=new_shape, strides=new_strides)

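If you pass a 1D array to the function above, it returns a 2D view into that array with shape (number_of_windows, window_size), so you can compute, for example, the windowed mean as: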

win_avg = np.mean(windowed_view(arr, win_size, win_overlap), axis=-1)

For example:

>>> a = np.arange(16)
>>> windowed_view(a, 4, 2)
array([[ 0,  1,  2,  3],
       [ 2,  3,  4,  5],
       [ 4,  5,  6,  7],
       [ 6,  7,  8,  9],
       [ 8,  9, 10, 11],
       [10, 11, 12, 13],
       [12, 13, 14, 15]])
>>> windowed_view(a, 4, 1)
array([[ 0,  1,  2,  3],
       [ 3,  4,  5,  6],
       [ 6,  7,  8,  9],
       [ 9, 10, 11, 12],
       [12, 13, 14, 15]])
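
On newer NumPy versions (1.20 and later, as far as I know), numpy.lib.stride_tricks.sliding_window_view builds the same kind of view without computing strides by hand. The following is only a sketch under that assumption; the helper name windowed_view_swv is made up for this illustration and is not part of NumPy or of the answer above:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def windowed_view_swv(arr, window, overlap):
    """Hypothetical helper: overlapping windows along the last axis."""
    arr = np.asarray(arr)
    step = window - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than window")
    # Build every window with a step of 1 (a read-only view),
    # then keep only every `step`-th window.
    return sliding_window_view(arr, window, axis=-1)[..., ::step, :]


a = np.arange(16)
print(windowed_view_swv(a, 4, 2))                # same rows as windowed_view(a, 4, 2)
print(windowed_view_swv(a, 4, 2).mean(axis=-1))  # per-window mean
```

One difference worth noting: the view returned by sliding_window_view is read-only by default, which guards against accidentally writing through overlapping windows.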

Answer 1: (score: 1)

I'm not familiar with pandas, but in numpy you would do something like this (untested):

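from numpy import hanning as hann  # assumption: the answer does not say where hann comes from

def overlapped_windows(x, nwin, noverlap=None):
    if noverlap is None:
        noverlap = nwin // 2  # default: half-overlapping windows
    step = nwin - noverlap
    for i in range(0, len(x) - nwin + 1, step):
        window = x[i:i+nwin]  # this is a view, not a copy
        y = window * hann(nwin)  # new array; x is left unchanged
        # your code here with y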

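This was pulled from some old code for computing an averaged PSD, which you typically process with half-overlapping windows. Note that window is a 'view' into the array x, which means it does not copy any data (very fast, so probably a good thing), and that if you modify window you also modify x (so don't modify window in place).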
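
To make the averaged-PSD use case concrete, here is one way the loop could be turned into a generator and fed into a simple averaged periodogram. This is a sketch, not the original code: np.hanning stands in for the unspecified hann function, averaged_periodogram is a name invented for this example, and the result is left unnormalized (no scaling by sampling rate or window power).

```python
import numpy as np


def overlapped_windows(x, nwin, noverlap=None):
    """Yield tapered, overlapping windows of a 1D array (illustrative variant)."""
    x = np.asarray(x)
    if noverlap is None:
        noverlap = nwin // 2  # default: half-overlapping windows
    step = nwin - noverlap
    taper = np.hanning(nwin)  # assumption: stands in for the answer's hann()
    for i in range(0, len(x) - nwin + 1, step):
        # x[i:i + nwin] is a view; the product is a new array, so x is untouched.
        yield x[i:i + nwin] * taper


def averaged_periodogram(x, nwin, noverlap=None):
    """Mean of |FFT|^2 over all windows; unnormalized, for illustration only."""
    spectra = [np.abs(np.fft.rfft(y)) ** 2
               for y in overlapped_windows(x, nwin, noverlap)]
    return np.mean(spectra, axis=0)


rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)
psd = averaged_periodogram(signal, 256)
print(psd.shape)  # (129,)
```

Multiplying the view by the taper allocates a new array for each window, so the input signal is never modified, which is the behaviour the note about views asks for.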