Question

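Problem statement

I have a long signal (454912 samples) and I would like to compute an estimate of the amount of 50Hz in it. Speed is more important than precision. The amount of 50Hz is expected to fluctuate over time. The value needs to be representative of the entire recording, for example the mean.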

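Context

The signal is recorded from an EEG electrode. When an EEG electrode has poor contact with the scalp, the signal contains a large amount of 50Hz power line noise. I want to discard all data from EEG electrodes that have much more 50Hz noise than the other electrodes.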

Attempted solutions

Solving the problem is not that hard. Anything from an FFT to Welch's method can be used to estimate the power spectrum:

import numpy as np
from scipy.signal import welch

# generate an example signal
sfreq = 128.
nsamples = 454912
time = np.arange(nsamples) / sfreq
x = np.sin(2 * np.pi * 50 * time) + np.random.randn(nsamples)

# apply Welch's method to the problem
fs, ps = welch(x, sfreq)
print('Amount of 50Hz:', ps[np.searchsorted(fs, 50)])

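However, computing the power at all frequencies seems unnecessary here, and I have the feeling there is a more efficient solution. Something along the lines of computing a single DFT step? Convolving with some sinusoidal wavelet?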

Answer 1

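Welch's method just computes the periodogram for multiple overlapping segments of the signal, then takes the average across segments. This effectively trades frequency resolution for noise reduction in the frequency domain.

However, performing lots of separate FFTs on small segments will be more expensive than computing fewer FFTs on larger segments. Depending on your needs you could just use Welch's method, but split your signal into larger segments and/or have less overlap between them (both of which will give you less variance reduction in your PSD).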

from matplotlib import pyplot as plt

# default parameters
fs1, ps1 = welch(x, sfreq, nperseg=256, noverlap=128)

# 8x the segment size, keeping the proportional overlap the same
fs2, ps2 = welch(x, sfreq, nperseg=2048, noverlap=1024)

# no overlap between the segments
fs3, ps3 = welch(x, sfreq, nperseg=2048, noverlap=0)

fig, ax1 = plt.subplots(1, 1)
ax1.loglog(fs1, ps1, label='Welch, defaults')
ax1.loglog(fs2, ps2, label='length=2048, overlap=1024')
ax1.loglog(fs3, ps3, label='length=2048, overlap=0')
ax1.legend(loc=2, fancybox=True)

Increasing the segment size and reducing the amount of overlap can improve performance significantly:

In [1]: %timeit welch(x, sfreq)
1 loops, best of 3: 262 ms per loop

In [2]: %timeit welch(x, sfreq, nperseg=2048, noverlap=1024)
10 loops, best of 3: 46.4 ms per loop

In [3]: %timeit welch(x, sfreq, nperseg=2048, noverlap=0)
10 loops, best of 3: 23.2 ms per loop

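Note that it's a good idea to use a power of 2 for the window size, since it's faster to take the FFT of a signal whose length is a power of 2.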
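
The segment length also sets the frequency resolution, which determines whether 50Hz falls exactly on a bin. The following is a minimal sketch, assuming the 128Hz sample rate from the question; `nearest_bin` is a hypothetical helper, not part of SciPy. It uses `np.fft.rfftfreq`, which should give the same bin centers that `welch` returns for real input.

```python
import numpy as np

def nearest_bin(target, sfreq, nperseg):
    """Return (index, frequency) of the one-sided FFT bin closest to target.

    Hypothetical helper for illustration only.
    """
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / sfreq)  # bin centers in Hz
    resolution = sfreq / nperseg                     # bin spacing in Hz
    idx = int(round(target / resolution))
    return idx, float(freqs[idx])

sfreq = 128.0  # sample rate assumed from the question
for nperseg in (256, 2048):
    idx, freq = nearest_bin(50.0, sfreq, nperseg)
    print(nperseg, 'samples per segment -> bin', idx, 'at', freq, 'Hz')
```

With either of these power-of-2 lengths the bin spacing divides 50Hz evenly, so no interpolation between bins is needed.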

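Update

Another simple thing you might consider trying is bandpass filtering your signal with a narrow filter centered on 50Hz. The envelope of the filtered signal will give you a measure of how much 50Hz power your signal contains over time.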

from scipy.signal import iirfilter, filtfilt

# a signal whose power at 50Hz varies over time
sfreq = 128.
nsamples = 454912
time = np.arange(nsamples) / sfreq
sinusoid = np.sin(2 * np.pi * 50 * time)
pow50hz = np.zeros(nsamples)
# integer division, so that the slice bounds are ints on Python 3
pow50hz[nsamples // 4: 3 * nsamples // 4] = 1
x = pow50hz * sinusoid + np.random.randn(nsamples)

# Chebyshev type II bandpass filter centered on 50Hz
# (iirfilter defaults to btype='band')
nyquist = sfreq / 2.
b, a = iirfilter(3, (49. / nyquist, 51. / nyquist), rs=10,
                 ftype='cheby2')

# filter the signal (forwards and backwards, so there is no phase shift)
xfilt = filtfilt(b, a, x)

fig, ax2 = plt.subplots(1, 1)
ax2.plot(time[::10], x[::10], label='Raw signal')
ax2.plot(time[::10], xfilt[::10], label='50Hz bandpass-filtered')
ax2.set_xlim(time[0], time[-1])
ax2.set_xlabel('Time')
ax2.legend(fancybox=True)
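
The code above filters the signal but stops short of computing the envelope. One common way to get it is the magnitude of the analytic signal from `scipy.signal.hilbert`. The following is a minimal sketch under that assumption, on a shorter synthetic signal than the one in the question to keep it quick; the names `gate`, `mean_on`, `mean_off` and `summary` are illustrative only.

```python
import numpy as np
from scipy.signal import iirfilter, filtfilt, hilbert

sfreq = 128.0
nsamples = 2 ** 14  # shorter than the real recording, for a quick demo
time = np.arange(nsamples) / sfreq

# 50Hz component present only in the middle half of the signal
gate = np.zeros(nsamples)
gate[nsamples // 4: 3 * nsamples // 4] = 1.0
rng = np.random.default_rng(0)
x = gate * np.sin(2 * np.pi * 50 * time) + rng.standard_normal(nsamples)

# same narrow bandpass design as in the answer
nyquist = sfreq / 2.0
b, a = iirfilter(3, (49.0 / nyquist, 51.0 / nyquist), rs=10, ftype='cheby2')
xfilt = filtfilt(b, a, x)

# envelope = magnitude of the analytic signal
envelope = np.abs(hilbert(xfilt))

mean_on = envelope[gate == 1].mean()   # where the 50Hz tone is present
mean_off = envelope[gate == 0].mean()  # noise only
summary = envelope.mean()              # one number for the whole recording

print('envelope with tone:', mean_on)
print('envelope without tone:', mean_off)
print('whole-recording mean:', summary)
```

The whole-recording mean of the envelope is one candidate for the single representative value the question asks for, although this route is likely slower than the single-bin approaches.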

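Update 2

Having seen @hotpaw2's answer, I decided to have a go at implementing the Goertzel algorithm, just for fun. Unfortunately it's a recursive algorithm (and therefore can't be vectorized over time), so I decided to write myself a Cython version: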

# cython: boundscheck=False
# cython: wraparound=False
# cython: cdivision=True

from libc.math cimport cos, M_PI

cpdef double goertzel(double[:] x, double ft, double fs=1.):
    """
    The Goertzel algorithm is an efficient method for evaluating single terms
    in the Discrete Fourier Transform (DFT) of a signal. It is particularly
    useful for measuring the power of individual tones.

    Arguments
    ----------
        x   double array [nt,]; the signal to be decomposed
        ft  double scalar; the target frequency at which to evaluate the DFT
        fs  double scalar; the sample rate of x (same units as ft, default=1)

    Returns
    ----------
        p   double scalar; the power (squared magnitude) of the DFT
            coefficient corresponding to ft

    See: <http://en.wikipedia.org/wiki/Goertzel_algorithm>
    """

    cdef:
        double s
        double s_prev = 0
        double s_prev2 = 0
        double coeff = 2 * cos(2 * M_PI * (ft / fs))
        Py_ssize_t N = x.shape[0]
        Py_ssize_t ii

    for ii in range(N):
        s = x[ii] + (coeff * s_prev) - s_prev2
        s_prev2 = s_prev
        s_prev = s

    return s_prev2 * s_prev2 + s_prev * s_prev - coeff * s_prev * s_prev2

Here's what it does:

freqs = np.linspace(49, 51, 1000)
pows = np.array([goertzel(x, ff, sfreq) for ff in freqs])

fig, ax = plt.subplots(1, 1)
ax.plot(freqs, pows, label='DFT coefficients')
ax.set_xlabel('Frequency (Hz)')
ax.legend(loc=1)

And it's very fast:

In [1]: %timeit goertzel(x, 50, sfreq)
1000 loops, best of 3: 1.98 ms per loop

Obviously this approach only makes sense if you are interested in a single frequency rather than a range of frequencies.
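
For readers without a Cython toolchain, the same recursion can be written in plain Python and checked against an FFT bin. This is a minimal sketch, not the timed implementation above: `goertzel_power` is an illustrative name, the loop will be far slower than the Cython version, and the two results are only expected to agree when the target frequency falls exactly on a DFT bin.

```python
import numpy as np

def goertzel_power(x, ft, fs=1.0):
    """Squared magnitude of the DFT of x at frequency ft (pure Python sketch)."""
    coeff = 2.0 * np.cos(2.0 * np.pi * ft / fs)
    s_prev = 0.0
    s_prev2 = 0.0
    for sample in x:
        # second-order recursion, one multiply per sample
        s = sample + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev2 * s_prev2 + s_prev * s_prev - coeff * s_prev * s_prev2

fs = 128.0
n = 1024
t = np.arange(n) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 50 * t) + rng.standard_normal(n)

k = 400  # bin index of 50Hz: 50 * n / fs
fft_power = np.abs(np.fft.rfft(x)[k]) ** 2
g_power = goertzel_power(x, 50.0, fs)

print('FFT bin power:  ', fft_power)
print('Goertzel power: ', g_power)
```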

Answer 2

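For a single sinusoidal frequency, you can use the Goertzel algorithm or a Goertzel filter, which is a computationally efficient way to calculate the magnitude of a single bin of a DFT or FFT result.

You can run this filter over the entire waveform, or use it in conjunction with Welch's method by summing the magnitude outputs of a series of Goertzel filters, choosing the filter length so that the filter bandwidth isn't too narrow (to cover possible slight variations of the 50 Hz frequency relative to your sample rate).

A Goertzel filter is often used in conjunction with a power estimator to determine whether the SNR at the selected frequency is valid.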
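
The segment-and-combine idea can be sketched as follows. Note the assumptions: this does not run the Goertzel recursion itself but evaluates the same single DFT bin per segment as a dot product with a complex exponential, it uses non-overlapping rectangular segments, it drops any trailing samples, and it averages squared magnitudes rather than summing magnitudes. `segmented_bin_power` is a hypothetical name.

```python
import numpy as np

def segmented_bin_power(x, ft, fs, nperseg):
    """Mean power of one DFT bin over non-overlapping segments (sketch)."""
    x = np.asarray(x, dtype=float)
    nseg = x.size // nperseg
    # drop trailing samples that do not fill a whole segment
    segments = x[:nseg * nperseg].reshape(nseg, nperseg)
    n = np.arange(nperseg)
    probe = np.exp(-2j * np.pi * ft * n / fs)  # single-frequency DFT kernel
    coeffs = segments @ probe                  # one complex coefficient per segment
    return np.mean(np.abs(coeffs) ** 2)

fs = 128.0
nsamples = 2 ** 15
t = np.arange(nsamples) / fs
rng = np.random.default_rng(0)
noise = rng.standard_normal(nsamples)
with_tone = np.sin(2 * np.pi * 50 * t) + noise

p_tone = segmented_bin_power(with_tone, 50.0, fs, nperseg=256)
p_noise = segmented_bin_power(noise, 50.0, fs, nperseg=256)
print('with 50Hz tone:', p_tone)
print('noise only:    ', p_noise)
```

Shorter segments widen the effective bandwidth around 50Hz, which is the trade-off described above.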