Invertible STFT and ISTFT in Python

Asked: 2010-03-17 01:12:29

Tags: python scipy fft signal-processing

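Is there a general-purpose form of the short-time Fourier transform, with a corresponding inverse transform, built into SciPy or NumPy or anything else?

matplotlib has the pyplot specgram function, which calls ax.specgram(), which calls mlab.specgram(), which calls _spectral_helper():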

#The checks for if y is x are so that we can use the same function to
#implement the core of psd(), csd(), and spectrogram() without doing
#extra calculations.  We return the unaveraged Pxy, freqs, and t.

  

This is a helper function that implements the commonality between psd, csd, and spectrogram. It is NOT meant to be used outside of mlab.

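I'm not sure whether this can be used to do an STFT and ISTFT, though. Is there anything else, or should I translate something like these MATLAB functions?

I know how to write my own ad-hoc implementation; I'm just looking for something full-featured: it should handle different windowing functions (but have a sane default), be fully invertible with COLA windows (istft(stft(x))==x), be tested by multiple people, have no off-by-one errors, handle the ends and zero padding well, use a fast RFFT implementation for real input, etc.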

9 answers:

Answer 0 (score: 60)

Here is my Python code, simplified for this answer:

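import numpy as np
import scipy, pylab  # pylab is used for plotting in the test below

def stft(x, fs, framesz, hop):
    # framesz and hop are given in seconds; convert them to samples.
    framesamp = int(framesz*fs)
    hopsamp = int(hop*fs)
    w = np.hanning(framesamp)
    # Window each frame and take its FFT; rows are frames, columns are bins.
    X = np.array([np.fft.fft(w*x[i:i+framesamp])
                  for i in range(0, len(x)-framesamp, hopsamp)])
    return X

def istft(X, fs, T, hop):
    # T is the signal duration in seconds.
    x = np.zeros(int(T*fs))
    framesamp = X.shape[1]
    hopsamp = int(hop*fs)
    # Naive overlap-add of the inverse FFT of each frame.
    for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)):
        x[i:i+framesamp] += np.real(np.fft.ifft(X[n]))
    return x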

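Notes:

  1. The list comprehension is a little trick I like to use to simulate block processing of signals in numpy/scipy. It is like blkproc in Matlab. Instead of a for loop, I apply a command (e.g., fft) to each frame of the signal inside a list comprehension, and the array constructor then turns the result into a 2D array. I use this to make spectrograms, chromagrams, MFCC-grams, and much more.
  2. For this example, I use a naive overlap-and-add method in istft. In order to reconstruct the original signal, the sum of the sequential window functions must be constant, preferably equal to unity (1.0). In this case, I've chosen the Hann (or hanning) window and a 50% overlap, which works perfectly. See this discussion for more information.
  3. There are probably more principled ways of computing the ISTFT. This example is mainly meant to be educational.
  4. A test: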

    if __name__ == '__main__':
        import numpy as np

        f0 = 440         # Compute the STFT of a 440 Hz sinusoid
        fs = 8000        # sampled at 8 kHz
        T = 5            # lasting 5 seconds
        framesz = 0.050  # with a frame size of 50 milliseconds
        hop = 0.025      # and hop size of 25 milliseconds.

        # Create test signal and STFT.
        t = np.linspace(0, T, T*fs, endpoint=False)
        x = np.sin(2*np.pi*f0*t)
        X = stft(x, fs, framesz, hop)

        # Plot the magnitude spectrogram.
        pylab.figure()
        pylab.imshow(np.absolute(X.T), origin='lower', aspect='auto',
                     interpolation='nearest')
        pylab.xlabel('Time')
        pylab.ylabel('Frequency')
        pylab.show()

        # Compute the ISTFT.
        xhat = istft(X, fs, T, hop)

        # Plot the input and output signals over 0.1 seconds.
        T1 = int(0.1*fs)

        # First 0.1 seconds.
        pylab.figure()
        pylab.plot(t[:T1], x[:T1], t[:T1], xhat[:T1])
        pylab.xlabel('Time (seconds)')

        # Last 0.1 seconds.
        pylab.figure()
        pylab.plot(t[-T1:], x[-T1:], t[-T1:], xhat[-T1:])
        pylab.xlabel('Time (seconds)')
        pylab.show()
    

    [Figures: STFT of 440 Hz sinusoid; ISTFT of beginning of 440 Hz sinusoid; ISTFT of end of 440 Hz sinusoid]
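The constant-sum condition in note 2 can be checked numerically. The following is only a minimal sketch: the helper name overlap_sum and the frame count are made up for illustration, and the frame and hop lengths are the 50 ms and 25 ms values from the test at 8 kHz. It adds up shifted copies of a Hann window at 50% overlap and measures how far the sum strays from a constant, once for the symmetric window returned by np.hanning(n) and once for the periodic variant np.hanning(n + 1)[:-1].

```python
import numpy as np

def overlap_sum(w, hop, n_frames=8):
    """Sum n_frames copies of window w, each shifted by hop samples."""
    n = len(w)
    total = np.zeros(hop * (n_frames - 1) + n)
    for k in range(n_frames):
        total[k * hop:k * hop + n] += w
    # Drop the ramp-up and ramp-down regions where fewer frames overlap.
    return total[n - hop:n_frames * hop]

n, hop = 400, 200                      # 50 ms frames, 25 ms hop at 8 kHz
symmetric = overlap_sum(np.hanning(n), hop)
periodic = overlap_sum(np.hanning(n + 1)[:-1], hop)

# Peak-to-peak deviation of the summed windows from a constant.
ripple_symmetric = symmetric.max() - symmetric.min()
ripple_periodic = periodic.max() - periodic.min()
print(ripple_symmetric, ripple_periodic)
```

Under these assumptions the periodic window sums to a constant up to rounding error, while the symmetric one leaves a small ripple (well below one percent here). That is small enough to be invisible in the plots, and it may be the reason a later answer builds its window with the +1)[:-1] trick.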

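Answer 1 (score: 9)

Here is the STFT code that I use. STFT + ISTFT here gives perfect reconstruction (even for the first frames). I slightly modified the code given by Steve Tjoa: here the magnitude of the reconstructed signal is the same as that of the input signal.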

import numpy as np

def stft(x, fftsize=1024, overlap=4):
    hop = fftsize // overlap               # integer hop so it can be used in range()
    w = np.hanning(fftsize+1)[:-1]         # better reconstruction with this trick +1)[:-1]
    return np.array([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)])

def istft(X, overlap=4):
    fftsize = (X.shape[1]-1)*2
    hop = fftsize // overlap
    w = np.hanning(fftsize+1)[:-1]
    x = np.zeros(X.shape[0]*hop)
    wsum = np.zeros(X.shape[0]*hop)
    # Note: this range covers fewer frames than X holds, so the tail of x is left at zero.
    for n,i in enumerate(range(0, len(x)-fftsize, hop)):
        x[i:i+fftsize] += np.fft.irfft(X[n]) * w   # overlap-add
        wsum[i:i+fftsize] += w ** 2.
    pos = wsum != 0
    x[pos] /= wsum[pos]                    # normalise by the summed squared window
    return x
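The amplitude match comes from windowing twice (once in the STFT, once in the ISTFT) and then dividing by the summed squared window, so each sample becomes sum(w^2 * x) / sum(w^2) = x wherever that sum is nonzero. Below is a self-contained round-trip sketch of that idea. It is not the answer's code: the names stft_frames and istft_frames, the explicit hop argument, and the output-length formula are assumptions made for illustration so that every frame is used.

```python
import numpy as np

def stft_frames(x, n_fft=1024, hop=256):
    w = np.hanning(n_fft + 1)[:-1]                 # periodic Hann window
    starts = range(0, len(x) - n_fft + 1, hop)     # include the last full frame
    return np.array([np.fft.rfft(w * x[i:i + n_fft]) for i in starts])

def istft_frames(X, hop=256):
    n_fft = (X.shape[1] - 1) * 2
    w = np.hanning(n_fft + 1)[:-1]
    length = (X.shape[0] - 1) * hop + n_fft        # span covered by all frames
    x = np.zeros(length)
    wsum = np.zeros(length)
    for k, spectrum in enumerate(X):
        i = k * hop
        x[i:i + n_fft] += w * np.fft.irfft(spectrum, n=n_fft)   # windowed overlap-add
        wsum[i:i + n_fft] += w ** 2
    nonzero = wsum > 1e-12                         # skip samples no window reaches
    x[nonzero] /= wsum[nonzero]
    return x

rng = np.random.default_rng(0)
signal = rng.standard_normal(16384)
rebuilt = istft_frames(stft_frames(signal))
print(np.max(np.abs(rebuilt[1:] - signal[1:])))
```

In this sketch the very first sample cannot be recovered, because the window is exactly zero there and no other frame covers it. Samples near either end are divided by a very small window sum, so they carry somewhat more rounding error than the interior.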

Answer 2 (score: 3)

librosa.core.stft and istft look pretty similar to what I was looking for, though they didn't exist at the time:

  

librosa.core.stft(y, n_fft=2048, hop_length=None, win_length=None, window=None, center=True, dtype=<type 'numpy.complex64'>)

They don't invert exactly, though; the ends are tapered.

Answer 3 (score: 1)

Found another STFT, but no corresponding inverse function:

http://code.google.com/p/pytfd/source/browse/trunk/pytfd/stft.py

def stft(x, w, L=None):
    ...
    return X_stft
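  • w is a window function, given as an array
  • L is the overlap, in samples

Answer 4 (score: 1)

None of the above answers worked well out of the box for me, so I modified Steve Tjoa's.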

import numpy as np

def stft(x, fs, framesz, hop):
    """
     x - signal
     fs - sample rate
     framesz - frame size
     hop - hop size (frame size = overlap + hop size)
    """
    framesamp = int(framesz*fs)
    hopsamp = int(hop*fs)
    w = np.hamming(framesamp)
    X = np.array([np.fft.fft(w*x[i:i+framesamp])
                  for i in range(0, len(x)-framesamp, hopsamp)])
    return X

def istft(X, fs, T, hop):
    """ T - signal length """
    length = int(T*fs)
    x = np.zeros(length)
    framesamp = X.shape[1]
    hopsamp = int(hop*fs)
    for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)):
        x[i:i+framesamp] += np.real(np.fft.ifft(X[n]))
    # calculate the inverse envelope to scale results at the ends.
    env = np.zeros(length)
    w = np.hamming(framesamp)
    for i in range(0, len(x)-framesamp, hopsamp):
        env[i:i+framesamp] += w
    rem = length % hopsamp
    if rem:  # guard: a slice of [-0:] would select the whole array
        env[-rem:] += w[-rem:]
    env = np.maximum(env, .01)
    return x/env # right side is still a little messed up...

Answer 5 (score: 0)

I also found this on GitHub, but it seems to operate on pipelines instead of normal arrays:

http://github.com/ronw/frontend/blob/master/basic.py#LID281

def STFT(nfft, nwin=None, nhop=None, winfun=np.hanning):
    ...
    return dataprocessor.Pipeline(Framer(nwin, nhop), Window(winfun),
                                  RFFT(nfft))


def ISTFT(nfft, nwin=None, nhop=None, winfun=np.hanning):
    ...
    return dataprocessor.Pipeline(IRFFT(nfft), Window(winfun),
                                  OverlapAdd(nwin, nhop))

Answer 6 (score: 0)

I think scipy.signal has what you are looking for. It has reasonable defaults, supports multiple window types, etc.

http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.signal.spectrogram.html

from scipy.signal import spectrogram
freq, time, Spec = spectrogram(signal)
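spectrogram only goes in the forward direction. SciPy releases newer than the 0.17.0 documentation linked above also provide scipy.signal.stft and scipy.signal.istft, which form an invertible pair. The following round trip is a hedged sketch rather than a recommendation: the 440 Hz test tone and the frame and overlap sizes are borrowed from the first answer's test, and the boundary and padding options are left at their defaults.

```python
import numpy as np
from scipy import signal

fs = 8000                                  # sample rate in Hz
t = np.arange(fs) / fs                     # one second of samples
x = np.sin(2 * np.pi * 440 * t)            # 440 Hz test tone

nperseg, noverlap = 400, 200               # 50 ms frames with 50% overlap

# Forward transform: frequencies, frame times, and the complex STFT matrix.
f, frame_times, Zxx = signal.stft(x, fs=fs, window='hann',
                                  nperseg=nperseg, noverlap=noverlap)

# Inverse transform with the same window parameters.
_, x_rec = signal.istft(Zxx, fs=fs, window='hann',
                        nperseg=nperseg, noverlap=noverlap)

x_rec = x_rec[:len(x)]                     # drop any padding added by stft
max_err = np.max(np.abs(x - x_rec))
print(max_err)
```

With a Hann window at 50% overlap the reconstruction error should sit at rounding level. Other window and overlap combinations may not be invertible, so it is worth checking the error the same way before relying on them.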

Answer 7 (score: 0)

A fixed version of basj's answer.

import numpy as np
import matplotlib.pyplot as plt

def stft(x, fftsize=1024, overlap=4):
    hop = fftsize//overlap
    w = np.hanning(fftsize+1)[:-1]      # better reconstruction with this trick +1)[:-1]
    return np.vstack([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)])

def istft(X, overlap=4):
    fftsize = (X.shape[1]-1)*2
    hop = fftsize//overlap
    w = np.hanning(fftsize+1)[:-1]
    rcs = int(np.ceil(float(X.shape[0])/float(overlap)))*fftsize   # output length in samples
    x = np.zeros(rcs)
    wsum = np.zeros(rcs)
    for frame, i in zip(X, range(0, len(X)*hop, hop)):
        l = len(x[i:i+fftsize])   # may be shorter than fftsize at the end of the buffer
        x[i:i+fftsize] += np.fft.irfft(frame).real[:l]   # overlap-add
        wsum[i:i+fftsize] += w[:l]
    pos = wsum != 0
    x[pos] /= wsum[pos]           # undo the analysis window
    return x

# Round trip on random data; plot the input and the reconstruction together.
a = np.random.random((65536))
b = istft(stft(a))
plt.plot(range(len(a)), a, range(len(b)), b)
plt.show()

Answer 8 (score: -3)

If you have access to a C binary library that does what you need, then use http://code.google.com/p/ctypesgen/ to generate a Python interface to that library.