Fast Fourier transform on a .wav file produces strange results

Asked: 2018-12-02 19:53:12

Tags: c++ sdl fft

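I am writing a frequency visualizer for .wav files. I use SDL to extract the data from the file so that I can run a fast Fourier transform and compute the magnitude of each frequency.

As far as I can tell, the input I give the FFT is correct, but the function spits out strange results and I don't know why.

When I run it without a window function, I get one correct result followed by about 5 incorrect results close to 20000 Hz. They appear seemingly at random, so I can't predict them. I suspected spectral leakage, so I tried applying a window function before the transform, but then the frequency is wrong on every iteration.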
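For reference, a Hann window over a full block of samples can be sketched as follows. This is only a minimal sketch: `makeHannWindow` and `applyWindow` are hypothetical helper names, not part of SDL or FFTW, and the periodic form (dividing by N rather than N - 1) is one common choice for spectral analysis. A window only reduces spectral leakage; it cannot correct input samples that were decoded incorrectly.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Builds a periodic Hann window of length n (denominator n, not n - 1).
// Hypothetical helper for illustration only.
std::vector<double> makeHannWindow(std::size_t n) {
    assert(n >= 2);
    const double pi = std::acos(-1.0);
    std::vector<double> w(n);
    for (std::size_t i = 0; i < n; ++i) {
        w[i] = 0.5 * (1.0 - std::cos(2.0 * pi * static_cast<double>(i) /
                                     static_cast<double>(n)));
    }
    return w;
}

// Multiplies every sample by its window coefficient, in place.
// Every element is processed, including the last one.
void applyWindow(std::vector<double>& samples, const std::vector<double>& window) {
    assert(samples.size() == window.size());
    for (std::size_t i = 0; i < samples.size(); ++i) {
        samples[i] *= window[i];
    }
}
```

With a 4096-point FFT the window would be built once with `makeHannWindow(4096)` and reused for every block, rather than recomputed per callback.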

Note: the output below was produced without the window function.

When I run it with this audio, https://www.youtube.com/watch?v=qNf9nzvnd1k, which is a test clip, I get:

30 20480 20480 20470 20490 20480 20480 20480 x 20470 20480 20480 20480 20480 20480 20480 20480 20480 x 20480 20480 20480 20480 20480 20480 20480 x 20480 20480 20480 20480 20480 20480 20480 20470 20480 20480 20480 20480 20480 20480 20470 20480 20480 20480 20480 20480 20480 20480 20480 20480 20480 20480 20480 x 20480 x 20470 x

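Each x here stands for a correct value, which changes over the course of the clip as expected.

My problem is that I need a correct value on every iteration; otherwise the animation looks sloppy.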
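One observation about the numbers above: every printed value is a multiple of 10. In the code below the frequency is computed as `max_magnitude_index * ( 44100 / 4096 )`, and `44100 / 4096` is an integer division that evaluates to 10. A reading of 20480 therefore corresponds to bin 2048, which is N/2 (the Nyquist bin) for a 4096-point FFT. A minimal sketch of the conversion in floating point, next to the truncating form for comparison; both function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>

// Frequency in Hz at the centre of FFT bin `bin`, for an FFT of `fftSize`
// points over audio sampled at `sampleRate` Hz: freq = bin * Fs / N.
double binToFrequency(std::size_t bin, double sampleRate, std::size_t fftSize) {
    assert(fftSize > 0);
    return static_cast<double>(bin) * sampleRate / static_cast<double>(fftSize);
}

// The same formula with the integer quotient evaluated first, as in
// `index * (44100 / 4096)`. The quotient truncates (44100 / 4096 == 10),
// so every result is a multiple of 10 and the error grows with the bin index.
int binToFrequencyTruncated(int bin, int sampleRate, int fftSize) {
    assert(fftSize > 0);
    return bin * (sampleRate / fftSize);
}
```

For bin 2048 the floating-point version gives 22050 Hz, while the truncating version gives 20480, the value that dominates the output above.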

main.cpp:

#include "visualization.h"
#define FILE_PATH "audiosamples/test.wav" //https://www.youtube.com/watch?v=qNf9nzvnd1k
#define PI 3.14159265359

/*
freq = i * Fs / N;      (1)
where,
freq = frequency in Hertz,
i = index (position of DFT output or can also think of it as representing the number of cycles)
Fs = sampling rate of audio,
N = size of FFT buffer or array.

To explain further, let's say that:

N = 4096          //a buffer that holds 4096 audio data samples
Fs = 44100       //a common sample rate [frames per sec] for audio signals: 44.1 kHz

The spectral bin numbers aka frequency bins using equation (1) from above would be:

    bin:      i      Fs         N            freq
     0  :     0  *  44100 /  4096  =        0.0 Hz
     1  :     1  *  44100 /  4096  =        10.77 Hz
     2  :     2  *  44100 /  4096  =        21.53 Hz
     3  :     3  *  44100 /  4096  =        32.30 Hz
     4  :     ...
     5  :     ...

   2048 :    2048 * 44100 /  4096  =        22.05 kHz
*/

SDL *visualization = nullptr;

Uint8* sampData;
SDL_AudioSpec wavSpec;
SDL_AudioSpec obtained;
Uint8* wavStart;
Uint32 wavLength;
SDL_AudioDeviceID aDevice;
double arrSamples[4096];
double max_magnitude_index;


struct AudioData {
    Uint8* filePosition;
    Uint32 fileLength;
};

void PlayAudioCallback(void* userData, Uint8* stream, int streamLength) {
    AudioData* audio = (AudioData*)userData;
    sampData = (Uint8*)stream;
    double magnitude[4096];
    fftw_complex x[4096], y[4096];
    double max_magnitude = 0;

    if (audio->fileLength == 0) {
        return;
    }

    Uint32 length = (Uint32)streamLength;
    length = (length > audio->fileLength ? audio->fileLength : length);
    std::vector<double> samples (stream, stream + length);

    for( int i = 0; i < 4095; i ++ ){
        double multiplier = 0.5 * (1 - cos((2*PI*i)/4095));
    //  x[i][REAL] = multiplier * (double)samples[i];
        x[i][REAL] = (double)samples[i];
        x[i][IMAG] = 0.0;
    //  std::cout << i << " - " << x[i][REAL] << std::endl << std::flush;
    }

    fftw_plan plan = fftw_plan_dft_1d( 4096, x, y,  FFTW_FORWARD, FFTW_ESTIMATE );
    fftw_execute(plan);

    for( int i = 0; i < 4095; i ++ ){
        magnitude[i] = sqrt( y[i][REAL] * y[i][REAL] + y[i][IMAG] * y[i][IMAG] );
    }
    for( int i = 1; i < 4095; i ++ ){
        if( magnitude[i] > max_magnitude ){
            max_magnitude = magnitude[i];
            max_magnitude_index = i;
        }
    }
    int freq = max_magnitude_index * ( 44100 / 4096 );
    if ( freq < 20000 ){
        std::cout << freq << std::endl << std::flush;
    }

//  SDL_memcpy(&in, sampData, sizeof(sampData));
    SDL_memcpy(stream, audio->filePosition, length);


    audio->filePosition += length;
    audio->fileLength -= length;

}

int main() {
    int cnt = 0;

    visualization = new SDL();
    visualization -> init( "asd", 100, 0, 800, 400, false );
    SDL_Init(SDL_INIT_AUDIO);

    if (SDL_LoadWAV(FILE_PATH, &wavSpec, &wavStart, &wavLength) == NULL) {
        std::cerr << "Couldnt load file: " << FILE_PATH << std::endl;
        getchar();
    }
    std::cout << "Loaded " << FILE_PATH << std::endl;

    AudioData audio;
    audio.filePosition = wavStart;
    audio.fileLength = wavLength;
    wavSpec.samples = 4096;

    wavSpec.callback = PlayAudioCallback;
    wavSpec.userdata = &audio;


    aDevice = SDL_OpenAudioDevice( NULL, 0, &wavSpec, &obtained, SDL_AUDIO_ALLOW_FREQUENCY_CHANGE);
    if (aDevice == 0) {
        std::cerr << "Audio Device connection failed: " << SDL_GetError() << std::endl;
        getchar();
    }
    SDL_PauseAudioDevice(aDevice, 0);

    std::cout << obtained.samples << std::endl << std::flush;

    while( visualization -> running () ){
        visualization -> handleEvents();
        visualization -> render();
        cnt ++;
        visualization -> update(max_magnitude_index, cnt);
    }

    visualization->clean();

    return 0;
}
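Two things in the callback above may explain the readings near 20 kHz. First, `stream` is the output buffer that SDL asks the callback to fill, so running the FFT on it before the `SDL_memcpy` analyzes whatever was already in that buffer; the audio about to be played is at `audio->filePosition`. Second, `std::vector<double> samples (stream, stream + length)` turns every byte into one sample, whereas a 16-bit WAV stores each sample in two bytes and interleaves the channels. The sketch below assumes signed 16-bit little-endian PCM (`AUDIO_S16LSB`), which the question does not state, so `wavSpec.format` and `wavSpec.channels` should be checked first. `decodeS16LE` and `peakBin` are hypothetical helpers. `peakBin` searches only the bins below N/2, because for real input the upper half of the spectrum mirrors the lower half.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes interleaved signed 16-bit little-endian PCM into samples in
// [-1, 1), keeping only the first channel of each frame.
// ASSUMPTION: the data is AUDIO_S16LSB; other formats need other decoding.
std::vector<double> decodeS16LE(const std::uint8_t* bytes, std::size_t byteCount,
                                int channels) {
    assert(channels >= 1);
    const std::size_t frameBytes = 2 * static_cast<std::size_t>(channels);
    std::vector<double> out;
    out.reserve(byteCount / frameBytes);
    for (std::size_t i = 0; i + frameBytes <= byteCount; i += frameBytes) {
        // Low byte first, then high byte; convert to a signed value by hand.
        int value = bytes[i] | (bytes[i + 1] << 8);
        if (value >= 32768) {
            value -= 65536;
        }
        out.push_back(value / 32768.0);
    }
    return out;
}

// Returns the index of the largest magnitude among bins 1 .. n/2 - 1,
// skipping the DC bin and the mirrored upper half of the spectrum.
std::size_t peakBin(const std::vector<double>& magnitude) {
    assert(magnitude.size() >= 4);
    std::size_t best = 1;
    for (std::size_t i = 2; i < magnitude.size() / 2; ++i) {
        if (magnitude[i] > magnitude[best]) {
            best = i;
        }
    }
    return best;
}
```

In the callback this might be used as `decodeS16LE(audio->filePosition, length, wavSpec.channels)`, with the first 4096 decoded samples copied into the FFT input so that every element, including index 4095, is initialized.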

0 answers:

No answers