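I downloaded Voicebox for MATLAB and want to obtain (1) the estimated noise spectrum for each frame (which is exactly what these functions output), (2) the overall SNR, and (3) the instantaneous a posteriori SNR of each frame, but the results are clearly wrong.
The results are strange: for example, the overall SNR comes out as 10 dB when my input is actually noisy speech at 0 dB SNR (Gaussian noise); when the true SNR is 5 dB, the overall SNR becomes 19 dB.
My code: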
[s,fs] = audioread('audiofile.wav');
s = awgn(s,0,'measured');
ninc=round(0.016*fs); % frame increment [fs=sample frequency]
ovf=2; % overlap factor
f = v_rfft( v_enframe(s, v_windows(2,ovf*ninc,'l'), ninc), ovf*ninc, 2 );
f = f.*conj(f); % noisy signals converted to power spectrum
% v_enframe splits the signal into overlapping windowed frames (one frame per row)
% to be honest I don't know why it doesn't just use "spectrogram" to get a STFT of noisy signals
% estimate the noise power spectrum
% one frame per row
x = v_estnoisem(f,ninc/fs);
% a list containing instant SNRs of each frame
instSNR = 10*log10(sum(f'.^2)./sum(x'.^2));
overallSNR = 10*log10(sum(sum(f.^2))/sum(sum(x.^2)))
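For reference, here is a sketch of the quantities I think I am after. It assumes that both f and x are power spectra with one frame per row, writing P_y(t,k) for the noisy power in frame t and bin k, and \hat{P}_n(t,k) for the estimated noise power. The symbols, the summation over bins for a per-frame value, and the subtraction in the overall SNR (which assumes speech and noise are uncorrelated) are my own assumptions, not something taken from the Voicebox documentation:

```latex
% Per-frame a posteriori SNR: noisy power over estimated noise power
\[
\gamma_t = \frac{\sum_k P_y(t,k)}{\sum_k \hat{P}_n(t,k)},
\qquad
\mathrm{SNR}_{\mathrm{inst}}(t) = 10\log_{10}\gamma_t \ \mathrm{dB}
\]
% Overall SNR: estimated speech power (noisy minus noise) over noise power
\[
\mathrm{SNR}_{\mathrm{overall}}
= 10\log_{10}
\frac{\sum_{t,k} P_y(t,k) - \sum_{t,k} \hat{P}_n(t,k)}
     {\sum_{t,k} \hat{P}_n(t,k)} \ \mathrm{dB}
\]
```

Under these definitions the ratios are taken between powers directly, with no further squaring of f or x.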
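I don't know whether I am misusing the toolbox, since it may contain obscure custom functions, or whether this is simply a problem with my code. I would be very grateful for any hints!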