Plotting the harmonic product spectrum in MATLAB

Time: 2013-11-19 04:04:06

Tags: matlab

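I am using the harmonic product spectrum (HPS) to find the fundamental note when several harmonics are present. This is the code I have implemented: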

[song,FS] = wavread('C major.wav');
%sound(song,FS);

P = 20000;
N=length(song);                     % length of song
t=0:1/FS:(N-1)/FS;                  % define time period

song = sum(song,2);
song=abs(song);

%----------------------Finding the envelope of the signal-----------------%
% Gaussian Filter
w = linspace( -1, 1, P);                      % create a vector of P values between -1 and 1 inclusive
sigma = 0.335;                                % standard deviation used in Gaussian formula
myFilter = -w .* exp( -(w.^2)/(2*sigma.^2));  % compute first derivative, but leave constants out
myFilter = myFilter / sum( abs( myFilter ) ); % normalize

% fft convolution
myFilter = myFilter(:);                         % create a column vector
song(length(song)+length(myFilter)-1) = 0;      %zero pad song
myFilter(length(song)) = 0;                     %zero pad myFilter

edges =ifft(fft(song).*fft(myFilter));
tedges=edges(P:N+P-1);                      % shift by P/2 so peaks line up w/ edges
tedges=tedges/max(abs(tedges));                 % normalize

%---------------------------Onset Detection-------------------------------%
% Finding peaks
maxtab = [];
mintab = [];
x = (1:length(tedges));
min1 = Inf;
max1 = -Inf;
min_pos = NaN; 
max_pos = NaN;

lookformax = 1;
for i=1:length(tedges)

    peak = tedges(i:i);
  if peak > max1, 
      max1 = peak;
      max_pos = x(i); 
  end
  if peak < min1, 
      min1 = peak;
      min_pos = x(i); 
  end

  if lookformax
    if peak < max1-0.07
      maxtab = [maxtab ; max_pos max1];
      min1 = peak; 
      min_pos = x(i);
      lookformax = 0;
    end  
  else
    if peak > min1+0.08
      mintab = [mintab ; min_pos min1];
      max1 = peak; 
      max_pos = x(i);
      lookformax = 1;
    end
  end
end


max_col = maxtab(:,1);
peaks_det = max_col/FS;
No_of_peaks = length(peaks_det);

[song,FS] = wavread('C major.wav');
song = sum(song,2);

%---------------------------Performing STFT--------------------------------%
h = 1;
%for i = 2:No_of_peaks

    song_seg = song(max_col(7-1):max_col(7)-1);
    L = length(song_seg); 
    NFFT = 2^nextpow2(L); % Next power of 2 from length of y
    seg_fft = fft(song_seg,NFFT);%/L;

    f = FS/2*linspace(0,1,NFFT/2+1);
    seg_fft_2 = 2*abs(seg_fft(1:NFFT/2+1));
    L5 = length(song_seg);

    figure(6)
    plot(f,seg_fft_2)
    %plot(1:L/2,seg_fft(1:L/2))
    title('Frequency spectrum of signal (seg_fft)')
    xlabel('Frequency (Hz)')
    xlim([0 2500])
    ylabel('|Y(f)|')
    ylim([0 500])

%----------------Performing Harmonic Product Spectrum---------------------%

   % In the harmonic product spectrum, you downsample the FFT data several times and multiply all of those with the original FFT data to get the maximum peak.
    %HPS
    seg_fft = seg_fft(1 : size(seg_fft,1)/2 ); 
    seg_fft = abs(seg_fft);
    a = length(seg_fft);


    seg_fft2 = ones(size(seg_fft));
    seg_fft3 = ones(size(seg_fft));
    seg_fft4 = ones(size(seg_fft));
    seg_fft5 = ones(size(seg_fft));



    for i = 1:((length(seg_fft)-1)/2)
        seg_fft2(i,1) = seg_fft(2*i,1);%(seg_fft(2*i,1) + seg_fft((2*i)+1,1))/2;
    end

     %b= size(seg_fft2)

    L1 = length(seg_fft2); 
    NFFT1 = 2^nextpow2(L1); % Next power of 2 from length of y


    f1 = FS/2*linspace(0,1,NFFT1/2+1);
    seg_fft12 = 2*abs(seg_fft2(1:NFFT1/2+1));

    figure(7);
    plot(f1,seg_fft12)
    title('Frequency spectrum of signal (seg_fft2)')
    xlabel('Frequency (Hz)')
    xlim([0 2500])
    ylabel('|Y(f)|')
    ylim([0 500])

This is the plot from figure 6: [image]

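So in practice, once I perform the HPS step (downsampling by 2), the peak at 440.1 should move down to 220, and the peak at 881 should move down to about 440. But that is not what I get when I plot it. This is the graph I get instead: [image]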

Why am I not getting the correct graph? I can't see what I am doing wrong here. Could someone take a look and let me know? Thanks.

1 answer:

Answer 0: (score: 1)

The problem with your downsampling is that you shrink the vector by a factor of 2 before downsampling instead of after. You did:

seg_fft = seg_fft(1 : size(seg_fft,1)/2 ); 

% [... other stuff ...]
for i = 1:((length(seg_fft)-1)/2)
    seg_fft2(i,1) = seg_fft(2*i,1);%(seg_fft(2*i,1) + seg_fft((2*i)+1,1))/2;
end

Instead, you need to downsample first and then trim:

for i = 1:((length(seg_fft)-1)/2)
    seg_fft2(i,1) = seg_fft(2*i,1);%(seg_fft(2*i,1) + seg_fft((2*i)+1,1))/2;
end

seg_fft = seg_fft(1 : size(seg_fft,1)/2 ); 
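One way to check where a peak ends up after downsampling by 2 is to try it on a synthetic signal whose peaks are known. The sketch below is only an illustration: the test signal, `FS`, `NFFT` and the names `mag2`, `f2`, `peakOrig` and `peakDown` are assumptions and not part of the original code. It also assumes that the downsampled spectrum is plotted with the bin spacing of the original FFT (`FS/NFFT`); if the frequency axis is rebuilt from the shorter vector's length instead, the peaks will appear not to move.

```matlab
% Synthetic check: a 440 Hz tone plus its 880 Hz harmonic (assumed test signal)
FS   = 8000;                          % sample rate in Hz (assumed)
NFFT = 8000;                          % gives 1 Hz per bin, for easy reading
t    = (0:NFFT-1)' / FS;
x    = sin(2*pi*440*t) + 0.5*sin(2*pi*880*t);

mag  = abs(fft(x, NFFT));
mag  = mag(1:NFFT/2);                 % keep the non-negative frequencies
df   = FS / NFFT;                     % bin spacing of the ORIGINAL spectrum

% Downsample by 2: output bin k (0-based) takes input bin 2k
mag2 = mag(1:2:end);
f2   = (0:length(mag2)-1)' * df;      % axis keeps the original bin spacing

[~, k1] = max(mag);
peakOrig = (k1 - 1) * df;             % strongest peak before downsampling
[~, k2] = max(mag2);
peakDown = f2(k2);                    % strongest peak after downsampling

figure; plot(f2, mag2); xlim([0 2500]); xlabel('Frequency (Hz)');
```

With this signal the strongest peak sits at 440 Hz before downsampling and at 220 Hz afterwards. Note that `mag(1:2:end)` starts at the DC bin, whereas the loop above reads `seg_fft(2*i,1)` and therefore starts one bin later.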

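Update: you asked why this does not preserve the peaks. The short answer is that you may not be "looking at" the peaks. If you want to preserve the (nearest) peak while downsampling by a factor of n, you can do the following: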

n = 3; % degree of decimation or downsampling we want to do
N = size(seg_fft, 1); % number of samples in original FFT
Nn = n * floor(N/n);  % number of samples that can be divided by n

fftBlock = reshape(seg_fft(1:Nn, 1), n, Nn/n); % n rows, one column per block of n samples
fftResampled = max(fftBlock);

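How does this work? Let's use a simple example with 10 x 1 points:

seg_fft = [0 1 10 5 4 3 6 12 4 3]';

We want "every third sample". The naive approach, seg_fft(1:3:end), gives

fftResampled = [0; 5; 6; 3];

but we would like to keep the peaks 10 and 12, and unfortunately they are not at the sampled positions.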

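After reshaping the array (dropping the last element; if it might hold an interesting value, we could instead append zeros to pad it out), we get:

fftBlock = [0  5  6;
            1  4 12;
           10  3  4];

(Remember that MATLAB stores matrices in column-major order, so reshape fills each column first.)

Now take max (which operates along the first dimension unless told otherwise) and you get

fftResampled = [10 5 12];

i.e. always the highest peak in each block. While this preserves the peaks, it does mean that your "valleys" get filled in a bit.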
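The block-maximum idea can be tried directly on the 10-point example. The following is a minimal sketch and not a definitive implementation; the variable names `trimmedMax`, `padLen`, `padded` and `paddedMax` are made up for illustration. It shows both options mentioned above: dropping the incomplete last block, or zero-padding so that no sample is lost.

```matlab
seg_fft = [0 1 10 5 4 3 6 12 4 3]';   % the 10 x 1 example vector
n = 3;                                 % downsampling factor

% Variant 1: drop the samples that do not fill a complete block
Nn = n * floor(numel(seg_fft) / n);    % largest multiple of n that fits
trimmedMax = max(reshape(seg_fft(1:Nn), n, []), [], 1);

% Variant 2: zero-pad up to the next multiple of n (illustrative alternative)
padLen = mod(-numel(seg_fft), n);      % zeros needed to complete the last block
padded = [seg_fft; zeros(padLen, 1)];
paddedMax = max(reshape(padded, n, []), [], 1);

disp(trimmedMax)                       % 10  5  12
disp(paddedMax)                        % 10  5  12  3
```

Passing the dimension explicitly, as in `max(..., [], 1)`, keeps the column-wise behaviour even when `n` is 1 and the reshaped block is a single row.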

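Conclusion: there is no way to downsample without destroying "some" information. After all, you are throwing away half of the samples. What you keep, and how you account for the information in the data you discard, is something only you can decide, because it depends on your application and on what matters to you.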
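For reference, the full harmonic product spectrum described in the question's code comment (downsample the magnitude spectrum several times and multiply the results with the original) could be sketched as follows. This is a hedged illustration on a synthetic signal, not the asker's or the answerer's code: the signal, `FS`, `NFFT`, the number of spectra `R` and the names `hps`, `dec`, `f0` and `rawPeak` are all assumptions. It uses plain every-r-th-sample decimation; the block maximum shown earlier could be substituted.

```matlab
% Assumed test signal: 220 Hz fundamental that is WEAKER than its 440 Hz harmonic
FS   = 8000;                           % sample rate in Hz (assumed)
NFFT = 8000;                           % 1 Hz per bin
t    = (0:NFFT-1)' / FS;
x    = 0.5*sin(2*pi*220*t) + 1.0*sin(2*pi*440*t) ...
     + 0.8*sin(2*pi*660*t) + 0.6*sin(2*pi*880*t);

mag = abs(fft(x, NFFT));
mag = mag(1:NFFT/2);                   % non-negative frequencies only

R = 3;                                 % number of spectra multiplied together
K = floor(numel(mag) / R);             % bins available in every decimated copy
hps = mag(1:K);                        % start with the original spectrum
for r = 2:R
    dec = mag(1:r:end);                % bin k (0-based) of dec is bin r*k of mag
    hps = hps .* dec(1:K);             % multiply, trimmed to the common length
end

[~, kRaw] = max(mag);
rawPeak = (kRaw - 1) * FS / NFFT;      % strongest raw peak (the 440 Hz harmonic)
[~, kHps] = max(hps);
f0 = (kHps - 1) * FS / NFFT;           % HPS estimate of the fundamental
```

In this example the strongest peak of the raw spectrum is the 440 Hz harmonic, while the product peaks at 220 Hz, because only at that bin do the original and both decimated copies line up on a harmonic.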