Finding prominent frequencies from an FFT

Posted: 2013-09-15 14:13:34

Tags: matlab

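I am trying to use an STFT to find the prominent peaks in an audio signal (a piano recording). This is what I have done so far:

  1. Take the envelope of the time-domain signal.
  2. Find the peaks in the envelope signal and use them as note onsets.
  3. Perform an FFT on the samples between each pair of consecutive onsets.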

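Now that I have the FFTs, I want to find the peaks that correspond to the notes being played. However, for some segments the findpeaks function returns an empty matrix.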

clear all;
clear max;
clc;

[song,FS] = wavread('C major.wav');
sound(song,FS);

P = 20000;
N=length(song);                     % length of song
t=0:1/FS:(N-1)/FS;                  % define time period


song = sum(song,2);                        
song=abs(song);
%windowing = hamming(32768); %Windowing function

% Plot time domain signal
figure(1);
          subplot(2,1,1)
          plot(t,3*song)
          title('Wave File')
          ylabel('Amplitude')
          xlabel('Length (in seconds)')
          %ylim([-1.1 1.1])
          xlim([0 N/FS])

%----------------------Finding the envelope of the signal-----------------%
% Gaussian Filter
x = linspace( -1, 1, P);                      % create a vector of P values between -1 and 1 inclusive
sigma = 0.335;                                % standard deviation used in Gaussian formula
myFilter = -x .* exp( -(x.^2)/(2*sigma.^2));  % compute first derivative, but leave constants out
myFilter = myFilter / sum( abs( myFilter ) ); % normalize

% Plot Gaussian Filter
         subplot(2,1,2)       
         plot(myFilter)
         title('Edge Detection Filter')

% fft convolution
myFilter = myFilter(:);                         % create a column vector
song(length(song)+length(myFilter)-1) = 0;      %zero pad song
myFilter(length(song)) = 0;                     %zero pad myFilter
edges =ifft(fft(song).*fft(myFilter));

tedges=edges(P:N+P-1);                      % shift by P/2 so peaks line up w/ edges
tedges=tedges/max(abs(tedges));                 % normalize

%---------------------------Onset Detection-------------------------------%
% Finding peaks
maxtab = [];
mintab = [];
x = (1:length(tedges));
min1 = Inf;
max1 = -Inf;
min_pos = NaN; 
max_pos = NaN;

lookformax = 1;
for i=1:length(tedges)

    peak = tedges(i:i);
  if peak > max1, 
      max1 = peak;
      max_pos = x(i); 
  end
  if peak < min1, 
      min1 = peak;
      min_pos = x(i); 
  end

  if lookformax
    if peak < max1-0.01
      maxtab = [maxtab ; max_pos max1];
      min1 = peak; 
      min_pos = x(i);
      lookformax = 0;
    end  
  else
    if peak > min1+0.05
      mintab = [mintab ; min_pos min1];
      max1 = peak; 
      max_pos = x(i);
      lookformax = 1;
    end
  end
end
% % Plot song filtered with edge detector          
         figure(2)
         plot(1/FS:1/FS:N/FS,tedges)
         title('Song Filtered With Edge Detector 1')
         xlabel('Time (s)')
         ylabel('Amplitude')
         ylim([-1 1.1])
         xlim([0 N/FS])

         hold on;

         plot(maxtab(:,1)/FS, maxtab(:,2), 'ro')
         plot(mintab(:,1)/FS, mintab(:,2), 'ko')

max_col = maxtab(:,1);
peaks_det = max_col/FS; 
No_of_peaks = length(peaks_det);

song = detrend(song);
%---------------------------Performing FFT--------------------------------%
 for i = 2:No_of_peaks

    song_seg = song(max_col(i-1):max_col(i)-1);
%     song_seg = song(max_col(6):max_col(7)-1);
    L = length(song_seg);    
    NFFT = 2^nextpow2(L); % Next power of 2 from length of y

    seg_fft = fft(song_seg,NFFT);%/L;

    N=5;Fst1=50;Fp1=60; Fp2=1040; Fst2=1050;

%     d = fdesign.bandpass('N,Fst1,Fp1,Fp2,Fst2');
%     h = design(d);
%     seg_fft = filter(h, seg_fft);

%     seg_fft(1) = 0;
%     
    f = FS/2*linspace(0,1,NFFT/2+1);
    seg_fft2 = 2*abs(seg_fft(1:NFFT/2+1));
    L5 = length(song_seg);

    figure(1+i)
    plot(f,seg_fft2)
    title('Frequency spectrum of signal')
    xlabel('Frequency (Hz)')
    %xlim([0 2500])
    ylabel('|Y(f)|')
    ylim([0 300])

    %[B, IX] = sort(seg_fft2)

    %[points loc] = findpeaks(seg_fft);

    %STFT_out(:,i) = seg_fft2;

    %P=max(seg_fft2)
    [points, loc] = findpeaks(seg_fft2,'THRESHOLD',20)
 end

2 answers:

Answer 0 (score: 2)

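If you look at the documentation for findpeaks, the meaning of the threshold is:

  Specify the threshold height difference between a peak and its neighboring values as a positive real number. findpeaks returns only those peaks that exceed their immediate neighbors by at least the value of 'THRESHOLD'.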

So in the line

[points, loc] = findpeaks(seg_fft2,'THRESHOLD',20)

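the value 20 is probably too large. The algorithm does not select any maxima because the condition that a peak must stand above its neighboring points by delta(y) = 20 causes it to reject every candidate maximum.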

You may need to specify MINPEAKHEIGHT.
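
The difference between the two criteria can be sketched on synthetic data. This is only an illustration, not the asker's signal: it uses Python with scipy.signal.find_peaks, whose `threshold` and `height` arguments are assumed here to play the same roles as MATLAB's 'THRESHOLD' and 'MINPEAKHEIGHT'. The bump shape and the value 20 are made up for the demonstration.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic, smooth "spectrum": one broad bump of height 300 centred on bin 80.
# (Made-up data standing in for one zero-padded FFT magnitude segment.)
bins = np.arange(200)
spectrum = 300.0 * np.exp(-((bins - 80) ** 2) / (2 * 10.0 ** 2))

# threshold=20: a peak must exceed BOTH immediate neighbours by at least 20.
# On a smooth bump the neighbours are almost as tall as the peak itself,
# so no sample qualifies and the result is empty.
idx_threshold, _ = find_peaks(spectrum, threshold=20)

# height=20: a peak only has to reach an absolute level of at least 20.
idx_height, props = find_peaks(spectrum, height=20)

print("peaks with threshold=20:", idx_threshold)
print("peaks with height=20:   ", idx_height, props["peak_heights"])
```

The tall peak is rejected by the neighbor-difference criterion even though it is by far the largest value, which matches the empty result described in the question.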

Answer 1 (score: 0)

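If you are trying to find the peaks that mark note onsets, I suggest the following steps, which worked well for me when I was trying to find the onsets of beeps in a noisy video.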

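  1. Extract the signals with wavfile.read(audio_wav_path) and the corresponding wavfile.read(piano_note_wav_path).
  2. FFT-convolve the two signals using scipy.signal (this finds the regions where the audio and the note signal are similar).
  3. FFT-convolve the piano note with itself (this gives the ideal shape, i.e. what you would get if the audio were nothing but the piano note).
  4. Correlate the self-convolution (3) with the convolution of the two (2) (this finds the regions where (2) closely matches (3), which reduces false positives from similar notes).
  5. Convert to np arrays and threshold (so that only the genuinely high peaks remain), then normalize (to keep the numbers you are working with manageable).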

Once all of this is done, scipy.signal's find_peaks will give you the peaks, and from them you can work out the timestamps.

This worked for me because the peak amplitude was not consistent from one peak to the next. Hope this helps!
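
The steps above might be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the answerer's actual code: the two wavfile.read calls are replaced by synthetic arrays so the example runs on its own, and the sample rate, note shape, onset positions, the 0.5 height cutoff and the minimum peak distance are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.signal import fftconvolve, correlate, find_peaks

fs = 8000                               # assumed sample rate (Hz)
rng = np.random.default_rng(0)

# Step 1 (stand-in): instead of wavfile.read(...), build a synthetic "note"
# (decaying 440 Hz burst) and an "audio" track containing it twice.
m = 400                                 # note length in samples
t = np.arange(m) / fs
note = np.exp(-t / 0.0125) * np.sin(2 * np.pi * 440.0 * t)

n = 8000                                # one second of audio
true_onsets = [1000, 5000]              # hypothetical onset positions
audio = 0.02 * rng.standard_normal(n)   # background noise
audio[1000:1000 + m] += 1.0 * note
audio[5000:5000 + m] += 0.8 * note      # second note is quieter

# Step 2: FFT-convolve the audio with the note.
conv_mix = fftconvolve(audio, note, mode="full")

# Step 3: FFT-convolve the note with itself (the "ideal" shape).
conv_self = fftconvolve(note, note, mode="full")

# Step 4: correlate (2) with (3); in 'valid' mode index k is the lag at
# which the ideal shape is aligned to start at sample k of the audio.
corr = correlate(conv_mix, conv_self, mode="valid")

# Step 5: normalize, then keep only high peaks. Here the thresholding is
# done through find_peaks' height argument; distance=m keeps one peak per note.
score = corr / np.max(np.abs(corr))
peaks, props = find_peaks(score, height=0.5, distance=m)

timestamps = peaks / fs                 # sample index -> seconds
print("onset samples:", peaks)
print("onset times (s):", timestamps)
```

The `distance` argument is one way to collapse the oscillating side peaks around each onset into a single detection; a different spacing or cutoff may be needed for real recordings.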