Can the MFCC feature-extraction result matrix contain negative values?

Time: 2014-04-02 11:51:05

Tags: speech-recognition speech mfcc

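I am using MFCC feature extraction to implement a speech recognizer, and I am stuck on the HMM part. I am using Kevin Murphy's toolbox for the HMM. My MFCC result matrix contains negative values. Is that to be expected, or is my MFCC code wrong? Here is the error I get: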

Attempted to access obsmat(:,-39.5403); index must be a positive integer or logical.

Error in multinomial_prob (line 19)
  B(:,t) = obsmat(:, data(t));

Error in dhmm_em>compute_ess_dhmm (line 103)
 obslik = multinomial_prob(obs, obsmat);

Error in dhmm_em (line 47)
 [loglik, exp_num_trans, exp_num_visits1, exp_num_emit] = ...

Error in speechreco (line 77)
[LL, prior2, transmat2, obsmat2] = dhmm_em(dtr{1}, prior, A, B, 'max_iter', 5);

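If anyone knows of links to MATLAB source code for HMMs, please share them; this is for my final project. I am trying to implement a speech recognizer and do not know what to do after extracting the feature vectors.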

Here is the entire MATLAB code (I am using Kevin Murphy's HMM toolkit; the error occurs in the dhmm_em function):

function []=speechreco()

vtr = {8}; fstr = {8}; nbtr = {8};
ctr = {8};

for i = 1:8

    % Read audio data from train folder for performing operations
    st=strcat('train\s',num2str(i),'.wav');
    [s1 , fs1 , nb1]=wavread(st);  %st is filename; s1 is sample data, fs1 is sample rate in hertz, nb1 is number of bits per sample
    vtr{i} = s1; fstr{i} = fs1; nbtr{i} = nb1;

    ctr{i} = mfcc(vtr{i},fstr{i});

end


display(ctr{1}); %MFCC matrix 20*129

W1 = transpose(ctr{1});

ch1=menu('Mel Space:','Signal 1','Signal 2','Signal 3',...
                        'Signal 4','Signal 5','Signal 6','Signal 7','Signal 8','Exit');
                    if ch1~=9
                        plot(linspace(0, (fstr{ch1}/2), 129), (melfb(20, 256, fstr{ch1})));
                        title('Mel-Spaced-Filterbank');
                        xlabel('Frequency[Hz]');
                    end


%error is here
[LL, prior2, transmat2, obsmat2] = dhmm_em(ctr{1}, prior, A, B, 'max_iter', 5);
plot(LL());

end

%%mfcc
%old one MFCC now
function r = mfcc(s, fs)
m = 100;
n = 256;
frame=blockFrames(s, fs, m, n); %power spectra obtained 
m = melfb(20, n, fs);
n2 = 1 + floor(n / 2);
z = m * abs(frame(1:n2, :)).^2; %apply triangular mel filterbank to the power spectrum
r = dct(log(z));  %take log and then the dct conversion 
end



%% blockFrames Function
% blockFrames: Puts the signal into frames
%
% Inputs: s contains the signal to analyze
% fs is the sampling rate of the signal
% m is the distance between the beginnings of two frames
% n is the number of samples per frame
%
% Output: M3 is a matrix containing all the frames

function M3 = blockFrames(s, fs, m, n)
l = length(s);
nbFrame = floor((l - n) / m) + 1;
for i = 1:n
    for j = 1:nbFrame
        M(i, j) = s(((j - 1) * m) + i); %#ok<AGROW>
    end
end
h = hamming(n);
M2 = diag(h) * M;
for i = 1:nbFrame
    M3(:, i) = fft(M2(:, i)); %#ok<AGROW>
end
end
%--------------------------------------------------------------------------

function m = melfb(p, n, fs)  %used for graph plot of power spectra
% MELFB Determine matrix for a mel-spaced filterbank 
% 
% Inputs: p number of filters in filterbank 
% n length of fft 
% fs sample rate in Hz 
% 
% Outputs: m a (sparse) matrix containing the filterbank amplitudes 
% size(m) = [p, 1+floor(n/2)] 
% 
% Usage: For example, to compute the mel-scale spectrum of a 
% column-vector signal s, with length n and sample rate fs: 
% 
% f = fft(s); 
% m = melfb(p, n, fs); 
% n2 = 1 + floor(n/2); 
% z = m * abs(f(1:n2)).^2; 
% 
% z would contain p samples of the desired mel-scale spectrum 
%%%%%%%%%%%%%%%%%% 
%
f0 = 700 / fs; 
fn2 = floor(n/2); 
lr = log(1 + 0.5/f0) / (p+1); 
% convert to fft bin numbers with 0 for DC term 
bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1)); 
b1 = floor(bl(1)) + 1; 
b2 = ceil(bl(2)); 
b3 = floor(bl(3)); 
b4 = min(fn2, ceil(bl(4))) - 1; 
pf = log(1 + (b1:b4)/n/f0) / lr; 
fp = floor(pf); 
pm = pf - fp; 
r = [fp(b2:b4) 1+fp(1:b3)]; 
c = [b2:b4 1:b3] + 1; 
v = 2 * [1-pm(b2:b4) pm(1:b3)]; 
m = sparse(r, c, v, p, 1+fn2); 
end
%----------------------------------------------------------------------

1 Answer:

Answer 0 (score: 2)

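The error has nothing to do with negative values in the MFCCs; MFCC values can be negative. The error says that the index into obsmat is a floating-point value, which means you constructed obsmat incorrectly: it has the wrong type, and your values and indices are in the wrong places. You need to share the entire code you wrote to show the error, not just the line where you call the HMM training.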
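To see why negative cepstral values are normal, here is a minimal sketch of the last two MFCC steps (log, then DCT) on a single frame. The filterbank energies in z are made-up numbers, not taken from the question's data, and the DCT-II is written out by hand without normalization so that no toolbox is needed; the toolbox dct scales its output differently, but the signs behave the same way.

```matlab
% Hypothetical mel filterbank energies for one frame (made-up values).
% Energies are always positive, but many of them are smaller than 1.
z = [0.002; 0.5; 3.0; 0.04];

% The log of any value below 1 is negative.
logz = log(z);
N = numel(z);

% Unnormalized DCT-II, written out explicitly:
%   c(k) = sum over n of logz(n) * cos(pi*(k-1)*(2n-1)/(2N)),  k = 1..N
[n, k] = meshgrid(1:N, 1:N);            % n varies along columns, k along rows
C = cos(pi * (k - 1) .* (2 * n - 1) / (2 * N));
c = C * logz;                           % cepstral coefficients for this frame

disp(logz.');                           % log energies, several below zero
disp(c.');                              % coefficients with mixed signs
any_negative = any(c < 0);
```

So positive energies still give negative coefficients: the log maps anything below 1 to a negative number, and the cosine basis mixes signs on top of that.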

Looking at your code, I think you probably need to call dhmm_em with ctr rather than ctr{1}.
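The traceback line B(:,t) = obsmat(:, data(t)) shows that each observation is used directly as a column index into obsmat, so the discrete HMM routine needs positive integer symbols, not raw real-valued coefficients such as -39.5403. One common way to get such symbols is vector quantization. The following is only a sketch of that idea under stated assumptions, not the asker's or the toolbox's method: the feature matrix X and the codebook are hypothetical, and a real codebook would normally be learned (for example with k-means) from all training frames.

```matlab
% Hypothetical feature matrix: 3 coefficients x 6 frames (made-up values).
X = [-39.5  -38.9   2.1   2.4  10.2   9.8;
       1.2    1.0  -4.3  -4.1   0.3   0.5;
      -0.7   -0.9   6.6   6.2  -2.0  -2.2];

% Hypothetical codebook: simply frames 1, 3 and 5 of X.
% In practice the codewords would be learned from the training data.
codebook = X(:, [1 3 5]);
K = size(codebook, 2);                  % number of discrete symbols
T = size(X, 2);                         % number of frames

% Squared Euclidean distance from every codeword to every frame (K x T).
d = zeros(K, T);
for j = 1:K
    diffs = X - repmat(codebook(:, j), 1, T);
    d(j, :) = sum(diffs .^ 2, 1);
end

% Each frame is replaced by the index of its nearest codeword,
% which is an integer in 1..K and therefore a valid column index.
[~, symbols] = min(d, [], 1);
disp(symbols);
```

The resulting row vector of integers is the kind of sequence that can index the columns of an observation matrix with K columns, whereas the raw MFCC columns cannot.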