绘制.wav文件的音量 - 时间图

时间:2015-07-13 10:41:09

标签: matlab signal-processing audio-processing

我正在尝试获取.wav文件的音量时间图。首先,我通过android作为.wav文件录制声音(患者呼气),但是当我在.wav中读取此MATLAB文件时,它具有负值。负值是什么意思?其次,MATLAB专家可以请您检查下面的代码是否与我的评论中写的相同?另一个问题。 Y = fft(WindowArray); p = abs(Y).^2;  我接受了从fft返回的值的力量......这是正确的,这一步的目标是什么?

[data, fs] =  wavread('newF2');
% read exhalation audio wav  file (1 channel, mono)
% frequency is 44100 HZ
% windows of 0.1 s and overlap of 0.05 seconds
WINDOW_SIZE = fs*0.1; %4410 = fs*0.1
array_size = length(data); % array size of data
numOfPeaks = (array_size/(WINDOW_SIZE/2)) - 1;
step = floor(WINDOW_SIZE/2); %step size used in loop
transformed = data;
start =1;
k = 1;
t = 1;
g = 1;
o = 1;
% performing fft on each window and finding the peak of windows 
while(((start+WINDOW_SIZE)-1)<=array_size) 
    j=1;
    i =start;
    while(j<=WINDOW_SIZE)
        WindowArray(j) = transformed(i);
        j = j+1;
        i = i +1;
    end
    Y = fft(WindowArray);
    p = abs(Y).^2; %power
      [a, b] = max(abs(Y)); % find max a and its indices b
      [m, i] = max(p); %the maximum of the power m and its indices i
      maximum(g) = m;
      index(t) = i;
      power(o) = a;
      indexP(g) = b;
      start = start + step;
      k = k+1;
      t = t+1;
      g = g+1;
      o=o+1;  
  end
% low pass filter 
% filtering noise: ignor frequencies that are less than 5% of maximum frequency
for u=1:length(maximum)
    M = max(maximum); %highest value in the array
    Accept = 0.05* M;
    if(maximum(u) > Accept)
        maximum = maximum(u:length(maximum));
        break;
    end
end
% preparing the time of the graph, 
% Location of the Peak flow rates are estimated
TotalTime = (numOfPeaks * 0.1);
time1 = [0:0.1:TotalTime];
if(length(maximum) > ceil(numOfPeaks));
maximum = maximum(1:ceil(numOfPeaks)); 
end
time = time1(1:length(maximum));
% plotting frequency-time graph
figure(1);
plot(time, maximum);
ylabel('Frequency');
xlabel('Time (in seconds)');
% plotting volume-time graph
figure(2);
plot(time, cumsum(maximum)); % integration over time to get volume 
ylabel('Volume');
xlabel('Time (in seconds)');

2 个答案:

答案 0 :(得分:0)

(我只回答我理解的问题部分)

默认情况下,Matlab将音频波规范化为-1 ... 1范围。如果需要整数数据,请使用native选项。

答案 1 :(得分:0)

First, in your code it should be p = abs(Y)**2, this is the proper way to square the values returned from the FFT. The reason why you take the absolute value of the FFT return values is because those number are complex numbers with a Real and Imaginary part, therefore the absolute value (or modulus) of an imaginary number is the magnitude of that number. The goal of taking the power could be for potentially obtaining an RMS value (root mean squared) of your overall amplitude values, but you could also have something else in mind. When you say volume-time I assume you want decibels, so try something like this:

def plot_signal(file_name):

sampFreq, snd = wavfile.read(file_name)

snd = snd / (2.**15) #Convert sound array to floating point values 
                     #Floating point values range from -1 to 1

s1 = snd[:,0] #left channel

s2 = snd[:,1] #right channel

timeArray = arange(0, len(snd), 1)
timeArray = timeArray / sampFreq
timeArray = timeArray * 1000  #scale to milliseconds

timeArray2 = arange(0, len(snd), 1)
timeArray2 = timeArray2 / sampFreq
timeArray2 = timeArray2 * 1000  #scale to milliseconds

n = len(s1)
p = fft(s1) # take the fourier transform 

m = len(s2) 
p2 = fft(s2)

nUniquePts = ceil((n+1)/2.0)
p = p[0:nUniquePts]
p = abs(p)

mUniquePts = ceil((m+1)/2.0)
p2 = p2[0:mUniquePts]
p2 = abs(p2)

'''
Left Channel
'''
p = p / float(n) # scale by the number of points so that
             # the magnitude does not depend on the length 
             # of the signal or on its sampling frequency  
p = p**2  # square it to get the power 




# multiply by two (see technical document for details)
# odd nfft excludes Nyquist point
if n % 2 > 0: # we've got odd number of points fft
    p[1:len(p)] = p[1:len(p)] * 2
else:
    p[1:len(p) -1] = p[1:len(p) - 1] * 2 # we've got even number of points fft

plt.plot(timeArray, 10*log10(p), color='k')
plt.xlabel('Time (ms)')
plt.ylabel('LeftChannel_Power (dB)')
plt.show()

'''
Right Channel
'''
p2 = p2 / float(m) # scale by the number of points so that
             # the magnitude does not depend on the length 
             # of the signal or on its sampling frequency  
p2 = p2**2  # square it to get the power 




# multiply by two (see technical document for details)
# odd nfft excludes Nyquist point
if m % 2 > 0: # we've got odd number of points fft
    p2[1:len(p2)] = p2[1:len(p2)] * 2
else:
    p2[1:len(p2) -1] = p2[1:len(p2) - 1] * 2 # we've got even number of points fft


plt.plot(timeArray2, 10*log10(p2), color='k')
plt.xlabel('Time (ms)')
plt.ylabel('RightChannel_Power (dB)')
plt.show()

I hope this helps.