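Our program takes a sound file, splits it into frames, and uses the zero-crossing rate and short-time energy to find the voiced frames. At some point the algorithm has the IDs of all voiced frames. We want a good graphical representation of those frames, but we have not managed to mark the voiced frames on the plot of the original data. For now we can only display the voiced frames with the unvoiced ones removed (at the beginning of the plot):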
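But we would like to get something like this (made in a graphics editor), so that you can see the voiced frames on the plot of the original data:
(Image: what we want the plot to look like)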
Code:
close all; clear all;
% read sound
[data, fs] = audioread('shee_mono.wav');
% normalize data
data = data / max(abs(data)); % scale so the peak absolute amplitude is 1
f_d = 0.025; % frame duration in seconds
%[frames, ~] = vec2frames( data, Nw, Ns, 'rows', @hamming, false);
frames = framing(data, fs, f_d);
ZCR_values_per_frame = ZCR(frames, f_d, fs, data);
f_energy_vector = STECalc(frames);
ste_threshold = 0.01;
zcr_threshold = mean(ZCR_values_per_frame); %take average ZCR as threshold
voiced_id = find_voiced_id(ZCR_values_per_frame, f_energy_vector, zcr_threshold, ste_threshold);
unvoiced_id = 1:size(frames, 1); %all frame indices, 1...number of frames
unvoiced_id = setdiff(unvoiced_id, voiced_id); %keep only the frames that are not voiced
fr_unvoiced = frames(unvoiced_id,:);
data_unvoiced = reshape(fr_unvoiced',1,[]);
fr_voiced = frames(voiced_id,:);
data_voiced = reshape(fr_voiced',1,[]);
figure
plot(data); hold on;
%plot(data_unvoiced, 'b');
%plot(data_voiced, 'g');
sound(data_voiced, fs);
title ("Blue - original data, green - voiced areas after unvoiced deleted");
[ voiced_timing, unvoiced_timing ] = return_voiced_unvoiced_timings(voiced_id, unvoiced_id, f_d, frames);
P.S. Sorry for any mistakes. English is not my native language.
Answer 0 (score: 0)
You can build a time vector tv from the sampling frequency and the number of samples:
tv = (0:numel(data)-1)/fs;
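Then you can use voiced_id to pick out the part of the time vector that belongs to the voiced data. Judging from your GitHub repository, voiced_id holds frame indices rather than sample indices, so it has to be expanded to sample indices first. The lines below assume the frames tile the signal without overlap or padding, with size(frames, 2) samples per frame:
frame_len = size(frames, 2);
sample_id = reshape((voiced_id(:)' - 1) * frame_len + (1:frame_len)', 1, []);
tv_voice = tv(sample_id);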
Then plot using the time vectors as x-values (hold on keeps the first curve when the second is drawn):
plot(tv, data, 'b'); hold on;
plot(tv_voice, data_voiced, 'g');
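One caveat when plotting data_voiced against tv_voice: if the voiced frames are not contiguous, plot draws a straight line across every gap. A possible workaround is to keep a full-length copy of the signal that is NaN outside the voiced frames, since plot skips NaN values. The following is only a sketch on synthetic data. The signal, the frame length, and the contents of voiced_id are made up for illustration, and it assumes non-overlapping frames, which may not match what your framing function does.

```matlab
% Synthetic stand-in for the recording (assumption: not the asker's data).
fs = 8000;                                    % sample rate in Hz
t  = (0:fs-1) / fs;                           % one second of samples
data = 0.02 * sin(2*pi*50*t);                 % quiet background
data(2001:5000) = sin(2*pi*200*t(2001:5000)); % louder "voiced" burst

frame_len = round(0.025 * fs);   % 25 ms frames -> 200 samples, no overlap
voiced_id = [11:18, 21:25];      % hypothetical detector output (frame indices)

% Expand frame indices to sample indices: one column per voiced frame.
% Uses implicit expansion (MATLAB R2016b or later, or Octave).
sample_id = (voiced_id(:)' - 1) * frame_len + (1:frame_len)';
sample_id = sample_id(:)';

% Copy of the signal that is NaN everywhere except inside voiced frames.
voiced_overlay = nan(size(data));
voiced_overlay(sample_id) = data(sample_id);

% Plot both curves on the same time axis; NaN samples leave gaps.
tv = (0:numel(data)-1) / fs;     % time axis in seconds
figure;
plot(tv, data, 'b'); hold on;
plot(tv, voiced_overlay, 'g');
xlabel('Time (s)'); ylabel('Amplitude');
legend('original data', 'voiced frames');
```

With your own data you would drop the synthetic part and take frame_len from size(frames, 2) and voiced_id from find_voiced_id. If framing uses overlapping frames, the index arithmetic has to use the hop size instead of the frame length.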