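I am trying to port paulstretch to MATLAB/Octave: https://github.com/paulnasca/paulstretch_python
See the workflow below.
I can split the signal into frequencies, amplitudes, and phases, and then recombine them using the code below. What I am having trouble with is stepping through the window segments, overlapping them, and stretching the signal. Any ideas?
Example MATLAB/Octave code: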
freq=[0;0.534974;1.06995;1.60492;2.1399];
amp1=[3.94414e-19;1.20523e-05;5.08643e-06;4.22469e-05;3.04322e-05];
phase=[0;0.0546221;-1.11534;-2.4926;-2.55601];
a1=[freq,amp1,phase];
t_rebuilt=linspace(0,2*pi,8000);
sigcomb=zeros(1,length(t_rebuilt));
for kk=1:length(freq) % rebuild signal from collected frequencies, amplitudes, and phases
sigcomb=sigcomb+a1(kk,2)*cos((a1(kk,1))*t_rebuilt+(a1(kk,3)));
end
normalized=0.8*sigcomb/max(abs(sigcomb)); % scale peak to 0.8 to avoid clipping
wavwrite(normalized',8000,16,'/tmp/test.wav');
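The same sum of cosines can also be written without the loop. This is only a minimal sketch using the test data above; the names `sigcomb_vec`, `sigcomb_loop`, and `max_diff` are introduced here for illustration, and the loop version is repeated purely to check that both give the same result.

```octave
% Same test data as in the example above
freq  = [0; 0.534974; 1.06995; 1.60492; 2.1399];
amp1  = [3.94414e-19; 1.20523e-05; 5.08643e-06; 4.22469e-05; 3.04322e-05];
phase = [0; 0.0546221; -1.11534; -2.4926; -2.55601];
t_rebuilt = linspace(0, 2*pi, 8000);

% freq * t_rebuilt is a 5x8000 outer product (one row per component);
% bsxfun adds each component's phase to its row.
% Multiplying by the row vector amp1' forms the weighted sum over components.
sigcomb_vec = amp1' * cos(bsxfun(@plus, freq * t_rebuilt, phase));

% Loop version, kept only for comparison
sigcomb_loop = zeros(1, length(t_rebuilt));
for kk = 1:length(freq)
  sigcomb_loop = sigcomb_loop + amp1(kk) * cos(freq(kk) * t_rebuilt + phase(kk));
end

max_diff = max(abs(sigcomb_vec - sigcomb_loop));
fprintf('max difference between loop and vectorized result = %g\n', max_diff);
```

The vectorized form builds a matrix with one row per component, so it needs more memory than the loop when there are many components.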
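PS: This is only test data. To get an audible signal I would have to use many more data points, which would clutter the question.
My idea is to use a for loop to write the new signal out as 1-second wav files, however long the stretched file turns out to be, because that avoids array-size memory problems with longer files. Another program such as sox can then join the pieces together. I already know how to do that part.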
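A minimal sketch of how the 1-second chunks could be indexed, assuming a sample rate of 8000 Hz and a hypothetical output length of 3.5 seconds. The file name pattern and the variable names are illustrative only, and the actual synthesis and write calls are left as comments.

```octave
fs            = 8000;       % sample rate in Hz (assumed)
total_samples = 3.5 * fs;   % hypothetical length of the stretched output
chunk_len     = fs;         % one second of samples per chunk

n_chunks     = ceil(total_samples / chunk_len);
chunk_ranges = zeros(n_chunks, 2);

for c = 1:n_chunks
  first = (c - 1) * chunk_len + 1;
  last  = min(c * chunk_len, total_samples);   % the final chunk may be shorter
  chunk_ranges(c, :) = [first, last];
  fname = sprintf('/tmp/chunk_%04d.wav', c);   % hypothetical file name pattern
  % Synthesize only samples first:last here, then write them out, e.g.
  %   wavwrite(chunk_signal, fs, 16, fname);
  fprintf('%s: samples %d to %d\n', fname, first, last);
end
```

Because the windows overlap, a window can straddle a chunk boundary, so the part of a window that extends past `last` has to be carried over and added to the start of the next chunk.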
PS: I am using Octave 3.8.1, which should be compatible with MATLAB.
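Answer 0 (score: 3)
I tried to build a version without loops, using original.ogg from http://hypermammut.sourceforge.net/paulstretch/. I think this is a trade-off between speed and memory: if the input file is long, the version below may use a lot of memory.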
[d, fs, bps] = wavread ("original.wav");
printf ("Input duration = %.2f s\n", rows (d)/fs);
stretch = 8;
windowsize = round (0.25 * fs);
step = round ((windowsize/2)/stretch);
## original window
fwin = @(x) (1-x.^2).^1.25;
win = fwin (linspace (-1, 1, windowsize));
#win = hanning (windowsize)';
## build index
ind = (bsxfun (@plus, 1:windowsize, (0:step:(rows(d)-windowsize))'))';
cols_ind = columns(ind);
## Only use left channel
left_seg = d(:,1)(ind);
clear d ind;
## Apply window
left_seg = bsxfun (@times, left_seg, win');
## FFT
fft_left_seg = fft (left_seg);
clear left_seg
#keyboard
## overwrite phases with random phases (keep only the magnitudes)
fft_rand_phase_left = abs(fft_left_seg).*exp(1i*2*pi*rand(size(fft_left_seg)));
clear fft_left_seg;
ifft_left = ifft (fft_rand_phase_left);
clear fft_rand_phase_left;
## window again
ifft_left = bsxfun (@times, real(ifft_left), win');
## restore the windowed segments with half windowsize shift
restore_step = floor(windowsize/2);
ind2 = (bsxfun (@plus, 1:windowsize, (0:restore_step:(restore_step*(cols_ind-1)))'))';
left_stretched = sparse (ind2(:), repmat(1:columns (ind2), rows(ind2), 1)(:), real(ifft_left(:)), ind2(end, end), cols_ind);
clear ind2 ifft_left win;
left_stretched = full (sum (left_stretched, 2));
## normalize
left_stretched = 0.8 * left_stretched./max(abs(left_stretched));
printf ("Output duration = %.2f s\n", rows (left_stretched)/fs);
wavwrite (left_stretched, fs, bps, "stretched.wav");
system("aplay stretched.wav");
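For comparison, the other side of the speed and memory trade-off could look like the loop-based sketch below, which handles one window at a time and overlap-adds it into the output. It is not the code above rewritten line for line: it assumes a synthetic 440 Hz test tone instead of original.wav so that it runs without an input file, and the names `starts`, `hop_out`, and `out` are introduced here for illustration.

```octave
% Synthetic mono input (assumption, stands in for the left channel of a file)
fs = 8000;
d  = sin(2*pi*440*(0:fs-1)'/fs);            % 1 s tone as a column vector

stretch    = 8;
windowsize = round(0.25 * fs);
step       = round((windowsize/2)/stretch); % input hop between windows
hop_out    = floor(windowsize/2);           % output hop (half a window)

x   = linspace(-1, 1, windowsize)';
win = (1 - x.^2).^1.25;                     % same window shape as above

starts = 1:step:(numel(d) - windowsize + 1);
n_seg  = numel(starts);
out    = zeros(hop_out*(n_seg - 1) + windowsize, 1);

for k = 1:n_seg
  seg  = d(starts(k):starts(k) + windowsize - 1) .* win;     % analysis window
  spec = abs(fft(seg)) .* exp(1i*2*pi*rand(windowsize, 1));  % random phases
  seg  = real(ifft(spec)) .* win;                            % synthesis window
  pos  = (k - 1)*hop_out + 1;
  out(pos:pos + windowsize - 1) = out(pos:pos + windowsize - 1) + seg;  % overlap-add
end

out = 0.8 * out / max(abs(out));            % normalize peak to 0.8
fprintf('Input duration = %.2f s, output duration = %.2f s\n', numel(d)/fs, numel(out)/fs);
```

Only one window's worth of intermediate data is held at a time, so peak memory is dominated by the input and the output vector rather than by the segment matrix; the price is one small FFT per loop iteration instead of a single batched call.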