Problem: the computed pitch does not match the expected pitch. For example, the output is approximately 'D3', but the expected output is 'C5'.
library("tuneR")
library("seewave")
#0: Acquisition of sample sound
snd_smpl = readWave(paste("~/Music/sample/1980s-Casio-Celesta-C5.wav"),
from = 0, to = 1, units = "seconds")
dur_smpl = duration(snd_smpl)
len_smpl = length(snd_smpl)
#1 : Pre-Processing Stage
#1.1 : Application of Hanning Window
n = 1:len_smpl
han_win = 0.5-0.5*cos(2*pi*n/(len_smpl-1))
wind_sig = han_win*snd_smpl@left
#2.1 : Auto-Correlation Calculation
rev_wind_sig = rev(wind_sig) #Reversing the windowed signal
acorr_1 = convolve(wind_sig, rev_wind_sig, type = "open")
# Obtaining the 2nd half of the correlation, to simplify calculation
n = 2*len_smpl-1
acorr_2 = (1/len_smpl)*acorr_1[len_smpl:n]
#2.2 : Note Calculation
min_index = which.min(acorr_2)
print(min_index)
fs = 44100
fo = fs/min_index #To obtain fundamental frequency
print(fo)
print(notenames(noteFromFF(fo)))
> print(min_index)
[1] 37
> fs = 44100
> fo = fs/min_index
> print(fo)
[1] 1191.892
> print(notenames(noteFromFF(fo)))
[1] "d'''"
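The whole computation is done in the time domain. I am currently using autocorrelation as a basis for learning more about pitch detection and analysis. I analysed the same sample with Audacity, and it reported 'C5', so I would like to know where the problem lies. Can anyone help me find it?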
I also have a few minor but important doubts:
Answer 0: (score: 1)
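The whole analysis seems incorrect. You should not apply windowing in a time-domain analysis.
Here is a short solution in Python; you can treat it as pseudocode.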
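# %matplotlib inline  (IPython magic; only needed inside a Jupyter notebook)
from glob import glob

import numpy as np
from matplotlib.pyplot import plot, show, xlim, title, xlabel
from scipy.signal import correlate, find_peaks
from soundfile import read

name = glob('*wav')[0]    # first WAV file in the current directory
samples, fs = read(name)  # assumes a mono file (1-D array of samples)
corr = correlate(samples, samples)  # full autocorrelation, lags -(N-1)..(N-1)
corr = corr[corr.size // 2:]  # keep lags >= 0; integer division is required in Python 3
time = np.arange(corr.size) / float(fs)  # lag expressed in seconds
ind = find_peaks(corr[time < 0.002])[0]  # peaks at lags below 2 ms (above 500 Hz)
plot(time, corr)
plot(time[ind], corr[ind], '*')
xlim([0, 0.005])
title('Frequency = {} Hz'.format(1 / time[ind][0]))  # the first peak gives the period
xlabel('Time [Sec]')
show()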
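The lag-to-note arithmetic can also be checked without an audio file. What follows is a minimal, dependency-free sketch, not the asker's method and not tuneR's implementation. The names `autocorr`, `estimate_f0` and `note_name`, the 100 to 2000 Hz search range, and the synthetic 523.25 Hz sine (C5 in equal temperament with A4 = 440 Hz) standing in for the Casio sample are all assumptions made for illustration. The sketch picks the strongest autocorrelation peak after lag 0, whereas the code in the question takes `which.min`; a minimum of the autocorrelation marks where the signal is most anti-correlated with itself, not where it repeats.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def autocorr(x, lag):
    """Unnormalised autocorrelation of x at a single non-negative lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def estimate_f0(x, fs, fmin=100.0, fmax=2000.0):
    """Return fs / lag for the strongest autocorrelation peak in [fmin, fmax].

    fmin and fmax are illustrative defaults, not values from the question.
    """
    min_lag = max(1, int(fs / fmax))
    max_lag = min(int(fs / fmin), len(x) - 2)
    # Evaluate one extra lag on each side so every candidate has two neighbours.
    r = {k: autocorr(x, k) for k in range(min_lag - 1, max_lag + 2)}
    peaks = [k for k in range(min_lag, max_lag + 1)
             if r[k - 1] < r[k] >= r[k + 1]]
    if not peaks:
        raise ValueError("no autocorrelation peak in the search range")
    lag = max(peaks, key=r.get)  # lag in samples; lag 0 is excluded by min_lag
    return fs / lag


def note_name(freq, a4=440.0):
    """Nearest equal-tempered note in scientific pitch notation (A4 = 440 Hz)."""
    midi = round(69 + 12 * math.log2(freq / a4))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


# Synthetic stand-in for the sample: 4096 samples of a 523.25 Hz sine at 44.1 kHz.
fs = 44100
tone = [math.sin(2 * math.pi * 523.25 * n / fs) for n in range(4096)]

f0 = estimate_f0(tone, fs)
print(round(f0, 1), note_name(f0))   # 525.0 C5
print(note_name(44100 / 37))         # D6, the note implied by the lag of 37 in the question
```

With integer lags the resolution is coarse: at 44.1 kHz a C5 period is about 84.3 samples, so the estimate lands on 44100 / 84 = 525 Hz. Interpolating around the peak would refine it, which this sketch leaves out.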