For a project, I'm working on methods of pitch detection and have settled on using the harmonic product spectrum (HPS). I've been reading up on the theory, but as I have no background in music, I'm finding it a bit tricky to understand. I'm using Processing with the Minim library and stumbled across the code below online. I understand everything up to the downsampling part. I'm aware that one has to apply a Hann window, take the FFT, then downsample the spectrum by integer factors and multiply the downsampled copies together. I know this in theory, but I'm struggling to understand how the code below does it. Could someone shed light on how exactly this algorithm implements those steps, so that I can adapt it and actually understand what I'm doing? Thanks, I appreciate any feedback.
class PitchDetectorHPS{
  FFT fft;
  int sampleRate;
  int fftLength;     // number of spectrum bands (fft.specSize()), not the FFT time size
  int harmonicSize;  // number of harmonics multiplied together
  float[][] step;    // step[j-1][k] holds band k*j, i.e. the spectrum downsampled by j
  PitchDetectorHPS(int fftLength,int sampleRate,int harmonicSize){
    fft=new FFT(fftLength,sampleRate);
    this.sampleRate=sampleRate;
    //this.fftLength=fftLength;
    this.fftLength=fft.specSize();
    this.harmonicSize=harmonicSize;
    step=new float[harmonicSize][];
    for(int i=0;i<harmonicSize;i++){
      step[i]=new float[this.fftLength];
    }
    //fft.window(fft.HAMMING);
  }
  float detect(float[] frame){
    fft.forward(frame);
    //downsample: row j-1 keeps every j-th band of the spectrum
    for(int i=0;i<fftLength;i++){
      for(int j=1;j<=harmonicSize;j++){
        if(i%j==0)( step[j-1])[i/j]=fft.getBand(i);
      }
    }
    //HPS: multiply the downsampled spectra band by band and track the largest product
    int index=0;
    float max;
    max=0;
    float tmp;
    for(int i=0;i<fftLength;i++){
      tmp=1;
      for(int j=0;j<harmonicSize;j++){
        tmp*=step[j][i];
      }
      //println(tmp);
      if(tmp>max){
        max=tmp;
        index=i;
      }
    }
    if(index==0)return 1.0;
    //convert the winning band index to a frequency in Hz
    return index*fft.getBandWidth();
  }
}
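As far as I can tell from reading the loops, `step[j-1][k]` ends up holding band `k*j`, so the product computed for band `k` is band(k) × band(2k) × ... × band(harmonicSize·k). Here is a minimal sketch of that same downsample-and-multiply step in plain Java, working on a magnitude spectrum that has already been computed. The class `HpsSketch` and its method names are made up for illustration and are not part of Minim.

```java
class HpsSketch {
    // Harmonic product spectrum: hps[k] = mag[k] * mag[2k] * ... * mag[harmonics * k].
    // Only bins whose highest harmonic still lies inside the spectrum are evaluated.
    static double[] harmonicProduct(double[] mag, int harmonics) {
        int usable = (mag.length - 1) / harmonics + 1;
        double[] hps = new double[usable];
        for (int k = 0; k < usable; k++) {
            double product = 1.0;
            for (int h = 1; h <= harmonics; h++) {
                product *= mag[k * h]; // "downsampling by h" is just reading every h-th bin
            }
            hps[k] = product;
        }
        return hps;
    }

    // Index of the largest HPS value, skipping bin 0 (the DC component).
    // Assumes hps has at least two entries.
    static int peakBin(double[] hps) {
        int best = 1;
        for (int k = 2; k < hps.length; k++) {
            if (hps[k] > hps[best]) {
                best = k;
            }
        }
        return best;
    }
}
```

Multiplying the peak bin by the bin width (sample rate divided by FFT length) would then give the frequency estimate, which is what the last line of `detect` does with `fft.getBandWidth()`.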
ETA: I've actually tried to implement it, but I'm getting repeated values of 86.13281 when using an electronic song. I'm guessing this isn't right. When using a series of flute notes, I get correct frequencies initially, but never a frequency above 1 kHz.
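ETA: OK, so I've been experimenting some more. I tested it using this video: https://www.youtube.com/watch?v=d7lJ_nyCDQA
The frequencies in the video are 602, 689, 732, 818, 904, 990, 1119, 2369.
The frequencies I get are
613, 689, 732, 807, 915, 1022, 1098, and it never goes beyond that. Everything is fine up until the higher frequencies. Can anyone offer some insight?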
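One thing that can be checked by hand is the bin arithmetic. The sketch below assumes a 44100 Hz sample rate, a 4096-sample FFT for the flute test, a 512-sample FFT for the electronic song, and a `harmonicSize` of 20; none of these settings are stated in this post, so treat them as guesses. It also assumes Minim reports `timeSize / 2 + 1` bands from `specSize()`. The class `HpsBinMath` is hypothetical.

```java
class HpsBinMath {
    // Width of one FFT bin in Hz: sampleRate / timeSize.
    static double bandWidth(double sampleRate, int timeSize) {
        return sampleRate / timeSize;
    }

    // Highest band index for which the posted downsample loop fills every row of step.
    // Above it, the last row keeps its default 0, so the product for that band is 0.
    static int highestUsableBin(int specSize, int harmonicSize) {
        return (specSize - 1) / harmonicSize;
    }

    public static void main(String[] args) {
        int timeSize = 4096;                 // assumed FFT length
        int specSize = timeSize / 2 + 1;     // assumed number of bands for that length
        double bw = bandWidth(44100.0, timeSize);
        int top = highestUsableBin(specSize, 20); // assumed harmonicSize
        System.out.println("bin width (Hz): " + bw);
        System.out.println("highest reachable frequency (Hz): " + top * bw);
        System.out.println("bin width for a 512-sample FFT (Hz): " + bandWidth(44100.0, 512));
    }
}
```

Under those assumptions the bin width is about 10.77 Hz and the highest reachable frequency is about 1098 Hz, which lines up with the last value in my list. A 512-sample FFT gives a bin width of about 86.13 Hz, so a repeated 86.13281 would correspond to band index 1. If the real settings differ, the same two formulas still apply with other numbers.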