Question

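So, I've been trying to find the best way to calculate the fundamental frequency of samples captured in real time with AudioRecord. I've looked at some examples here on SO: This one, and this one are the questions that helped me the most, but I still don't fully understand how they find the fundamental frequency. What I'm looking for is a more detailed explanation of what I need to do to find the fundamental frequency once I have the samples.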

So, I create an AudioRecord:

micData = new AudioRecord(audioSource, sampleRate, channel, encoding, bufferSize);
data = new short[bufferSize];

Start listening:

micData.startRecording();    
sample = micData.read(data,0,bufferSize);

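I understand how to create a complex array, but I don't know exactly which methods in FFT.java I should pass these values to in order to build the complex numbers, or which method returns the peak frequency.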

Answer 1

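Reading your question, I see that you are not yet sure you want to use FFT. That's good, because I don't recommend using FFT alone. Stay in the time domain and use autocorrelation or AMDF; if you want more accurate results, use FFT as an add-on.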
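On the question of how samples relate to complex numbers and to a peak frequency, here is a minimal sketch. It does not use FFT.java, whose methods are not shown here, so it makes no claim about that file's API. It uses a plain (slow) DFT instead, treating each real sample as a complex number with imaginary part 0. The class name SpectrumPeak, its methods, and the test tone are all hypothetical names introduced for illustration.

```java
/** Sketch only: find the strongest spectral bin of a 16-bit PCM frame with a naive DFT. */
final class SpectrumPeak {

    /** Returns the frequency in Hz of the largest-magnitude bin among bins 1 .. n/2 - 1. */
    static double peakFrequency(short[] pcm, int sampleRate) {
        int n = pcm.length;
        int bestBin = 1;
        double bestPower = -1.0;
        for (int k = 1; k < n / 2; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int i = 0; i < n; i++) {
                // Real input sample = complex value (pcm[i], 0).
                double angle = 2.0 * Math.PI * k * i / n;
                re += pcm[i] * Math.cos(angle);
                im -= pcm[i] * Math.sin(angle);
            }
            double power = re * re + im * im;
            if (power > bestPower) {
                bestPower = power;
                bestBin = k;
            }
        }
        // Bin k of an n-point transform corresponds to k * sampleRate / n Hz.
        return (double) bestBin * sampleRate / n;
    }

    /** Hypothetical test signal: a sine wave quantized to 16-bit PCM. */
    static short[] sine(double frequencyHz, int sampleRate, int length) {
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            double phase = 2.0 * Math.PI * frequencyHz * i / sampleRate;
            out[i] = (short) Math.round(10000.0 * Math.sin(phase));
        }
        return out;
    }

    public static void main(String[] args) {
        short[] frame = sine(441.0, 44100, 4096);
        System.out.println(peakFrequency(frame, 44100));
    }
}
```

The result can only be as fine as one bin width (sampleRate / n), and the strongest bin is not necessarily the fundamental, since a harmonic can carry more energy. That is consistent with the advice not to rely on FFT alone.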

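Here is the Java code I use to calculate the fundamental frequency. I wrote comments because you said you still don't understand the process.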

//Requires: import java.util.Arrays;
public double getPitchInSampleRange(AudioSamples as, int start, int end) throws Exception {
    //If your sound is a musical note or a voice, you can limit the search range, because the pitch won't be above 4500Hz or below 20Hz
    int nLowPeriodInSamples = (int) as.getSamplingRate() / 4500;
    int nHiPeriodInSamples = (int) as.getSamplingRate() / 20;

    //I get my sample values from my AudioSamples class. You can get them from wherever you want
    double[] samples = Arrays.copyOfRange((as.getSamplesChannelSegregated()[0]), start, end);
    if(samples.length < nHiPeriodInSamples) throw new Exception("Not enough samples");

    //Since we're looking for the periodicity in samples, there are no more candidate lags than the difference between the two period bounds
    double[] results = new double[nHiPeriodInSamples - nLowPeriodInSamples];

    //Now you iterate the time lag
    for(int period = nLowPeriodInSamples; period < nHiPeriodInSamples; period++) {
        double sum = 0;
        //Autocorrelation is multiplication of the original and time lagged signal values
        for(int i = 0; i < samples.length - period; i++) {
            sum += samples[i]*samples[i + period];
        }
        //find the average value of the sum
        double mean = sum / (double)samples.length;
        //and put it into results as a value for some time lag. 
        //You subtract the nLowPeriodInSamples for the index to start from 0.
        results[period - nLowPeriodInSamples] = mean;
    }
    //The mean is highest when the time lag equals the period of the signal, because in that case
    //most positive values are multiplied by other positive values and most negative values by other
    //negative values, so the products are positive and the sum is a large positive number. At half a period, for example,
    //autocorrelation multiplies negative values by positive ones, the products are negative, and the sum is low.
    //Start from negative infinity: Double.MIN_VALUE is the smallest positive double, so it would hide a non-positive maximum.
    double fBestValue = Double.NEGATIVE_INFINITY;
    int nBestIndex = -1; //the index is the time lag
    //So
    //The autocorrelation is highest at the periodicity of the signal
    //The periodicity of the signal can be transformed to frequency
    for(int i = 0; i < results.length; i++) {
        if(results[i] > fBestValue) {
            nBestIndex = i; 
            fBestValue = results[i]; 
        }
    }
    //Convert the period in samples to a frequency, and you have the fundamental frequency of the sound
    double res = (double) as.getSamplingRate() / (nBestIndex + nLowPeriodInSamples);

    return res;
}
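
The method above depends on an AudioSamples class that is not shown. As a rough sketch, assuming the input is a single-channel short[] frame like the one AudioRecord fills, the same search could look like this. The class name AutocorrelationPitch, its methods, and the synthetic test tone are hypothetical and exist only so that the example runs on its own.

```java
/** Sketch only: the autocorrelation search rewritten for a raw 16-bit PCM frame. */
final class AutocorrelationPitch {

    /** Estimates the pitch in Hz, searching periods that correspond to 4500 Hz down to 20 Hz. */
    static double estimatePitch(short[] pcm, int sampleRate) {
        int minLag = sampleRate / 4500; // shortest period, in samples
        int maxLag = sampleRate / 20;   // longest period, in samples
        if (minLag < 1) {
            throw new IllegalArgumentException("Sample rate too low");
        }
        if (pcm.length < maxLag) {
            throw new IllegalArgumentException("Not enough samples");
        }
        int bestLag = -1;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int lag = minLag; lag < maxLag; lag++) {
            double sum = 0.0;
            for (int i = 0; i < pcm.length - lag; i++) {
                // Product of the signal with a copy of itself shifted by 'lag' samples.
                sum += (double) pcm[i] * pcm[i + lag];
            }
            double mean = sum / pcm.length;
            if (mean > bestValue) {
                bestValue = mean;
                bestLag = lag;
            }
        }
        // A period of bestLag samples corresponds to sampleRate / bestLag Hz.
        return (double) sampleRate / bestLag;
    }

    /** Hypothetical test signal: a sine wave quantized to 16-bit PCM. */
    static short[] sine(double frequencyHz, int sampleRate, int length) {
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            double phase = 2.0 * Math.PI * frequencyHz * i / sampleRate;
            out[i] = (short) Math.round(10000.0 * Math.sin(phase));
        }
        return out;
    }

    public static void main(String[] args) {
        short[] frame = sine(441.0, 44100, 4096);
        System.out.println(estimatePitch(frame, 44100));
    }
}
```

With real microphone input, only the number of shorts returned by micData.read() are valid, so the frame passed in should be trimmed to that count. Because the lag is a whole number of samples, the estimate is quantized: at 44100 Hz, neighbouring lags near 100 samples are about 4 Hz apart.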

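You should also know that octave errors are common with the autocorrelation method, especially when the signal contains noise. In my experience, piano or guitar sounds are not a problem; errors are rare. But the human voice can be...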
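
AMDF (average magnitude difference function), the other time-domain option named above, can be sketched in the same shape: instead of multiplying the signal by a shifted copy and looking for a maximum, it averages the absolute differences and looks for a minimum. This is an illustrative sketch under the same assumptions as before (single-channel 16-bit PCM, 20 Hz to 4500 Hz search range); AmdfPitch and periodicTone are hypothetical names.

```java
/** Sketch only: pitch estimation with the average magnitude difference function (AMDF). */
final class AmdfPitch {

    /** Estimates the pitch in Hz as sampleRate divided by the lag with the smallest mean difference. */
    static double estimatePitch(short[] pcm, int sampleRate) {
        int minLag = sampleRate / 4500;
        int maxLag = sampleRate / 20;
        if (minLag < 1) {
            throw new IllegalArgumentException("Sample rate too low");
        }
        if (pcm.length < maxLag) {
            throw new IllegalArgumentException("Not enough samples");
        }
        int bestLag = -1;
        double bestValue = Double.POSITIVE_INFINITY;
        for (int lag = minLag; lag < maxLag; lag++) {
            int terms = pcm.length - lag;
            double sum = 0.0;
            for (int i = 0; i < terms; i++) {
                // Difference between the signal and a copy shifted by 'lag' samples.
                sum += Math.abs(pcm[i] - pcm[i + lag]);
            }
            double mean = sum / terms;
            // Strict '<' keeps the smallest lag when several lags tie.
            if (mean < bestValue) {
                bestValue = mean;
                bestLag = lag;
            }
        }
        return (double) sampleRate / bestLag;
    }

    /** Hypothetical test signal: one sine period of 'periodSamples' samples, repeated exactly. */
    static short[] periodicTone(int periodSamples, int length) {
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            double phase = 2.0 * Math.PI * (i % periodSamples) / periodSamples;
            out[i] = (short) Math.round(10000.0 * Math.sin(phase));
        }
        return out;
    }

    public static void main(String[] args) {
        short[] frame = periodicTone(100, 4096); // 100-sample period = 441 Hz at 44100 Hz
        System.out.println(estimatePitch(frame, 44100));
    }
}
```

For this exactly periodic tone the mean difference is zero at a lag of 100 samples, but it is also zero at 200, 300, and every other multiple of the period. Only the tie-breaking rule selects the true period here, which shows how a little noise can push a time-domain estimator one or more octaves too low.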

Android: Finding the fundamental frequency of audio input
