Question

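I am trying to do audio fingerprinting, and the first step is to read the audio and feed it to an FFT algorithm.

I use the javax.sound.sampled package to read and convert the data. What I read is signed PCM wave data whose values start at -128 at the beginning of the file, then go -127, -126, -125 when the wave is descending, or 127, 126, 125, ... when the wave is ascending.

Is this correct?

This is the code that reads the data and feeds it to the FFT: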

public static AudioInputStream getAudioDataBytes(String filename) throws IOException, UnsupportedAudioFileException {
        File file = new File(filename);
        AudioInputStream in= AudioSystem.getAudioInputStream(file);
        AudioFormat baseFormat = in.getFormat();
        AudioFormat convertFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 
                baseFormat.getSampleRate(), 16, 
                baseFormat.getChannels(), 
                baseFormat.getChannels() * 2,
                baseFormat.getSampleRate(),
                false);
        AudioInputStream din = AudioSystem.getAudioInputStream(convertFormat, in);
        //AudioFormat reconvertFormat = new AudioFormat(AudioFormat.Encoding.PCM_UNSIGNED, 11025, 8, 1, 2, 11025, false);
        AudioFormat reconvertFormat = new AudioFormat(44100, 8, 1, true, false);
        AudioInputStream din2 = AudioSystem.getAudioInputStream(reconvertFormat, din);
        // isConversionSupported expects (targetFormat, sourceFormat), in that order
        System.out.println("Conversion supported:" + AudioSystem.isConversionSupported(reconvertFormat, convertFormat));
        AudioSystem.write(din2, Type.WAVE, new File(filename + ".wav"));
        din2.close();
        din.close();
        in.close();

        return AudioSystem.getAudioInputStream(new File(filename + ".wav"));
    }

Then:

AudioInputStream ais = MP3Converter.getAudioDataBytes(inputFile);
int available = ais.available();
byte[] resultsdata = new byte[available];
System.out.println("Read whole file=" + (ais.read(resultsdata) == available));
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
IOUtils.write(resultsdata, outputStream);
Complex[][] results = new DataProcessor().makeFFT(outputStream);

And at some later point:

public Complex[][] makeFFT(ByteArrayOutputStream out) {
        byte audio[] = out.toByteArray();

        final int totalSize = audio.length;

        int amountPossible = totalSize/CHUNK_SIZE;

        //When turning into frequency domain we'll need complex numbers: 
        Complex[][] results = new Complex[amountPossible][];

        //For all the chunks: 
        for(int times = 0;times < amountPossible; times++) {
            Complex[] complex = new Complex[CHUNK_SIZE];
            for(int i = 0;i < CHUNK_SIZE;i++) {
                //Put the time domain data into a complex number with imaginary part as 0: 
                complex[i] = new Complex(audio[(times*CHUNK_SIZE)+i], 0);
            }
            //Perform FFT analysis on the chunk: 
            results[times] = FFT.fft(complex);
            System.out.println(Arrays.toString(complex));
        }
        return results;
    }

Can this signed data be fed to the FFT algorithm, or does it need unsigned data? (The FFT implementation can be found at https://introcs.cs.princeton.edu/java/97data/FFT.java.html)
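For reference, here is a minimal sketch of the two ways a single 8-bit PCM byte can be interpreted. The class and method names (`PcmSampleDemo`, `toCentered`) are hypothetical and not part of the code above. It assumes that a zero-centered signal is what should go into the real part of each `Complex`; unsigned 8-bit data sits around 128 rather than 0, and that constant offset would end up in the zero-frequency (DC) bin of the transform. Which interpretation applies depends on the encoding reported by the stream's `AudioFormat`.

```java
final class PcmSampleDemo {

    /**
     * Maps one 8-bit PCM byte to a zero-centered value in [-1.0, 1.0).
     * Hypothetical helper, shown only to illustrate signed vs. unsigned.
     */
    static double toCentered(byte raw, boolean signed) {
        // Signed: a Java byte already covers -128..127.
        // Unsigned: mask to 0..255, then subtract the midpoint 128.
        int value = signed ? raw : (raw & 0xFF) - 128;
        return value / 128.0;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0x80; // Java prints this byte as -128

        // Read as signed PCM: the most negative sample.
        System.out.println(toCentered(raw, true));  // prints -1.0

        // Read as unsigned PCM: the midpoint, i.e. silence.
        System.out.println(toCentered(raw, false)); // prints 0.0
    }
}
```

The same raw byte therefore means either full negative amplitude or silence, so it seems worth checking `ais.getFormat().getEncoding()` before deciding how to fill the `Complex` array.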

Is endianness a factor?
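To illustrate where byte order comes into play, the sketch below decodes 16-bit PCM into doubles with `java.nio.ByteBuffer`. The names `Pcm16Demo` and `decode16` are hypothetical. Byte order only matters when a sample spans more than one byte, so it would affect the intermediate 16-bit format above (created with `bigEndian` set to `false`) but not single-byte 8-bit samples.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class Pcm16Demo {

    /**
     * Decodes signed 16-bit PCM bytes into values in [-1.0, 1.0).
     * Hypothetical helper; any trailing odd byte is ignored.
     */
    static double[] decode16(byte[] pcm, boolean bigEndian) {
        ByteBuffer buf = ByteBuffer.wrap(pcm)
                .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            // getShort() combines two bytes according to the buffer's byte order.
            samples[i] = buf.getShort() / 32768.0;
        }
        return samples;
    }

    public static void main(String[] args) {
        byte[] pcm = {0x00, 0x40}; // one 16-bit sample

        // Little-endian: 0x4000 = 16384
        System.out.println(decode16(pcm, false)[0]); // prints 0.5

        // Big-endian: 0x0040 = 64
        System.out.println(decode16(pcm, true)[0]);  // prints 0.001953125
    }
}
```

The same two bytes give very different amplitudes depending on the order, so for multi-byte samples the flag passed here should match `AudioFormat.isBigEndian()` of the stream being read.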

Expected input format of the FFT algorithm
