How do I analyze and display the spectrum of audio samples from ExoPlayer using JTransforms?

Time: 2018-07-26 08:41:08

Tags: android signal-processing

Currently, I'm using JTransforms to perform an FFT on audio samples that I copy out of ExoPlayer's MediaCodecAudioRenderer while it plays an RTMP MediaSource.

What I get are ByteBuffers of length 4096 (or 4608... yes, some MP3s strangely have non-power-of-2 sample sizes, and I don't know why). That's what I'm supposed to feed into the FloatFFT_1D object, right?
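As an aside: JTransforms accepts arbitrary transform lengths (it falls back to mixed-radix and Bluestein algorithms internally), so 4608 is legal, but power-of-two sizes are the fast path. One common workaround, sketched here as an option rather than as part of the original code, is to zero-pad each sample block up to the next power of two before transforming:

```java
import java.util.Arrays;

// Sketch: zero-pad a PCM sample block to the next power of two,
// since power-of-two lengths are the fastest case for FFT libraries.
public class Pad {
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    static float[] zeroPad(float[] samples) {
        // copyOf fills the extra tail entries with 0.0f
        return Arrays.copyOf(samples, nextPowerOfTwo(samples.length));
    }

    public static void main(String[] args) {
        float[] block = new float[4608 / 2]; // 2304 16-bit samples from a 4608-byte buffer
        float[] padded = zeroPad(block);
        System.out.println(padded.length);   // 4096
    }
}
```

Zero-padding doesn't add information; it just interpolates the spectrum onto a denser frequency grid, so the bin-to-frequency formula becomes sampleRate * bin / paddedLength.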

Right now my code looks like this:

crf.setHook((dupe, format) -> {
    if (currentMediaSource == mMediaSourceAudio) {
        byte[] data = new byte[dupe.limit()];
        dupe.position(0);
        dupe.get(data);
        if (format != null) {
            new FFTTask(data, format).execute();
        }
        Log.i("straight_from_renderer", data.length + " " + format);
    }
});

...

private class FFTTask extends AsyncTask<Void, Void, float[]> {
    byte[] bufferContents;
    MediaFormat format;

    FFTTask(byte[] samples, MediaFormat format) {
        this.bufferContents = samples;
        this.format = format;
    }

    float[] floatMe(short[] pcms) {
        float[] floaters = new float[pcms.length];
        for (int i = 0; i < pcms.length; i++) {
            floaters[i] = pcms[i];
        }
        return floaters;
    }

    short[] shortMe(byte[] bytes) {
        short[] out = new short[bytes.length / 2]; // will drop last byte if odd number
        ByteBuffer bb = ByteBuffer.wrap(bytes);
        for (int i = 0; i < out.length; i++) {
            out[i] = bb.getShort();
        }
        return out;
    }

    float[] directFloatMe(byte[] bytes) {
        float[] out = new float[bytes.length / 2]; // will drop last byte if odd number
        ByteBuffer bb = ByteBuffer.wrap(bytes);
        for (int i = 0; i < out.length; i++) {
            out[i] = bb.getFloat();
        }
        return out;
    }

    private double db2(double r, double i, double maxSquared) {
        return 5.0 * Math.log10((r * r + i * i) / maxSquared);
    }

    double[] convertToDb(double[] data, double maxSquared) {
        data[0] = db2(data[0], 0.0, maxSquared);
        int j = 1;
        for (int i = 1; i < data.length - 1; i += 2, j++) {
            data[j] = db2(data[i], data[i + 1], maxSquared);
        }
        data[j] = data[0];
        return data;
    }

    @Override
    protected float[] doInBackground(Void... voids) {
        // WARNING: bufferContents is from a 2-channel 48k bitrate audio, so convert to mono first?
        /*
        byte[] oneChannel = new byte[bufferContents.length / 2];
        for (int i = 0; i < oneChannel.length; i += 2) {
            oneChannel[i] = bufferContents[i * 2 + 2];
            oneChannel[i + 1] = bufferContents[i * 2 + 3];
        }
        */
        float[] dataAsFloats = floatMe(shortMe(bufferContents));
        int fftLen = dataAsFloats.length / 2;
        fft = new FloatFFT_1D(fftLen);
        fft.complexForward(dataAsFloats);
        String log = "";
        float[] magnitudes = new float[dataAsFloats.length / 2];
        float magMax = 0;
        int maxIndex = 0;
        float dominantFreq = 0;
        for (int i = 0; i < dataAsFloats.length / 2; i++) {
            float re = dataAsFloats[2 * i];
            float im = dataAsFloats[2 * i + 1];
            magnitudes[i] = (float) (Math.sqrt(re * re + im * im) / 1e7);
            //log += re + " " + im + " " + magnitudes[i] + "\n";
            if (magnitudes[i] > magMax) {
                magMax = (float) (magnitudes[i]);
                maxIndex = i;
            }
        }
        dominantFreq = format.getInteger(MediaFormat.KEY_SAMPLE_RATE) * maxIndex / fftLen;
        Log.i("fft_results", magMax + " " + dominantFreq);
        return magnitudes;
    }

    @Override
    protected void onPostExecute(float[] res) {
        super.onPostExecute(res);
        //fftListener.onFFTResultsAvailable(res);
        caView.feedFFTMagnitudes(res);
    }
}

I commented out the channel-splitting code because I wasn't sure whether to feed in the byte array wholesale, or to split it by channel and feed in the data from just one of the channels.

But the resulting magnitude values are really noisy: the magnitudes are chaotic and show none of the obvious patterns that the usual audio-analysis plots have. Instead I get tight zigzags, with values at frequencies above 20000 Hz.
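For reference, the magnitude and dominant-bin computation from the code above can be reproduced with a small self-contained sketch. A naive O(n^2) real DFT stands in for JTransforms here (with the library, you would call new FloatFFT_1D(n).realForward(data) instead; note that complexForward expects interleaved (re, im) input, so passing it a block of real PCM samples treats every second sample as an imaginary part):

```java
// Sketch of the magnitude / dominant-frequency math used in the question,
// with a naive DFT standing in for JTransforms' FloatFFT_1D.
public class DominantFreq {

    // DFT of a real signal; returns the magnitudes of bins 0 .. n/2-1.
    static double[] magnitudes(double[] x) {
        int n = x.length;
        double[] mags = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0, im = 0;
            for (int i = 0; i < n; i++) {
                double phase = 2 * Math.PI * k * i / n;
                re += x[i] * Math.cos(phase);
                im -= x[i] * Math.sin(phase);
            }
            mags[k] = Math.sqrt(re * re + im * im);
        }
        return mags;
    }

    // Index of the bin with the largest magnitude, as in the question's loop.
    static int maxBin(double[] mags) {
        int maxIndex = 0;
        for (int i = 1; i < mags.length; i++) {
            if (mags[i] > mags[maxIndex]) maxIndex = i;
        }
        return maxIndex;
    }

    public static void main(String[] args) {
        int sampleRate = 48000, n = 1024;
        double[] x = new double[n];
        // 1500 Hz sine: exactly 32 cycles fit in the 1024-sample window,
        // so all the energy lands in bin 32.
        for (int i = 0; i < n; i++) {
            x[i] = Math.sin(2 * Math.PI * 1500.0 * i / sampleRate);
        }
        int bin = maxBin(magnitudes(x));
        // Same formula as the question: dominantFreq = sampleRate * maxIndex / fftLen
        System.out.println(sampleRate * bin / n); // 1500
    }
}
```

With a clean input the dominant bin maps straight back to the sine's frequency; if the same math produces zigzag garbage on real PCM, the suspect is the sample conversion feeding the transform, not the magnitude step.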

What am I doing wrong?

1 answer:

Answer 0 (score: 0)

It turns out that what I was doing wrong was the byte-to-short conversion. Now I simply use this function to convert two bytes into a short:

private short getSixteenBitSample(byte high, byte low) {
    return (short)((high << 8) | (low & 0xff));
}

Then I convert the shorts into floats in a conversion array, and run the conversion array through the FFT.
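This fix is consistent with a byte-order bug in the original shortMe(): ByteBuffer defaults to ByteOrder.BIG_ENDIAN, while 16-bit PCM coming out of Android's MediaCodec is in native order, which is little-endian on practically all Android devices, so every sample was byte-swapped before the FFT. The same conversion as getSixteenBitSample can be expressed with ByteBuffer by setting the order explicitly; a minimal sketch, assuming little-endian 16-bit PCM input:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmConvert {
    // Convert little-endian 16-bit PCM bytes to shorts.
    // ByteBuffer defaults to BIG_ENDIAN; without the explicit order()
    // call, every sample would be silently byte-swapped, which turns
    // the spectrum into noise.
    static short[] toShorts(byte[] bytes) {
        short[] out = new short[bytes.length / 2]; // drops a trailing odd byte
        ByteBuffer bb = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < out.length; i++) {
            out[i] = bb.getShort();
        }
        return out;
    }

    public static void main(String[] args) {
        // The sample value 0x1234 stored little-endian: low byte first.
        byte[] pcm = { 0x34, 0x12 };
        System.out.println(toShorts(pcm)[0]); // 4660 == 0x1234
    }
}
```

Equivalently, getSixteenBitSample(bytes[i + 1], bytes[i]) over consecutive byte pairs produces the same shorts, since the second byte of each little-endian pair is the high byte.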