Question

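I have been working on implementing a system for real-time audio capture and analysis within an existing music software project. The goal of the system is to begin capturing audio when the user presses the record button (or after a specified count-in period), determine the notes the user sings or plays, and notate them on a musical staff. The gist of my approach is to use one thread to capture chunks of audio data and put them into a queue, and another thread to remove the data from the queue and perform the analysis.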
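The two-thread arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the code from the project: the class name `CaptureQueueSketch`, the `END_OF_STREAM` sentinel, and the dummy byte arrays standing in for captured audio are all assumptions made for the example. It relies on the blocking `put` and `take` methods of `BlockingQueue`, so the consumer needs no explicit `wait`/`notify`.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class CaptureQueueSketch {
    // Hypothetical sentinel telling the consumer that no more chunks are coming.
    private static final byte[] END_OF_STREAM = new byte[0];

    /** Pushes `chunks` blocks of `chunkSize` bytes through a queue; returns the bytes the consumer saw. */
    static long run(int chunks, int chunkSize) {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(8);
        long[] consumed = {0};

        // Analysis thread: take() blocks until the capture side has queued a chunk.
        Thread consumer = new Thread(() -> {
            try {
                for (byte[] chunk = queue.take(); chunk != END_OF_STREAM; chunk = queue.take()) {
                    consumed[0] += chunk.length; // stand-in for the real pitch analysis
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            // Capture side: each array is a stand-in for data read from the microphone line.
            for (int i = 0; i < chunks; i++) {
                queue.put(new byte[chunkSize]);
            }
            queue.put(END_OF_STREAM);
            consumer.join(); // join() makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while queueing audio", e);
        }
        return consumed[0];
    }
}
```

Calling `CaptureQueueSketch.run(5, 1024)` pushes five 1024-byte chunks through the queue and returns the total number of bytes the analysis side received.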

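This scheme works well, but I am having trouble quantifying the latency between the start of audio capture and the start of playback of the MIDI backing instruments. Audio capture begins before the MIDI instruments start playing, and the user will presumably synchronize his or her performance with the MIDI instruments. I therefore need to ignore the audio data captured before the backing MIDI instruments begin playing and analyze only the audio data collected after that point.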

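Playback of the backing track is handled by a body of code that has been around for quite a while and is maintained by someone else, so I would like to avoid refactoring the whole program if possible. Audio capture is controlled by a Timer object and a class that extends TimerTask, instances of which are created in a lumbering (~25k lines) class called Notate. Notate, by the way, also holds the objects that handle playback of the backing track. The Timer's .scheduleAtFixedRate() method is used to control the period of audio capture, and the TimerTask notifies the capture thread to begin by calling .notify() on the queue (an ArrayBlockingQueue).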

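My strategy for calculating the time gap between the initialization of these two processes is to subtract the timestamp taken just before capture begins (in milliseconds) from the timestamp taken when playback begins, which I define as the time at which the .start() method is called on the Java Sequencer object in charge of the MIDI backing track. I then use the result to determine the number of audio samples I expect to have been captured during this interval (n) and ignore the first n * 2 bytes of the captured audio data array (n * 2 because I am capturing 16-bit samples, whereas the data is stored as a byte array: 2 bytes per sample).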
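The arithmetic in that strategy can be written out as a small helper. This is only a sketch under stated assumptions: the class name `CaptureOffsetSketch` and the method `bytesToSkip` are hypothetical, and timestamps are assumed to come from `System.nanoTime()`, which is monotonic, rather than `System.currentTimeMillis()`. It computes whole frames first and then multiplies by the frame size, so the result is always frame-aligned. It only converts elapsed time into a byte count; it does not account for any buffering inside the audio or MIDI subsystems.

```java
final class CaptureOffsetSketch {
    /**
     * Bytes of captured audio to skip, given the time that elapsed between
     * the start of capture and the start of playback.
     *
     * @param elapsedNanos   playback start minus capture start, in nanoseconds
     * @param sampleRate     frames per second, e.g. 44100
     * @param frameSizeBytes bytes per frame, e.g. 2 for 16-bit mono
     */
    static int bytesToSkip(long elapsedNanos, float sampleRate, int frameSizeBytes) {
        if (elapsedNanos <= 0) {
            return 0; // playback started first (or simultaneously): nothing to skip
        }
        // Convert to whole frames first so the byte offset never splits a sample.
        long frames = Math.round(elapsedNanos / 1_000_000_000.0 * sampleRate);
        return Math.toIntExact(frames * frameSizeBytes);
    }
}
```

For example, a gap of 250 ms at 44100 Hz with 2-byte frames corresponds to 11025 frames, so `bytesToSkip(250_000_000L, 44100f, 2)` returns 22050.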

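However, this approach is not giving me accurate results. The calculated offset is always smaller than I expect, so that a non-trivial (and unfortunately variable) amount of "empty" space remains in the audio data after analysis begins at the designated position. This causes the program to attempt to analyze audio data collected when the user had not yet begun to play along with the backing MIDI instruments, effectively adding rests (the absence of musical notes) at the beginning of the user's musical passage and ruining the rhythm values calculated for all subsequent notes.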

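Below is the code for my audio capture thread, which also determines the latency and the corresponding position offset into the captured audio data array. Can anyone offer insight into why my method for determining the latency is not working correctly?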

public class CaptureThread extends Thread
{
    public void run()
    {
        //number of bytes to capture before putting data in the queue.
        //determined via the sample rate, tempo, and # of "beats" in 1 "measure"
        int bytesToCapture = (int) ((SAMPLE_RATE * 2.) / (score.getTempo()
                / score.getMetre()[0] / 60.));
        //temporary buffer - will be added to ByteArrayOutputStream upon filling.
        byte tempBuffer[] = new byte[target.getBufferSize() / 5];

        int limit = (int) (bytesToCapture / tempBuffer.length);

        ByteArrayOutputStream outputStream = new ByteArrayOutputStream(bytesToCapture);
        int bytesRead;

        try
        { //Loop until stopCapture is set.
            while (!stopCapture)
            { //first, wait for notification from TimerTask
                synchronized (thisCapture)
                {
                    thisCapture.wait();
                }

                if (!processingStarted)
                { //the time at which audio capture begins
                    startTime = System.currentTimeMillis();
                }

                //start the TargetDataLine, from which audio data is read
                target.start();

                //collect 1 captureInterval's worth of data
                for (int n = 0; n < limit; n++)
                {
                    bytesRead = target.read(tempBuffer, 0, tempBuffer.length);
                    if (bytesRead > 0)
                    {   //Append data to output stream.
                        outputStream.write(tempBuffer, 0, bytesRead);
                    }
                }

                if (!processingStarted)
                {
                    long difference = (midiSynth.getPlaybackStartTime()
                            + score.getCountInTime() * 1000 - startTime);

                    positionOffset = (int) ((difference / 1000.)
                            * SAMPLE_RATE * 2.);

                    if (positionOffset % 2 != 0)
                    { //1 sample = 2 bytes, so positionOffset must be even
                        positionOffset += 1;
                    }
                }
                if (outputStream.size() > 0)
                {   //package data collected in the output stream into a byte array
                    byte[] capturedAudioData = outputStream.toByteArray();
                    //add captured data to the queue for processing
                    processingQueue.add(capturedAudioData);

                    synchronized (processingQueue)
                    {
                        try
                        { //notify the analysis thread that data is in the queue
                            processingQueue.notify();
                        } catch (Exception e)
                        {
                            //handle the error
                        }
                    }

                    outputStream.reset(); //reset the output stream
                }
            }
        } catch (Exception e)
        {
            //handle error
        }
    }
}

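I am looking into using a Mixer object to synchronize the TargetDataLine that accepts data from the microphone and the Line that handles playback of the MIDI instruments. Now to find the Line that handles playback... any ideas?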

Answer 1

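Google has a good open-source app called AudioBufferSize that you may be familiar with. I modified this app to test one-way latency, that is, the time between the user pressing a button and the sound being played by the audio API. Here is the code I added to AudioBufferSize to achieve this. Could you use such an approach to provide the timing delta between the event and when the user perceives it?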

final Button latencyButton = (Button) findViewById(R.id.latencyButton);
latencyButton.setOnClickListener(new OnClickListener() {
    public void onClick(View v) {
        mLatencyStartTime = getCurrentTime();
        latencyButton.setEnabled(false);

        // Do the latency calculation, play a 440 hz sound for 250 msec
        AudioTrack sound = generateTone(440, 250);
        sound.setNotificationMarkerPosition(count / 2); // Listen for the end of the sample

        sound.setPlaybackPositionUpdateListener(new OnPlaybackPositionUpdateListener() {
            public void onPeriodicNotification(AudioTrack sound) { }
            public void onMarkerReached(AudioTrack sound) {
                // The sound has finished playing, so record the time
                mLatencyStopTime = getCurrentTime();
                diff = mLatencyStopTime - mLatencyStartTime;
                // Update the latency result
                TextView lat = (TextView)findViewById(R.id.latency);
                lat.setText(diff + " ms");
                latencyButton.setEnabled(true);
                logUI("Latency test result= " + diff + " ms");
            }
        });
        sound.play();
    }
});

It references generateTone, which looks like this:

private AudioTrack generateTone(double freqHz, int durationMs) {
    int count = (int)(44100.0 * 2.0 * (durationMs / 1000.0)) & ~1;
    short[] samples = new short[count];
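    for (int i = 0; i < count; i += 2) {
        // i indexes interleaved stereo samples, so the frame index is i / 2;
        // using i directly would double the generated frequency
        short sample = (short) (Math.sin(2 * Math.PI * (i / 2) / (44100.0 / freqHz)) * 0x7FFF);
        samples[i + 0] = sample; // left channel
        samples[i + 1] = sample; // right channel
    }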
    AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
            AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT,
            count * (Short.SIZE / 8), AudioTrack.MODE_STATIC);
    track.write(samples, 0, count);
    return track;
}

I just realized this question is many years old. Sorry, though maybe someone will find it useful.

Determining latency in audio processing
