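I'm seeing strange behavior from the ALSA monitor device when recording sound in C++ with the Qt 5.5.1 framework on Ubuntu 16.04.1 amd64. The number of bytes read from the monitor device (the input device that plays back whatever is sent to the ALSA output device) differs greatly from the number of bytes read from the microphone over the same period. For example, when recording just 5 seconds of S16LE samples at a 44100 Hz sample rate, I read 31760 more bytes from the monitor than from the microphone. If I actually play something, the difference gets much larger, up to 648160 bytes.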
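I wrote a demo program that is as minimal as possible (source code archive: just a short main.cpp and a simple class). It runs three tests: one where the speakers play nothing, a second where I play a lot of short tada.wav sounds with aplay, and a third where I play only a few such sounds.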
Here is the program's output:
===> TEST MONITOR SILENCE
Microphone audio input device: "alsa_input.pci-0000_00_1b.0.analog-stereo"
Monitor audio input device: "alsa_output.pci-0000_00_1b.0.analog-stereo.monitor"
Bytes read from microphone: 421040
Bytes read from monitor: 452800
Difference between bytes read from microphone and monitor: 31760
5.003 seconds elapsed
===> TEST MONITOR LOT OF TADA
Microphone audio input device: "alsa_input.pci-0000_00_1b.0.analog-stereo"
Monitor audio input device: "alsa_output.pci-0000_00_1b.0.analog-stereo.monitor"
Bytes read from microphone: 435840
Bytes read from monitor: 1084000
Difference between bytes read from microphone and monitor: 648160
5.002 seconds elapsed
===> TEST MONITOR FEW TADA
Microphone audio input device: "alsa_input.pci-0000_00_1b.0.analog-stereo"
Monitor audio input device: "alsa_output.pci-0000_00_1b.0.analog-stereo.monitor"
Bytes read from microphone: 435520
Bytes read from monitor: 559702
Difference between bytes read from microphone and monitor: 124182
5.001 seconds elapsed
----
Difference ratio (lot of tada / silence): 20.4081
Difference ratio (one tada / silence): 3.91001
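As you can see, the difference between the monitor and microphone byte counts grows about 4 times when I play just a few short sounds, and about 20 times (!) when I play a lot of them. It makes no difference whether I run the playback command from the Qt program or manually from a terminal. I tested on Ubuntu 14.04 with Qt 5.6.1 and got the same thing. I also tried using PulseAudio directly, recording for 28 seconds in one shell while running a for loop of 20 tada.wav plays in another shell: the number of bytes read was 1.8 times higher than with silence.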
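For comparison, here is the output of the same program built with Qt 5.6.1 on Windows 10 (I only changed the audio device names and replaced the sound playback command with VLC):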
===> TEST MONITOR SILENCE
Microphone audio input device: "Microphone (2- Realtek High Definition Audio)"
Monitor audio input device: "Stereo mixer (2- Realtek High Definition Audio)"
Bytes read from microphone: 437472
Bytes read from monitor: 437472
Difference between bytes read from microphone and monitor: 0
5.08 seconds elapsed
===> TEST MONITOR LOT OF TADA
Microphone audio input device: "Microphone (2- Realtek High Definition Audio)"
Monitor audio input device: "Stereo mixer (2- Realtek High Definition Audio)"
Bytes read from microphone: 437472
Bytes read from monitor: 437472
Difference between bytes read from microphone and monitor: 0
5.051 seconds elapsed
===> TEST MONITOR FEW TADA
Microphone audio input device: "Microphone (2- Realtek High Definition Audio)"
Monitor audio input device: "Stereo mixer (2- Realtek High Definition Audio)"
Bytes read from microphone: 437472
Bytes read from monitor: 437472
Difference between bytes read from microphone and monitor: 0
5.03 seconds elapsed
----
Difference ratio (lot of tada / silence): nan
Difference ratio (one tada / silence): nan
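As you can see, here there is absolutely no difference between the byte counts. In other runs I have seen a very small difference, comparable to the size of the buffer delivered in one readyRead emission, which is simply caused by a kind of race condition between the microphone and stereo mixer signals.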
Does anyone have a suggestion for how to solve this ALSA problem with Qt on Ubuntu? This strange behavior causes trouble for the AEC code in my real project.